test: cover flagged edge cases + fix snapshot persistence-health bug#330
Conversation
Source-vs-tests audit of the four testing tiers. Adds coverage at the lowest honest tier (all tier 1) for branches with no covering test, and fixes one real bug found along the way. Bug fix: - storage_service.save_snapshot: a non-serializable snapshot (TypeError from json.dumps) logged the error but left db_healthy=True, so the #131 "DB write failing" badge never fired while snapshots silently stopped persisting. Route it through _db_error like every other write path. Tests added (tier 1): - pytest: unserializable snapshot flips db_healthy (regression guard for above). - pithead shell: firewall install-failure rollback (#270); wallet subaddress / integrated hard-fail (#250); remote-mode-without-host and non-/24 subnet guards (#180); ensure_owner conditional + whole-tree recursive chown (#255). - frontend: ProxyTotals high-reject styling; per-worker api/reject badges + XvB/Unknown pool badges; the #278/#313 Tari-✔ invariant (never on active-but-dead); Gauge done-vs-syncing. Docs: - test-inventory regenerated (764 -> 770 cases); testing-strategy Known-gaps gains a "Coverage-audit follow-ups" section listing the tier-3/4 backstops still needed (real-kernel firewall rollback, real mixed-ownership ensure_owner, real-container monerod failover, non-blocking-Tari ignore, busy/reorg, double-outage, doctor NTP check, ENOSPC, Tor-down, insecure+main row). make lint + make test green; dashboard 94.73% cov; patch coverage 100%. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e_owner The two divergence risks the audit flagged — paths PR #330 proves only with stubs — now have live counterparts, folded into the existing opt-in phases (no new flags), following the --auth-fail-closed "live counterpart to a tier-1 assertion" pattern. - --fault-injection: fault_firewall_rollback shadows iptables with a wrapper that fails every `-I`, re-runs apply_tor_egress_firewall, and asserts the box ends fail-closed (no pithead-tor-egress rule left half-installed), then reinstates the real firewall. Belt-and-braces reinstates unconditionally so a mid-phase abort can't leave clearnet egress open. Proves the real-kernel strip the tier-1 stub can't (#270). - --lifecycle: plants a root-owned file under the dashboard data dir and asserts the pool-flip apply (ensure_directories -> ensure_owner) chowns it to uid 1000 — the #255 "scan contents, not just the dir" regression, which needs root to set up so it can't live at tier 1. Both are DESTRUCTIVE-then-restored, local-box only, and run at the release gate (not CI). Help text, testing-strategy Known-gaps, and the test inventory updated. Verified: shellcheck + shfmt clean, bash -n parses, integration self-test 97/0, make lint green, inventory drift-check green. NOT executed against a real box — that happens at the release gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Follow-up commit
Verified: shellcheck + shfmt clean, |
…iptables Validated the two tier-4 backstops live on a real box (gouda). The ensure_owner migration passed as written. The firewall-rollback check did NOT exercise the rollback: apply_tor_egress_firewall calls `sudo iptables -I`, and sudo's secure_path ignores a PATH-shadowed iptables, so the real insert succeeded and the rollback branch never ran (the rc-0 assertion passed for the wrong reason — a false green). Fix: shadow `sudo` instead of `iptables`. sudo is still resolved via PATH, so a wrapper that fails an `iptables … -I …` insert and execs real sudo for everything else (remove's -D, -N, iptables-save) makes the insert fail exactly as a real mid-insert error would. Re-validated live: with the real -I failing, the partial ruleset is rolled back (0 pithead-tor-egress rules), then reinstated (7). Box restored byte-identical to its pre-test iptables snapshot. shellcheck + shfmt clean, bash -n parses, self-test 97/0, inventory drift green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Live-validated on a real box (commit
|
What
A source-vs-tests audit across Pithead's four testing tiers. It adds coverage — at the lowest honest tier (all tier 1) — for branches, guards, and render paths that had none, and fixes one real bug found along the way. No production behaviour changes except the bug fix.
The bug (found while auditing)
storage_service.save_snapshot'sTypeErrorbranch (a snapshotjson.dumpscan't serialize) calledlogger.errordirectly instead of_db_errorlike every other write path — sodb_healthystayedTrueand the #131 "DB write failing" badge never fired while snapshots silently stopped persisting. One-line root-cause fix + a regression test assertingis_db_healthy()flips toFalse.Tests added (tier 1)
db_healthy(guards the fix above)iptables -Iinstall-failure → warn + roll back, not half-open (#270)monero.mode=remotewith no host → abort; non-/24network.subnet→ abort (#180)ensure_ownerconditional chown + whole-tree recursive scan, no-maxdepth(#255)ProxyTotalsreddens rejected only onreject_level=highapi_ok=false+reject_flagbadges; XvB/UnknownPoolBadgevariantsGaugedone→✔ vs syncing→percentEach test fails if the logic is inverted (e.g. the Tari test asserts
check-inlineis absent for active-but-dead; the snapshot test assertsdb_healthy is False).Docs
docs/test-inventory.mdregenerated: 764 → 770 enumerated cases (drift check green).docs/testing-strategy.mdKnown-gaps ledger gains a Coverage-audit follow-ups section (below), so the gaps stay tracked.Additional gaps found — not covered here (need Docker / real box → tier 3/4)
These are the real completeness risks the audit surfaced. None touch the code changed in this PR, so per "test once at the lowest honest tier" they're follow-ups, not part of this change. The first two are divergence backstops for paths this PR proves only with stubs:
iptables(control flow only). Whether a real kernel strips a partially insertedDOCKER-USERrule and ends fail-closed is never exercised with an induced mid-insert failure → needs a--fault-injectioncase.ensure_ownerreal mixed-ownership tree. The Run containers as a non-root user (drop root inside the container) #255 scenario (user-owned dir, root-owned contents, onechown -R) can only be built as root → needs a--lifecyclestep.pithead doctoron a real box — only its exit code is unit-tested; the NTP/clock-drift check is never fault-injected.dashboard.secure=falseonly pairs withpool=nano, so insecure+main has no row.Verification (all local)
make lint→ all checks passedmake test→ dashboard 641 passed / 94.73% cov (gate 80%), pithead shell 432/0, selftest 97/0, compose ✓, fakes 12 ✓make test-patch-coverage→ 100% on the changed line (gate 90%)node --test) → 66 pass / 0 failmake test-inventory-check→ up to dateFixtures use synthetic values only (
4AAA…/8AAA…wallets, RFC-1918 IPs) — respects.gitleaks.toml.🤖 Generated with Claude Code