test(coverage): enforce ≥80% coverage + README badge — 86% baseline#160

Merged
cipher813 merged 1 commit into main from test/coverage-to-80-plus-badge
May 22, 2026

@cipher813
Owner

Summary

Closes the final leg of Brian's "test infrastructure before soak" plan: enforce ≥80% test coverage as a CI gate + add a coverage badge to the README. Current measured coverage: 86.43%.

Composes with PRs #158 (Python integration canary) and #159 ([server]-extras CI + layer3 --exercise-all-tools) as the third leg of the test-trio addressing the 2026-05-22 memory_check_contradictions failure class.

What ships

pyproject.toml coverage config

New [tool.coverage.run] + [tool.coverage.report] sections with fail_under = 80. Excluded modules are documented inline:

| Module | Why excluded |
| --- | --- |
| dashboard/* | Streamlit UI; tested by running the app, not by pytest |
| __main__.py | 4-line entry-point shim, no logic |
| upgrade.py | Real Fly+AWS interactions; tested by Layer-3 ritual |
| downgrade.py | Same |
| llm.py | Deprecated optional-LLM path. Deployed product is LLM-free by design (2026-05-22). NLI replaced the only production use in mnemon.contradiction. |

.github/workflows/ci.yml coverage gate

pytest --cov runs on every CI build. If total coverage drops below 80%, the build fails.
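The gate reduces to a threshold comparison that becomes a process exit code. The sketch below is a simplified illustration of that logic, not the code coverage.py or pytest-cov actually runs; `gate_exit_code` is a hypothetical helper name.

```python
def gate_exit_code(total_percent: float, fail_under: float = 80.0) -> int:
    """Return 0 when coverage meets the gate and 1 otherwise.

    Hypothetical helper: a simplified stand-in for the comparison a
    coverage tool performs before deciding whether the CI step fails.
    """
    passed = total_percent >= fail_under
    status = "ok" if passed else "FAIL"
    print(f"coverage {total_percent:.2f}% (gate {fail_under:.0f}%): {status}")
    return 0 if passed else 1


# The 86.43% measured in this PR clears the gate; a drop to 79.9% would not.
current = gate_exit_code(86.43)
regressed = gate_exit_code(79.9)
```

A non-zero exit code is what makes the CI step, and therefore the build, fail.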

README badge

[![Coverage](https://img.shields.io/badge/coverage-86%25-brightgreen.svg)]() is a static shields.io badge that matches the existing pattern (Status / Python / License / MCP). It is updated manually on each release, with no SaaS / codecov dep. This fits mnemon's broader "minimal external deps" posture: no LLM, no auth vendor, no observability vendor, and now no coverage vendor.
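Because the badge is static, the percentage sits in the URL and the `%` sign must be percent-encoded as `%25`. A small helper for the manual release-time update might look like the sketch below; `coverage_badge_markdown` is a hypothetical name and is not part of the repository.

```python
from urllib.parse import quote


def coverage_badge_markdown(percent: int, color: str = "brightgreen") -> str:
    """Build the README markdown for a static shields.io coverage badge.

    Hypothetical helper; the URL shape mirrors the badge added in this PR.
    """
    message = quote(f"{percent}%")  # "%" is encoded as "%25"
    url = f"https://img.shields.io/badge/coverage-{message}-{color}.svg"
    return f"[![Coverage]({url})]()"


print(coverage_badge_markdown(86))
```

Calling it with `86` reproduces the badge string used in this PR.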

4 new tests/test_nli.py tests

Raise nli.py coverage from 72% to ~85%, covering error paths I introduced in PRs #157 and #158:

  1. _ensure_loaded raises NLIUnavailableError on HuggingFace download failure (connection error, 404, etc.)
  2. _ensure_loaded raises on unexpected label set — a model with different output classes fails fast, not silently mis-classifies downstream
  3. prewarm() swallows unavailability cleanly (acceptable-secondary-observability category per feedback_no_silent_fails)
  4. classify_pair softmax + ONNX input-building path exercised with a stubbed session (lines 164-189 of nli.py)
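To make tests 2 and 4 concrete, here is a self-contained sketch of the two ideas: failing fast on an unexpected label set, and driving the softmax path through a stubbed session. Everything in it is a stand-in: the expected label names, the error type raised for a label mismatch, the `StubSession` class, and the function signatures are assumptions, not the real `mnemon.nli` code.

```python
import math


class NLIUnavailableError(RuntimeError):
    """Stand-in for the error type named in the PR."""


# Assumed label set; the real module may use different names.
EXPECTED_LABELS = {"entailment", "neutral", "contradiction"}


def check_labels(id2label: dict[int, str]) -> None:
    """Fail fast if the model's output classes are not the expected ones."""
    labels = {name.lower() for name in id2label.values()}
    if labels != EXPECTED_LABELS:
        raise NLIUnavailableError(f"unexpected label set: {sorted(labels)}")


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class StubSession:
    """Mimics an inference session's run() call by returning fixed logits."""

    def __init__(self, logits: list[float]) -> None:
        self._logits = logits

    def run(self, output_names, inputs):
        return [[self._logits]]  # one output tensor holding a batch of one


def classify_pair(session, inputs: dict, labels: list[str]) -> tuple[str, float]:
    """Run the session and return the most probable label with its probability."""
    logits = session.run(None, inputs)[0][0]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


label, prob = classify_pair(
    StubSession([0.1, 0.2, 3.0]),
    inputs={},
    labels=["entailment", "neutral", "contradiction"],
)
print(label, round(prob, 3))
```

Stubbing the session keeps the test independent of any model download, which is the dependency that test 1 exercises failing.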

Coverage breakdown (ten lowest-coverage modules; the 80% gate applies to the project total, not to each module)

| Module | Coverage | Notes |
| --- | --- | --- |
| cli.py | 61% | interactive CLI dispatcher; many branches require a real terminal |
| server_remote.py | 63% | OAuth + transport flows, partially covered by test_server_remote.py |
| context_surfacing.py | 71% | hook with many failure-mode branches |
| nli.py | ~85% | was 72%, +4 new tests |
| persistent_sessions.py | 77% | |
| uninstall.py | 81% | |
| mirror.py | 84% | |
| search.py | 85% | |
| doctor.py | 85% | |
| auth.py | 87% | |

These are P3 ROADMAP items if we ever want to push individual modules higher; the project-level 80% gate is the operational target.
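A project-level gate can pass while individual modules sit below 80% because the total is weighted by statement count rather than averaged per module. The sketch below shows the arithmetic with made-up statement counts; the module names and numbers are hypothetical and are not measurements from this repository.

```python
def project_coverage(modules: dict[str, tuple[int, int]]) -> float:
    """Return total coverage in percent from (covered, total) statement counts."""
    covered = sum(c for c, _ in modules.values())
    total = sum(t for _, t in modules.values())
    return 100.0 * covered / total


# Hypothetical statement counts, chosen only to illustrate the weighting.
modules = {
    "small_low.py": (61, 100),     # 61% on a small module
    "large_high.py": (920, 1000),  # 92% on a large module
}

total_percent = project_coverage(modules)
print(f"project total: {total_percent:.2f}%")
```

Here the small module is well under 80%, yet the project total stays comfortably above it.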

Test plan

  • pytest --cov reports 86.43% (above the 80% gate)
  • Full suite 850 → 855 passing
  • Coverage config validated locally
  • Post-merge: CI workflow validates the gate runs

What's next

Once #160 merges, the test substrate is at the standard Brian set ("known bugs fixed + verified before soak"). Then:

  1. Bump 0.7.0rc2 → 0.7.0rc3 (rolling in the test-trio + coverage gate as a soak-substrate version)
  2. Republish PyPI
  3. Fly redeploy via mnemon upgrade web (operator-driven)
  4. Optional: run scripts/promote_stable.sh layer3 --exercise-all-tools against a test Fly app for the first time (~15 min, validates the new Fly-level probe end-to-end)
  5. Then operator activates Phase 1 standing tier per the soak workflow (demote chore: promote 0.6.0rc18 → 0.6.0 stable #131 + #2402 → flip MNEMON_STANDING_TIER_ENABLED=true → fresh Claude Code → ≥1 week soak)
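Step 1 is a mechanical release-candidate bump. As a sketch only, assuming the version string has the simple `X.Y.ZrcN` shape shown above, the increment could be expressed as follows; `bump_rc` is a hypothetical helper, not a script in the repository.

```python
import re


def bump_rc(version: str) -> str:
    """Increment the release-candidate number of an X.Y.ZrcN version string."""
    match = re.fullmatch(r"(\d+\.\d+\.\d+)rc(\d+)", version)
    if match is None:
        raise ValueError(f"not a release-candidate version: {version!r}")
    base, rc = match.groups()
    return f"{base}rc{int(rc) + 1}"


next_version = bump_rc("0.7.0rc2")
print(next_version)
```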

🤖 Generated with Claude Code

Closes Brian's "coverage to ≥80% + add a badge" leg of the test-trio
plan (PRs #158/#159/#160). Suite at 86.43% with the new omits +
4 added nli.py error-path tests.

pyproject.toml gains [tool.coverage.run] + [tool.coverage.report]
config with fail_under=80. ci.yml runs `pytest --cov` so a PR that
drops coverage below the gate fails the build. Excluded modules
are documented inline and all under-testable-by-design:

  - dashboard/*       Streamlit UI; tested by running the app
  - __main__.py       4-line entry-point shim
  - upgrade.py        Real Fly+AWS interactions; tested by Layer-3
  - downgrade.py      Same
  - llm.py            Deprecated optional-LLM path; the deployed
                      product is LLM-free by design (2026-05-22).
                      NLI replaced the only production use of this
                      module in mnemon.contradiction.

README badge: shields.io static `coverage-86%-brightgreen`. Matches
existing static-badge pattern (Status, Python, License, MCP). Manual
update on each release; no SaaS / codecov dep.

4 new nli.py tests (suite 850 → 855):
  - _ensure_loaded raises NLIUnavailableError on HF download failure
  - _ensure_loaded raises on unexpected label set (model with
    different output classes fails fast, not silently mis-classifies)
  - prewarm() swallows unavailability per acceptable-secondary-
    observability category
  - classify_pair softmax + input-building exercised with stubbed
    session (lines 164-189 of nli.py)

Composes with the test-trio:
  PR #158 — Python-level integration canary (every tool round-trip)
  PR #159 — [server]-extras CI matrix + layer3 --exercise-all-tools
  PR #160 — this PR: coverage gate + badge

Together they catch the 2026-05-22 memory_check_contradictions
failure class at PR review time (canary + extras matrix) and ensure
overall coverage doesn't regress as the project grows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cipher813 merged commit f3bdf8d into main on May 22, 2026
10 checks passed
cipher813 deleted the test/coverage-to-80-plus-badge branch on May 22, 2026 17:30
cipher813 added a commit that referenced this pull request May 22, 2026
Rolls the test-trio into a soak-substrate release:
- #158 every-MCP-tool integration canary
- #159 [server]-extras CI matrix + layer3 --exercise-all-tools
- #160 coverage gate at 80% + README badge (86% baseline)

Composes with the prior 0.7.0rc2 features (NLI-based contradiction
detection, dry_run flag). After PyPI publish + Fly redeploy, this is
the version operators should run the Phase 1 standing-tier soak on
per Brian's "all known bugs fixed before soak" standard.

Suite 855 passing. mnemon --version returns 0.7.0rc3.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>