test(coverage): enforce ≥80% coverage + README badge (86% baseline) #160
Merged
Closes Brian's "coverage to ≥80% + add a badge" leg of the test-trio plan (PRs #158/#159/#160).

Suite at 86.43% with the new omits + 4 added nli.py error-path tests.

pyproject.toml gains [tool.coverage.run] + [tool.coverage.report] config with fail_under=80. ci.yml runs `pytest --cov` so a PR that drops coverage below the gate fails the build.

Excluded modules are documented inline and all under-testable-by-design:
- dashboard/*: Streamlit UI; tested by running the app
- __main__.py: 4-line entry-point shim
- upgrade.py: Real Fly+AWS interactions; tested by Layer-3
- downgrade.py: Same
- llm.py: Deprecated optional-LLM path; the deployed product is LLM-free by design (2026-05-22). NLI replaced the only production use of this module in mnemon.contradiction.

README badge: shields.io static `coverage-86%-brightgreen`. Matches existing static-badge pattern (Status, Python, License, MCP). Manual update on each release; no SaaS / codecov dep.

4 new nli.py tests (suite 850 → 855):
- _ensure_loaded raises NLIUnavailableError on HF download failure
- _ensure_loaded raises on unexpected label set (model with different output classes fails fast, not silently mis-classifies)
- prewarm() swallows unavailability per acceptable-secondary-observability category
- classify_pair softmax + input-building exercised with stubbed session (lines 164-189 of nli.py)

Composes with the test-trio:
- PR #158: Python-level integration canary (every tool round-trip)
- PR #159: [server]-extras CI matrix + layer3 --exercise-all-tools
- PR #160: this PR, coverage gate + badge

Together they catch the 2026-05-22 memory_check_contradictions failure class at PR review time (canary + extras matrix) and ensure overall coverage doesn't regress as the project grows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
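To make the gate concrete, here is a minimal Python sketch of how an omit list and a `fail_under` threshold combine. It is not coverage.py's implementation: the glob patterns, the `mnemon/...` paths, and the statement counts are assumptions chosen for illustration, and `total_coverage` and `gate` are hypothetical helpers. In the project itself the behavior comes from the `omit` and `fail_under` settings in pyproject.toml.

```python
from fnmatch import fnmatch

# Modules named in the commit message. The exact glob patterns used in
# pyproject.toml are an assumption.
OMIT = ["*/dashboard/*", "*/__main__.py", "*/upgrade.py", "*/downgrade.py", "*/llm.py"]
FAIL_UNDER = 80.0  # matches fail_under=80 from the commit message


def total_coverage(files, omit=OMIT):
    """Percent of statements covered across files not matched by an omit pattern.

    `files` maps a path to (statements, missed); all numbers are hypothetical.
    """
    kept = {
        path: counts
        for path, counts in files.items()
        if not any(fnmatch(path, pattern) for pattern in omit)
    }
    statements = sum(s for s, _ in kept.values())
    missed = sum(m for _, m in kept.values())
    if statements == 0:
        return 100.0  # nothing measurable, so nothing is uncovered
    return 100.0 * (statements - missed) / statements


def gate(files, fail_under=FAIL_UNDER):
    """True when the measured total meets the threshold."""
    return total_coverage(files) >= fail_under


if __name__ == "__main__":
    sample = {
        "mnemon/nli.py": (200, 30),
        "mnemon/search.py": (100, 10),
        "mnemon/llm.py": (150, 150),            # omitted, does not lower the total
        "mnemon/dashboard/app.py": (300, 300),  # omitted, does not lower the total
    }
    print(round(total_coverage(sample), 2), gate(sample))
```

The point of the sketch is that omitted modules are removed before the total is computed, so an untested Streamlit dashboard cannot push the project below the gate, while a regression in any measured module can.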
cipher813 added a commit that referenced this pull request on May 22, 2026
Rolls the test-trio into a soak-substrate release:
- #158 every-MCP-tool integration canary
- #159 [server]-extras CI matrix + layer3 --exercise-all-tools
- #160 coverage gate at 80% + README badge (86% baseline)

Composes with the prior 0.7.0rc2 features (NLI-based contradiction detection, dry_run flag). After PyPI publish + Fly redeploy, this is the version operators should run the Phase 1 standing-tier soak on per Brian's "all known bugs fixed before soak" standard.

Suite 855 passing. mnemon --version returns 0.7.0rc3.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
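Because the badge is static and updated by hand on each release, a small helper can derive the badge text from the measured percentage. The sketch below assumes the shields.io static-badge URL form (`/badge/<label>-<message>-<color>`); the function name `badge_markdown`, the alt text, and the red fallback color are hypothetical and not taken from the repository.

```python
from urllib.parse import quote

GATE = 80.0  # the CI threshold from this PR


def badge_markdown(percent, gate=GATE):
    """Build README markdown for a static shields.io coverage badge.

    The percentage is truncated, not rounded, so the badge never overstates
    coverage (86.43 becomes "86%").
    """
    message = f"{int(percent)}%"
    # Hypothetical color rule: brightgreen at or above the gate, red below it.
    color = "brightgreen" if percent >= gate else "red"
    # "%" must be percent-encoded as "%25" inside the URL path.
    path = "-".join(["coverage", quote(message, safe=""), color])
    return f"![coverage](https://img.shields.io/badge/{path})"


if __name__ == "__main__":
    print(badge_markdown(86.43))
```

For the 86.43% baseline this yields the `coverage-86%-brightgreen` badge named in the PR, with the percent sign encoded for the URL.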
Summary
Closes the final leg of Brian's "test infrastructure before soak" plan: enforce ≥80% test coverage as a CI gate + add a coverage badge to the README. Current measured coverage: 86.43%.
Composes with PRs #158 (Python integration canary) and #159 ([server]-extras CI + layer3 --exercise-all-tools) as the third leg of the test-trio addressing the 2026-05-22 `memory_check_contradictions` failure class.
What ships
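`pyproject.toml` coverage config
New `[tool.coverage.run]` + `[tool.coverage.report]` sections with `fail_under = 80`. Excluded modules are documented inline:
- `dashboard/*`: Streamlit UI; tested by running the app
- `__main__.py`: 4-line entry-point shim
- `upgrade.py`: real Fly+AWS interactions; tested by Layer-3
- `downgrade.py`: same as `upgrade.py`
- `llm.py`: deprecated optional-LLM path; NLI replaced the only production use of this module in `mnemon.contradiction`.
`.github/workflows/ci.yml` coverage gate
`pytest --cov` runs on every CI build. A coverage drop below 80% fails the build.
README badge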
Coverage badge (`coverage-86%-brightgreen`): static, shields.io, matches the existing pattern (Status / Python / License / MCP). Manually updated on each release; no SaaS / codecov dep. Composes with mnemon's broader "minimal external deps" posture (no LLM, no auth vendor, no observability vendor, and no coverage vendor).
4 new `tests/test_nli.py` tests
Push `nli.py` from 72% → ~85% coverage. Cover error paths I personally introduced in PR #157 + #158:
- `_ensure_loaded` raises `NLIUnavailableError` on HuggingFace download failure (connection error, 404, etc.)
- `_ensure_loaded` raises on unexpected label set: a model with different output classes fails fast rather than silently misclassifying downstream
- `prewarm()` swallows unavailability cleanly (acceptable-secondary-observability category per `feedback_no_silent_fails`)
- `classify_pair` softmax + ONNX input-building path exercised with a stubbed session (lines 164-189 of `nli.py`)
Coverage breakdown (top 10 by lowest coverage, all above gate)
- `cli.py`
- `server_remote.py`
- `test_server_remote.py`
- `context_surfacing.py`
- `nli.py`
- `persistent_sessions.py`
- `uninstall.py`
- `mirror.py`
- `search.py`
- `doctor.py`
- `auth.py`
These are P3 ROADMAP items if we ever want to push individual modules higher; the project-level 80% gate is the operational target.
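The error-path and stubbed-session tests described in this PR can be pictured with a self-contained sketch. Everything below is a stand-in, not the code in `nli.py`: the label set, the `check_labels` helper, the `StubSession` class, the simplified `classify_pair` signature, and the choice to raise `NLIUnavailableError` on a label mismatch are all assumptions. Only the shape of the technique is meant to carry over: fail fast on unexpected output classes, and exercise the softmax path without loading a real model.

```python
import math


class NLIUnavailableError(RuntimeError):
    """Stand-in for the exception named in the PR; the real one lives in nli.py."""


# Assumed label set for a three-way NLI model.
EXPECTED_LABELS = {"entailment", "neutral", "contradiction"}


def check_labels(id2label):
    """Fail fast when the model's output classes are not the expected ones."""
    found = {name.lower() for name in id2label.values()}
    if found != EXPECTED_LABELS:
        raise NLIUnavailableError(f"unexpected label set: {sorted(found)}")


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class StubSession:
    """Returns fixed logits from run(), shaped like one (batch, classes) output."""

    def __init__(self, logits):
        self.logits = logits

    def run(self, output_names, inputs):
        return [[self.logits]]


def classify_pair(session, id2label, premise, hypothesis):
    """Score one premise/hypothesis pair; tokenization is omitted in this sketch."""
    inputs = {"premise": premise, "hypothesis": hypothesis}
    logits = session.run(None, inputs)[0][0]
    probs = softmax(logits)
    return {id2label[i].lower(): p for i, p in enumerate(probs)}


if __name__ == "__main__":
    labels = {0: "contradiction", 1: "entailment", 2: "neutral"}
    check_labels(labels)
    print(classify_pair(StubSession([2.0, 0.0, 0.0]), labels, "It rains.", "It is dry."))
```

A stub that returns fixed logits keeps the test fast and offline, which is what lets the softmax and input-building lines count toward coverage without a HuggingFace download.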
Test plan
- `pytest --cov` reports 86.43% (above the 80% gate)
What's next
Once #160 merges, the test substrate is at the standard Brian set ("known bugs fixed + verified before soak"). Then:
- `mnemon upgrade web` (operator-driven)
- `scripts/promote_stable.sh layer3 --exercise-all-tools` against a test Fly app for the first time (~15 min, validates the new Fly-level probe end-to-end)
- `MNEMON_STANDING_TIER_ENABLED=true` → fresh Claude Code → ≥1 week soak
🤖 Generated with Claude Code