PEP 705 ReadOnly TypedDict inheritance (E0038): conformance 136→137, false positives 126→120, mutation-hardened + de-flaked nvim CI gate #77
Open
MelbourneDeveloper wants to merge 3 commits into
…056/E0093) Flip typeddicts_readonly_inheritance.py (conformance 136->137 = 93.84%, total false positives 126->120). Implements [CHKARCH-DIAG-TYPEDDICT-READONLY-INHERITANCE].

Transitive TypedDict recognition is the foundation: ClassInfo.is_typed_dict is only true for classes that name TypedDict *directly*, so transitive subclasses (class Album(NamedDict): ...) were invisible to every TypedDict rule -- causing both missed diagnostics and the E0014 dict-literal false positives across the read-only suite. New shared resolver helpers (scope/typeddict_meta.rs, visitor/typeddict_schema.rs) compute each TypedDict's effective merged schema (own + inherited fields, most-derived declaration winning, carrying the field's ReadOnly qualifier and required-ness).

- E0038: redeclaration-legality matrix -- writable->ReadOnly forbidden, required->not-required forbidden, writable value types invariant, ReadOnly value types may narrow to a subtype, multiple-inheritance conflicts. The decision functions are pure and mutation-tested (cargo-mutants: every viable mutant killed).
- E0014: recognise transitive subclasses so dict-literal assignments are skipped (field-level checking belongs to E0093) -- clears the FP cluster.
- E0056: flag writes to inherited ReadOnly fields not redeclared writable.
- E0093: wrong value type / missing required key against the merged schema, including plain reassignment of an already-typed variable.
- extra_items (PEP 728): del/method-call checks now honour extra_items so transitive subclasses of extra_items TypedDicts are not false-positived.

Dedup: shared strip_typeddict_qualifiers and is_transitive_typeddict replace the duplicated strip_td_wrappers / is_typed_dict_class.

Benchmarks: five single-rule stress fixtures (e0038 new + e0056/e0093/e0050/e0036).

Ratchet: FP ceiling 126->120. Coverage thresholds met (checker 94%, resolver 96%).
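The two foundations named in this commit message, transitive recognition and the merged schema, can be sketched in a few lines of Python. This is only an illustrative model, not the Rust implementation: the `Field` and `ClassInfo` records and the `classes` lookup table are hypothetical stand-ins, and only the function names `is_transitive_typeddict` and `effective_fields` are borrowed from the PR.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Field:
    """One TypedDict item: value type plus its two qualifiers (hypothetical model)."""
    type_name: str
    read_only: bool = False
    required: bool = True


@dataclass
class ClassInfo:
    """Hypothetical stand-in for the resolver's per-class record."""
    name: str
    bases: list
    own_fields: dict = field(default_factory=dict)


def is_transitive_typeddict(name, classes, _seen=None):
    """True if `name` reaches TypedDict through any chain of bases."""
    seen = set() if _seen is None else _seen
    info = classes.get(name)
    if info is None or name in seen:  # unknown class, or a cycle in the bases
        return False
    seen.add(name)
    return any(
        base == "TypedDict" or is_transitive_typeddict(base, classes, seen)
        for base in info.bases
    )


def effective_fields(name, classes, _seen=None):
    """Merge inherited and own fields; the most-derived declaration wins."""
    seen = set() if _seen is None else _seen
    info = classes.get(name)
    if info is None or name in seen:
        return {}
    seen.add(name)
    merged = {}
    for base in info.bases:
        merged.update(effective_fields(base, classes, seen))
    merged.update(info.own_fields)  # own declarations override inherited ones
    return merged


# Album redeclares `name` as writable; Track only inherits it.
classes = {
    "NamedDict": ClassInfo("NamedDict", ["TypedDict"],
                           {"name": Field("str", read_only=True)}),
    "Album": ClassInfo("Album", ["NamedDict"],
                       {"name": Field("str"), "year": Field("int")}),
    "Track": ClassInfo("Track", ["NamedDict"], {"length": Field("int")}),
}
album_schema = effective_fields("Album", classes)
track_schema = effective_fields("Track", classes)
```

In this model `Album` and `Track` never name `TypedDict` themselves, yet both are recognised, and `Track` still sees `name` as read-only because nothing more derived redeclared it. How the real resolver orders conflicting bases is not modelled here; the PR rejects such conflicts under E0038.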
PlenaryBustedDirectory's parent nvim can exit non-zero on teardown — a lingering LSP child process or async handle reaped late under `make ci`'s parallel `-j3` load — even when every test passed. Gating on that exit code produced flaky `make ci` failures (verified: the suite fails the exit-code check under -j3 yet passes cleanly in isolation, 16/16 files, 0 failed/0 errored, exit 0). Replace the exit-code gate with assert_plenary_pass (scripts/common.sh): the run passes iff every spec file started AND emitted a summary, with zero failures, zero errors, and no Lua traceback. This is strictly stronger than trusting the process exit — it also catches a run that silently executed no tests, which the old gate would have passed. The nvim exit code is logged for diagnostics but is no longer authoritative. Spec: [LSPTEST-EDITOR-SPECIFIC-INTEGRATION-NEOVIM-E2E-GATE].
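The pass condition described above can be modelled roughly as follows. This is a Python sketch of the logic only, not the shell function in `scripts/common.sh`; the function name `plenary_run_passed`, the helper `fake_spec_output`, and the marker strings (`Testing:`, `Failed :`, `Errors :`, `stack traceback:`) are assumptions about what the runner prints.

```python
import re

ANSI_COLOUR = re.compile(r"\x1b\[[0-9;]*m")


def plenary_run_passed(log, expected_files):
    """Pass iff every spec file started AND summarised with no failures or errors."""
    text = ANSI_COLOUR.sub("", log)  # colour codes would break line-anchored matches
    if expected_files <= 0:
        return False
    if "stack traceback:" in text:  # assumed marker for a Lua traceback
        return False
    started = re.findall(r"^Testing:", text, flags=re.MULTILINE)
    failed = re.findall(r"^Failed[ \t]*:[ \t]*(\d+)", text, flags=re.MULTILINE)
    errors = re.findall(r"^Errors[ \t]*:[ \t]*(\d+)", text, flags=re.MULTILINE)
    # A run that silently executed nothing, or was truncated, has too few markers.
    if not (len(started) == len(failed) == len(errors) == expected_files):
        return False
    return all(int(count) == 0 for count in failed + errors)


def fake_spec_output(failed=0, errors=0, summary=True):
    """Build one spec file's output in the assumed format (test helper)."""
    lines = ["Testing: some_spec.lua"]
    if summary:
        lines += ["Success: 3", f"Failed : {failed}", f"Errors : {errors}"]
    return "\n".join(lines) + "\n"


clean_log = fake_spec_output() * 2
truncated_log = fake_spec_output() + fake_spec_output(summary=False)
```

Note that the process exit code does not appear anywhere in the decision, which matches the commit's statement that it is logged but no longer authoritative.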
TLDR
Implements PEP 705 `ReadOnly` `TypedDict` inheritance, flipping `typeddicts_readonly_inheritance.py` to PASS (conformance 136→137 = 93.84%) while dropping total false positives 126→120; also adds clean profiler/debugger demo scripts under `examples/` and de-flakes the Neovim e2e CI gate.

What Was Added?
- `crates/basilisk-resolver/src/scope/typeddict_meta.rs` provides shared, cross-crate `TypedDict` membership primitives: `is_transitive_typeddict`, `has_extra_items_transitive`, `transitive_typeddict_names`, `strip_typeddict_qualifiers`, `class_by_name`. These exist because `ClassInfo::is_typed_dict` is only `true` for classes that name `TypedDict` directly, so a subclass (`class Album(NamedDict): ...`) was invisible to every `TypedDict` rule. That was the single root cause of both the missed diagnostics and the E0014 false positives.
- `crates/basilisk-resolver/src/visitor/typeddict_schema.rs` provides `effective_fields`, which merges a `TypedDict`'s own and inherited fields (most-derived declaration wins), carrying each field's `ReadOnly` qualifier.
- … (`crates/basilisk-checker/src/rules/e0038.rs`): new pure decision functions `parse_field_qualifiers`, `redeclaration_violation`, `value_type_incompatible`, `type_head`, `is_invariant_container`, `bases_conflict`.
- … (`benchmarks/run.sh`): `e0038_typeddict_readonly_inheritance.py` (the new rule), plus `e0056`, `e0093`, `e0050`, `e0036`; each is ~2000 lines and verified to fire its rule 840–4000×.
- … `crates/basilisk-checker/tests/mutation_kill_tests.rs`.
- … (`examples/cpu_demo.py`, `examples/memory_demo.py`, `examples/README.md`): clean, fully-typed scripts meant to be run (F5) rather than statically checked. `cpu_demo.py` produces a lopsided CPU flame chart (hot `is_prime` / deep `fib_recursive` / asleep `cold_io`); `memory_demo.py` exercises a sustained leak, a transient spike, and a `__del__` reference cycle for the memory profiler; the README catalogues every example with its expected diagnostics.
- `assert_plenary_pass` (`scripts/common.sh`): a result-based gate for the Neovim e2e suite (see "Changed" below), plus spec section `[LSPTEST-EDITOR-SPECIFIC-INTEGRATION-NEOVIM-E2E-GATE]` in `docs/specs/LSP-TEST-INTEGRATION-SPEC.md`.
- … `[CHKARCH-DIAG-TYPEDDICT-READONLY-INHERITANCE]` in `docs/specs/CHECKER-ARCHITECTURE-SPEC.md`.

What Was Changed or Deleted?
- … `ReadOnly`; a required item may not be redeclared not-required; writable value types are invariant while `ReadOnly` value types may narrow to a subtype (same invariant container, i.e. `list`/`dict`/`set`, with different args is rejected; a different container head is allowed); multiple inheritance with conflicting core type / required-ness / read-only-ness is rejected. Field-override diagnostics now point at the field, not the class name.
- … (`e0014/mod.rs`) recognises transitive `TypedDict` subclasses so it skips their dict-literal assignments (field-level checking belongs to E0093); this clears the FP cluster across the read-only suite.
- … (`final_readonly.rs`, `typeddict.rs`, `typeddict_ext.rs`, `core.rs`) operate on the effective merged schema: inherited-`ReadOnly` writes are flagged, and wrong-type / missing-required checks run against inherited fields, including plain reassignment.
- `del td[k]` and the mutating dict methods now honour `extra_items` (PEP 728).
- … `strip_td_wrappers` / `try_strip_wrapper` (resolver) and `is_typed_dict_class` (checker), now sharing `strip_typeddict_qualifiers` / `is_transitive_typeddict`.
- … (`scripts/test-nvim.sh`): the `PlenaryBustedDirectory` run is now gated on parsed results rather than the nvim process exit code. Under `make ci`'s parallel `-j3` load the `PlenaryBustedDirectory` parent can exit non-zero on teardown (a late-reaped LSP child / async handle) even when every test passed, which is a flaky false failure. `assert_plenary_pass` instead requires that every spec file started AND emitted a summary with zero failures, zero errors, and no Lua traceback. This is strictly stronger than the exit code: it also catches a run that silently executed no tests, which the old gate passed.
- `coverage-thresholds.json` FP ceiling 126→120 (decreases monotonically); `conformance_status.csv` regenerated (137/146).

How Do The Automated Tests Prove It Works?
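The E0038 redeclaration rules, and the paired must-fire / must-not-fire style used to test them, can be illustrated with a small Python model. It is a sketch under stated assumptions rather than the Rust decision functions: the `Field` record and the string-based type comparison are hypothetical simplifications, no real subtype check is performed, and only the rules named in this PR are modelled (multiple-inheritance conflicts are left out).

```python
from dataclasses import dataclass
from typing import Optional

INVARIANT_CONTAINERS = {"list", "dict", "set"}


@dataclass(frozen=True)
class Field:
    """One TypedDict item: value type plus its two qualifiers (hypothetical model)."""
    type_name: str
    read_only: bool = False
    required: bool = True


def type_head(type_name: str) -> str:
    """Return the part before any type arguments, e.g. 'list' for 'list[int]'."""
    return type_name.split("[", 1)[0].strip()


def value_type_incompatible(base: Field, derived: Field) -> bool:
    """Coarse reading of the value-type rule, comparing type spellings only."""
    if base.type_name == derived.type_name:
        return False
    if not base.read_only:
        return True  # writable items are invariant: any change is rejected
    head = type_head(base.type_name)
    # ReadOnly items may narrow, but not by changing the arguments of the
    # same invariant container; a different head is allowed in this model.
    return head == type_head(derived.type_name) and head in INVARIANT_CONTAINERS


def redeclaration_violation(base: Field, derived: Field) -> Optional[str]:
    """Return a reason string if the redeclaration is illegal, else None."""
    if derived.read_only and not base.read_only:
        return "writable item redeclared as ReadOnly"
    if base.required and not derived.required:
        return "required item redeclared as not-required"
    if value_type_incompatible(base, derived):
        return "incompatible value type"
    return None


# Each rule gets one redeclaration that must fire and one that must not.
assert redeclaration_violation(Field("str"), Field("str", read_only=True))
assert redeclaration_violation(Field("str", read_only=True), Field("str")) is None
assert redeclaration_violation(Field("float"), Field("int"))
assert redeclaration_violation(
    Field("float", read_only=True), Field("int", read_only=True)) is None
```

Keeping the decision functions pure, as the PR describes, is what makes this pairing cheap: each case is two values in and one verdict out, with no checker state to set up.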
- … (`conformance_tests::conformance_score`): `typeddicts_readonly_inheritance.py` now PASS (caught 11, missed 0, FP 0); scorecard reports `137/146` and `False+: 120`; both gates print `Conformance gate: 93% … PASS` and `FP gate: 120 <= 120 ceiling — PASS`.
- `#[mutation_safe(rule = "e0038", …)]` tests (`mutant_e0038_parse_readonly`, `…_required_relaxing`, `…_writable_invariant`, `…_invariant_container`, `…_type_head`, `…_field_override`, `…_bases_conflict`, `…_single_base_no_conflict`) each assert both a violating redeclaration that MUST fire and a legal one that MUST NOT, including a 3-base case that kills the `len() < 2` boundary mutant.
- … (`scripts/test-rust.sh`).
- `e0038_tests`, `e0056`, `e0093` integration suites and all 125 `checker_tests` pass unchanged.
- `make ci` is green end-to-end (exit 0): lint + Rust tests/coverage + VSIX (328 passing, 87% ≥ 86%) + Neovim e2e (16/16 files ran, 16 summaries, 0 failed, 0 errored, coverage 44% ≥ 30%) + build. The Neovim gate itself was verified against five inputs: it passes a clean run yet fails on an injected test failure, an errored test, a missing-summary (truncated) run, and a Lua traceback.

Spec / Doc Changes
- `docs/specs/CHECKER-ARCHITECTURE-SPEC.md`: new `#### ReadOnly TypedDict inheritance {#CHKARCH-DIAG-TYPEDDICT-READONLY-INHERITANCE}` section; E0038/resolver code and the mutation tests reference this spec ID.
- `docs/plans/CHECKER-PEP-CONFORMANCE-PLAN.md`: marks `typeddicts_readonly_inheritance.py` DONE; score 137/146.
- `CLAUDE.md`: codifies the monotonic conformance / FP / benchmark policy (pre-existing staged guideline edit carried in this branch).

Breaking Changes
… `TypedDict` redeclarations and new `examples/` scripts; no public API removed (resolver gained exports only).