R3 Stage A per the plan in ~/.claude/plans/ethereal-forging-cookie.md and tracking issue #87.

Flips the canonical/legacy direction for 7 class pairs so the software-vocabulary names are now the *real* class definitions and the legacy biology names are one-line aliases. Identity is preserved in both directions (`Document is Gene` and `Gene is Document` both evaluate True); only `__name__` changes (it now reports the canonical name).

schemas.py:
  ChromatinState -> LifecycleTier (IntEnum; enum values stay)
  PromoterTags -> DocumentTags
  EpigeneticMarkers -> DocumentSignals
  Gene -> Document
  GeneAttribution -> DocumentAttribution

genome.py:
  Genome -> KnowledgeStore

ribosome.py:
  Ribosome -> Compressor

aliases.py:
  Inverts the import direction. Pre-R3: `Gene as Document` (Gene was real). Post-R3: imports `Document` directly (Document is real) plus `_RENAME_LOG` provenance dict. Both import surfaces continue to resolve to the same class object.

Out of scope per the R3 plan (stays untouched in Stage A):
- Pydantic field names (gene_id, promoter, epigenetics, chromatin, codons, harmonic_links, gene_attribution) -- SQL contract
- SQL table/column names
- Prometheus metric/label names
- ChromatinState enum value strings (OPEN/EUCHROMATIN/HETEROCHROMATIN) -- serialized as TEXT, queryable
- MCP tool names (R4 territory)
- agent_prompt.py contract field names
- Method names (pack/splice/replicate/re_rank/upsert_gene/query_genes/_express) -- Stage C
- Module file names -- Stage B
- Local variables and parameters -- Stage D

Verification:
- Identity contract: 7/7 pairs `legacy is canonical` evaluate True
- All __name__ attributes report the canonical name (e.g. Gene.__name__ == 'Document')
- Pydantic JSON round-trip OK via both Gene + Document import surfaces; chromatin enum int values preserved (EUCHROMATIN == 1)
- tests/test_genome.py + test_retrieval_dimensions.py + test_server.py: 127/127 pass (matches baseline byte-for-byte)
- Zero brittle `__name__ ==` reflection patterns in the codebase (verified by grep pre-edit)
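
The class-def flip described above can be sketched in a few lines. This is a minimal illustration, not the project's code: plain classes stand in for the Pydantic models, and the `OPEN` and `HETEROCHROMATIN` integer values are assumptions (only `EUCHROMATIN == 1` is stated in the commit message).

```python
from enum import IntEnum


class LifecycleTier(IntEnum):
    """Canonical enum; the member names keep the legacy strings."""
    OPEN = 0             # assumed value, not stated in the commit
    EUCHROMATIN = 1      # value stated in the verification notes
    HETEROCHROMATIN = 2  # assumed value, not stated in the commit


# Legacy name is a one-line alias placed right after the class def.
ChromatinState = LifecycleTier


class Document:
    """Stand-in for the real model; gene_id is kept as the field name."""

    def __init__(self, gene_id: str) -> None:
        self.gene_id = gene_id


# Legacy alias: the same class object under a second name.
Gene = Document

# Identity holds in both directions.
assert Gene is Document and Document is Gene
# __name__ is an attribute of the class object, so the alias reports the canonical name.
assert Gene.__name__ == "Document"
# Instances built through either name are instances of the same class.
assert isinstance(Gene("g1"), Document)
# Enum integer values are untouched by the rename.
assert ChromatinState.EUCHROMATIN == 1
```

Because an alias is only a second binding to one class object, `isinstance` checks, pickling by reference, and equality all behave as before; the one observable difference is the string returned by `__name__`, which is why the commit greps for `__name__ ==` patterns first.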
R3 Stage B.1 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and issue #87.

Renames helix_context/ribosome.py to helix_context/compressor.py so the canonical filename matches the canonical class (`Compressor`, established in R3 Stage A). The old path remains as a back-compat shim that re-exports every module-level name (public + single-underscore private) from the new location.

Files:
- helix_context/ribosome.py -> helix_context/compressor.py (git rename)
- helix_context/ribosome.py (new file -- shim)

The shim uses a dir()-walk loop to re-export every non-dunder name, which covers the historical private-name leakage that training and test files rely on:
- _parse_json, _EXPRESS_SYSTEM, _splice_system, _PACK_SYSTEM, _KV_EXTRACT_SYSTEM, _REPLICATE_SYSTEM
- Plus all public classes: Compressor, Ribosome (alias), OllamaBackend, ClaudeBackend, DeBERTaRibosome, etc.

Verification:
- Direct import works: from helix_context.compressor import Compressor
- Legacy import works: from helix_context.ribosome import Ribosome
- Package-level import works: from helix_context import Ribosome
- All four import paths resolve to the same class object
- Private name access works: from helix_context.ribosome import _parse_json
- tests/test_ribosome.py + test_genome.py + test_retrieval_dimensions.py: 72/72 pass in 5s
R3 Stage B.2 per ~/.claude/plans/ethereal-forging-cookie.md and #87.

Renames the largest module in the codebase (4588 lines) so the canonical filename matches the canonical class (``KnowledgeStore``, established in R3 Stage A). The old path stays as a shim that re-exports every module-level name (public + private).

Files:
- helix_context/genome.py -> helix_context/knowledge_store.py (rename)
- helix_context/genome.py (new file -- shim)

SQL contracts unchanged: the on-disk SQLite schema still references tables/columns as ``genes``, ``gene_id``, ``gene_attribution``, ``harmonic_links``, ``chromatin``, ``promoter``, ``epigenetics``, ``codons``. Only the Python module filename and class identity moved.

Shim covers private-name leakage for ``_kv_keys_from_list`` used by scripts/backfill_path_key_index.py.

Verification:
- 3-path identity: Genome is KnowledgeStore is helix_context.Genome
- path_tokens / file_tokens / _kv_keys_from_list all reachable
- Genome.__name__ == 'KnowledgeStore' (canonical)
- tests/test_genome.py + test_retrieval_dimensions.py + test_server.py: 127/127 pass in 2:01
R3 Stage B.3 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and tracking issue #87.

Files:
- helix_context/codons.py -> helix_context/fragments.py (rename)
- helix_context/codons.py (new shim -- re-exports everything)

Class names (Codon / CodonChunker / CodonEncoder / RawStrand) and the Pydantic field name (Gene.codons / Document.codons) are unchanged -- Stage C may rename the helper class identifiers later.

Verification:
- import surfaces resolve via Codon (codons), Codon (fragments), and helix_context.Codon all to the same class
- tests/test_codons.py + test_genome.py: 55/55 pass in 0.4s
R3 Stage B.4 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and tracking issue #87.

Files:
- helix_context/replication.py -> helix_context/persistence.py (rename)
- helix_context/replication.py (new shim)

Class identifier (ReplicationManager) is unchanged -- Stage C may rename the method-level surface (replicate -> persist).

Verification: 3-path identity + tests/test_server.py + test_genome.py 114/114 pass in 2:06.
R3 Stage B.5 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and tracking issue #87. This completes Stage B (5 module renames).

Files:
- helix_context/hgt.py -> helix_context/cross_store_import.py (rename)
- helix_context/hgt.py (new shim)

Per docs/ROSETTA.md, "HGT" (horizontal gene transfer) is the legacy biology framing for what's just a cross-store document import operation. Function-level surface (export_genome / import_genome / genome_diff) is unchanged in Stage B; Stage C may rename to export_documents / import_documents / store_diff.

Verification:
- Imports work via both helix_context.hgt and helix_context.cross_store_import
- tests/test_health.py: 10/10 pass (+ 2 xfailed pre-existing)
Post-Stage-B verification complete ✓

Full mock suite ran after Stage B.5. Identical result to the post-Stage-A full suite (1933 / 0 failed, 9:07 vs 8:52 wallclock). The 5 module renames + 7 class flips + alias inversion together produced zero test regressions.

Ready for review.
…icate/re_rank) (#87)

R3 Stage C.1 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and issue #87.

Renames the 4 main methods on the Compressor class (was Ribosome) to canonical software vocabulary. Legacy method names remain valid as intra-class aliases pointing at the same function objects -- not wrappers, so latency histograms, identity checks, and call counters behave exactly as before.

compressor.py:
  Method renames inside class Compressor:
    pack -> encode (signature unchanged)
    re_rank -> rerank (signature unchanged)
    splice -> trim (signature unchanged)
    replicate -> persist (signature unchanged)
  Section header comments updated to match canonical names.
  End of class body now carries a "Legacy method aliases" block:
    pack = encode
    splice = trim
    replicate = persist
    re_rank = rerank

Internal caller updates (canonical names everywhere we own):
  context_manager.py:
    L709: self.ribosome.pack(...) -> self.ribosome.encode(...)
    L1880: self.ribosome.pack(...) -> self.ribosome.encode(...)
    L2336: hasattr(self.ribosome, "re_rank") -> hasattr(..., "rerank")
    L2339: self.ribosome.re_rank(...) -> self.ribosome.rerank(...)
  deberta_backend.py:
    L284: self._ollama.pack(...) -> self._ollama.encode(...)
    L290: self._ollama.replicate(...) -> self._ollama.persist(...)

Notes:
- The `self.ribosome` attribute name on HelixContextManager is unchanged in Stage C (Stage D variable sweep may rename it).
- DeBERTaRibosome.pack() and .replicate() methods themselves stay as-is (separate class; its own method-name story can be addressed later).
- The 4 internal helpers in DeBERTaRibosome that call into Compressor via self._ollama have been migrated to canonical names; the public method names (pack/replicate) on the DeBERTaRibosome class itself are preserved.

Reflection edge case at context_manager.py:2336 covered: both legacy ("re_rank") and canonical ("rerank") names resolve to the same function object, so the hasattr check works either way; updating the string to "rerank" matches the canonical method name.

Verification:
- Compressor.pack is Compressor.encode -> True (and 3 other pairs)
- tests/test_ribosome + test_deberta_backend + test_genome + test_retrieval_dimensions + test_server: 151/151 pass in 2:34
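
The "same function object, not a wrapper" property of the intra-class aliases can be shown with a small sketch. The method bodies below are toy placeholders, not the real Compressor logic; only the method names come from the commit message.

```python
class Compressor:
    def encode(self, text: str) -> str:
        # Placeholder body for illustration only.
        return text.upper()

    def rerank(self, items: list) -> list:
        # Placeholder body for illustration only.
        return sorted(items)

    # Legacy method aliases: plain name bindings in the class body.
    pack = encode
    re_rank = rerank


# Looked up on the class, both names return the one underlying function.
assert Compressor.pack is Compressor.encode
assert Compressor.re_rank is Compressor.rerank

# The function keeps its canonical __name__ under either attribute.
assert Compressor.pack.__name__ == "encode"

c = Compressor()
# Calls through either name run the same code.
assert c.pack("abc") == c.encode("abc") == "ABC"
# hasattr-style reflection succeeds for the legacy and canonical strings.
assert hasattr(c, "re_rank") and hasattr(c, "rerank")
# Bound methods are created per access, so compare the underlying function.
assert c.pack.__func__ is c.encode.__func__
```

Since no wrapper frame is added, anything keyed on the function object (decorators applied before aliasing, counters, profiling) sees one function regardless of which name the caller used.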
…ller updates (#87)

R3 Stage C.2 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and issue #87.

Renames the 5 main methods on KnowledgeStore (was Genome) to canonical software vocabulary, with intra-class aliases keeping every legacy caller working unchanged.

knowledge_store.py -- method renames inside class KnowledgeStore:
  upsert_gene -> upsert_doc
  query_genes -> query_docs
  query_genes_ann -> query_docs_ann
  query_genes_dense_recall -> query_docs_dense_recall
  get_gene -> get_doc

Each rename keeps:
- method signature unchanged (param names + types stay -- param rename is Stage D territory if we want it)
- method body identical to pre-rename

End of class body now carries a "Legacy method aliases" block. Each alias is the *same function object* as the canonical method, not a wrapper:
  upsert_gene = upsert_doc
  query_genes = query_docs
  query_genes_ann = query_docs_ann
  query_genes_dense_recall = query_docs_dense_recall
  get_gene = get_doc

Internal caller migration (canonical names everywhere in helix_context):
  api.py (2x get_gene -> get_doc)
  context_manager.py (5x upsert_gene, 1x query_genes, 1x query_genes_ann)
  context_packet.py (2x query_genes)
  cross_store_import.py (2x upsert_gene, 1x get_gene)
  expand.py (2x get_gene)
  fusion.py (1x docstring)
  knowledge_store.py (1x upsert_gene, 2x query_genes, 2x dense_recall, 2x get_gene)
  persistence.py (1x comment)
  registry.py (1x upsert_gene)
  server.py (5x get_gene)
  shard_router.py (2x query_genes)
  sharding.py (1x query_genes, 1x get_gene)

Total: 12 files, +52 / -37 lines.

External API parity preserved:
- api.py::gene_get(gene_id) still works -- now calls get_doc(gid) internally, but the public API name is unchanged.
- Test files (tests/*.py) NOT updated -- they call the legacy names which still resolve via aliases. Stage D may sweep these for consistency.
- SQL table/column names (genes, gene_id, gene_attribution, harmonic_links, chromatin, promoter, epigenetics, codons) untouched -- on-disk contract is preserved.

Verification:
- 5/5 method pairs: legacy method is canonical method (identity)
- tests/test_genome + test_retrieval_dimensions + test_server + test_ribosome + test_deberta_backend + test_health: 161 passed, 4 deselected, 2 xfailed in 2:20
…ics, fragments) (#87)

R3 Stage C.3 per the plan in ~/.claude/plans/ethereal-forging-cookie.md and issue #87. Completes the Stage C method/helper renames.

context_manager.py -- method renames inside HelixContextManager:
  _express -> _retrieve
  _make_parent_gene_id -> _make_parent_doc_id
  _upsert_parent_gene -> _upsert_parent_doc

cymatics.py -- module-level function renames:
  gene_spectrum -> doc_spectrum
  _cached_gene_spectrum -> _cached_doc_spectrum (LRU-cached wrapper)
  cached_gene_spectrum -> cached_doc_spectrum
  interference_splice -> interference_trim
  Internal callers in cymatics.py updated to canonical names.

fragments.py -- staticmethod rename inside CodonEncoder:
  codon_id -> fragment_id (staticmethod descriptor aliased)

Each rename adds the legacy name as a one-line alias pointing at the same function/method object -- `is` identity holds, and .cache_info() / .cache_clear() on the LRU wrappers still work via either name.

Internal caller migrations (canonical name everywhere we own):
  context_manager.py -- self._upsert_parent_gene, self._make_parent_gene_id, self._express (multiple sites) -> canonical equivalents
  server.py -- helix._express -> helix._retrieve
  cymatics.py -- internal calls to cached_gene_spectrum / _cached_gene_spectrum -> canonical names
  context_manager.py + server.py -- imports of cached_gene_spectrum from cymatics module -> cached_doc_spectrum (still re-exported via alias)

Test updates (monkey-patch path):
- tests/conftest.py -- fixture patches both `g.upsert_doc` (canonical, what internal code calls) and `g.upsert_gene` (legacy alias)
- tests/test_abstain_tier.py + test_foveated_splice.py -- patch both manager._retrieve and manager._express
- tests/test_dense_recall.py -- patch both query_docs / query_genes and query_docs_dense_recall / query_genes_dense_recall; switch test call site to canonical query_docs_ann
- tests/test_genome.py -- fixture patches both upsert_doc / upsert_gene
- tests/test_server.py -- TestDebugIntrospectionEndpoints fixture patches both manager._retrieve and manager._express

Why test updates matter: aliases give name-level identity at the *class* level. Instance-attribute monkey-patches only affect the patched name; if internal code uses the canonical name, the patch is bypassed. The fix is to patch both names. (Tests calling the methods directly without monkey-patching are unaffected -- the alias resolves to the same function.)

Reflection edge case (context_manager.py:2336 hasattr) is unaffected -- it was already fixed in C.1 to check the canonical name.

Verification:
- 7/7 alias pairs: legacy is canonical (identity preserved)
- tests/test_server::TestDebugIntrospectionEndpoints + test_dense_recall + test_abstain_tier + test_foveated_splice + test_cymatics: 106/106 pass in 1:09
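
The monkey-patch bypass explained above is easy to reproduce in isolation. The sketch below uses a hypothetical `Store` class with an `ingest` method standing in for internal code that calls the canonical name; only `upsert_doc` / `upsert_gene` are names from the commit.

```python
class Store:
    def upsert_doc(self, doc: str) -> str:
        return "real"

    # Legacy alias at class level: same function object.
    upsert_gene = upsert_doc

    def ingest(self, doc: str) -> str:
        # Internal code calls the canonical name after the migration.
        return self.upsert_doc(doc)


store = Store()

# A test patches only the legacy name, as an instance attribute.
store.upsert_gene = lambda doc: "patched"

# The patch is visible under the patched name...
assert store.upsert_gene("d") == "patched"
# ...but internal code looks up upsert_doc, so the patch is bypassed.
assert store.ingest("d") == "real"

# Fix used in the test updates: patch both names.
store.upsert_doc = store.upsert_gene
assert store.ingest("d") == "patched"
```

The alias links the two names only in the class namespace. Setting an attribute on an instance creates a separate entry in that instance's `__dict__` for one name, which is why each patched fixture now sets the canonical and legacy names together.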
… get_doc (#87)

3 tests in tests/test_api_walk.py were configuring `mgr.genome.get_gene.return_value = ...` (or .side_effect), but after Stage C.2 api.py:gene_get calls the canonical `genome.get_doc()`. On a MagicMock, the unconfigured .get_doc attribute returns a fresh MagicMock instead of the test's expected None / fake gene.

Switch the mock configuration to the canonical name (.get_doc). Identity-preserving aliases mean real KnowledgeStore instances still accept both names, but MagicMock attribute lookup is per-name.

Tests fixed:
- test_gene_get_delegates_to_genome
- test_gene_get_returns_none_on_unknown_id
- test_neighbors_sorts_by_similarity_desc

Also fixes a 4th mock configuration in the same file (line 266) for consistency, even though that test was already passing.
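
The per-name behaviour of MagicMock that caused these failures can be sketched as follows. The `gene_get` function here is a simplified, hypothetical stand-in for the one in api.py; the `mgr.genome.get_gene` / `get_doc` attribute names are from the commit message.

```python
from unittest.mock import MagicMock


def gene_get(manager, gene_id):
    # Simplified stand-in: after Stage C.2 the canonical name is called.
    return manager.genome.get_doc(gene_id)


mgr = MagicMock()

# Old test setup: only the legacy attribute is configured.
mgr.genome.get_gene.return_value = None

# get_doc was never configured, so the call returns an auto-created MagicMock.
result = gene_get(mgr, "unknown-id")
assert isinstance(result, MagicMock)
assert result is not None

# Fixed test setup: configure the name the code under test actually calls.
mgr.genome.get_doc.return_value = None
assert gene_get(mgr, "unknown-id") is None

# The two mock attributes are unrelated child mocks, unlike the real class aliases.
assert mgr.genome.get_gene is not mgr.genome.get_doc
```

A spec'd mock (for example `MagicMock(spec=KnowledgeStore)`) would still create separate child mocks for each name, so configuring the canonical name remains the direct fix.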
Stage C verification complete ✓

Full mock suite ran after Stage C.3 + test-mock fix. Identical pass count across all three Stage verifications.

Stage C added 17 method/helper renames + ~30 internal caller migrations + 7 test monkey-patch site updates, with zero net change in test outcomes.

What's now in the PR

Identity contract

29 identity-preserving aliases total.

No SQL schema change. No Pydantic field-name change. No MCP tool rename. No wire-format break.

Still in R3 (separate PRs)
mbachaud added a commit that referenced this pull request May 13, 2026
…spec (#87)

R3 Stage E -- closes the multi-stage rename effort with up-to-date documentation.

docs/ROSETTA.md:
  Phase-status table at the bottom now reflects reality:
    R1 -> shipped @ 09d5548 (2026-04-15)
    R2 -> shipped @ PR #70 87fcb68 (2026-05-12)
    R3 Stage A -> shipped @ 56fcbed (PR #88, 2026-05-13)
    R3 Stage B -> shipped @ 460d824..9e7471f (PR #88)
    R3 Stage C -> shipped @ edc0194..71469ba (PR #88)
    R3 Stage D -> in progress (this PR #89)
    R3 Stage E -> in progress (this commit)
    R4 -> deferred (see #87)
  Pre-existing entries (out-of-scope list, How-to-use, mapping table) unchanged.

docs/superpowers/specs/2026-05-13-rename-r3-symbol-rename-design.md (NEW):
  Durable design-spec record for R3 mirroring R2's structure (2026-05-11-rename-r2-prose-sweep-design.md). Captures:
  - Why R3 exists; predecessor specs referenced
  - Decisions baked in (class-def flip direction, module renames, no MCP slimdown)
  - Out-of-scope list (SQL, Pydantic fields, Prom metrics, ChromatinState enum values, MCP tool names, agent_prompt contract, LLM prompt strings)
  - Stage A/B/C/D/E summaries with commit SHAs + table-of-renames
  - Identity contract: 29 alias pairs across 7 layers
  - Verification gates (1933/0 across all 3 full-mock runs)
  - Stage D known-not-fully-completed scope (intentional scope cap)

Closes the R3 audit trail. Future contributors can audit the rename by reading this spec + the R2 spec + ROSETTA.md without needing to re-discover the design intent.
Summary
R3 Stages A and B per the plan in ~/.claude/plans/ethereal-forging-cookie.md and tracking issue #87. Closes Stage A and Stage B of the 5-stage R3 effort; Stages C / D / E remain.

All canonical software-vocabulary names are now the real class and module definitions; legacy biology names remain as identity-preserving aliases declared immediately after each class def, plus shim modules at every old import path.
What this PR does
Stage A: class-def flip + alias inversion (56fcbed)

For each pair below, the canonical name is now the real `class` def and the legacy name is a one-line module-level alias:

| Module | Canonical (real class) | Legacy (alias) |
| --- | --- | --- |
| schemas.py | LifecycleTier | ChromatinState |
| schemas.py | DocumentTags | PromoterTags |
| schemas.py | DocumentSignals | EpigeneticMarkers |
| schemas.py | Document | Gene |
| schemas.py | DocumentAttribution | GeneAttribution |
| genome.py | KnowledgeStore | Genome |
| ribosome.py | Compressor | Ribosome |

Identity holds in both directions: `Document is Gene` and `Gene is Document` both evaluate True. `__name__` reports the canonical name (e.g. `Gene.__name__ == 'Document'`). `aliases.py` was inverted to import the new canonical names from their home modules.
Stage B: module file moves + shim modules (460d824→9e7471f, 5 commits)

| Old path (now a shim) | New canonical path |
| --- | --- |
| helix_context/ribosome.py | helix_context/compressor.py |
| helix_context/genome.py | helix_context/knowledge_store.py |
| helix_context/codons.py | helix_context/fragments.py |
| helix_context/replication.py | helix_context/persistence.py |
| helix_context/hgt.py | helix_context/cross_store_import.py |

Each old path now contains a shim that walks `dir(_new_module)` and re-exports every non-dunder attribute. This covers historical private name leakage (`_parse_json`, `_EXPRESS_SYSTEM`, `_splice_system`, `_kv_keys_from_list`) used by training scripts and tests.

Out of scope (still in R3, later stages)

- Stage C: method renames (pack→encode, splice→trim, replicate→persist, re_rank→rerank, upsert_gene→upsert_doc, query_genes→query_docs, _express→_retrieve, plus cymatics helpers and the reflection-string edge case at context_manager.py:2336).
- Stage D: local variable and parameter sweep (gene→doc).
- Stage E: docs/ROSETTA.md phase-status refresh + R3 spec stub.

Out of scope (still protected, per R2 spec §2)
- SQL table/column names (tables genes, gene_attribution, harmonic_links; columns gene_id, promoter, epigenetics, chromatin, codons)
- ChromatinState enum value strings (OPEN/EUCHROMATIN/HETEROCHROMATIN) -- serialized as TEXT, queryable
- agent_prompt.py::HELIX_NO_MATCH_FRAGMENT JSON contract fields

Verification

pytest -m "not live"

Pydantic JSON round-trip preserved both ways. Identity contract holds across all 7 schema/module pairs. All 4 import paths to each class (legacy module, canonical module, aliases.py, package-level helix_context) resolve to the same class object.

How to review
- `git log --oneline 56fcbed^..9e7471f` -- 6 commits, one logical per stage / module.
- Module moves show as A + M instead of a single rename because git's rename detection breaks when the same path is modified to a shim in the same commit. Use `git log --follow -M50% <file>` for history traversal.
- The Stage B commits share one pattern: review one closely and the others rhyme.
Related
- Plan: ~/.claude/plans/ethereal-forging-cookie.md
- R1: 09d5548
- R2 spec: docs/superpowers/specs/2026-05-11-rename-r2-prose-sweep-design.md