docs(plans): add jepa_cross_modal_alignment.md (1.2+ research direction) #252
Merged
Conversation
Roy-5a-substrate-on (PR #251) surfaced a structural finding the plan it was meant to resolve did not model: SensorEncoder (384-dim) and LinguisticEncoder (768-dim) embed in different-dimensional spaces. Cross-modality cosine is mathematically undefined; any cosine-based cross-modal alignment is structurally impossible without a learned projection layer.

This plan consolidates three previously-disjoint sketches that were each working around the same gap without acknowledging each other:

- cross_modal_substrate_binding.md Stage 4a — Hebbian binding on raw cosine. Cancelled by Roy-4 because cosine on different-dim vectors is undefined.
- roy_5_encoder_alignment_disambiguator.md Stage 4b — "encoder replacement to 1.2+ research direction." JEPA is the bio-defensible answer.
- grounded_language_acquisition.md Phase 2 — symbol-binding layer sketched as "small MLP, or a tiny RNN." Phase 2's sketch is structurally a one-modality JEPA.

The plan stays DRAFT until roy_5_encoder_alignment_disambiguator.md Stage 3 (cradle-arc redesign) ships and produces sufficient paired training data. Stage 0 of this plan is a ~50 LOC data audit on Roy-5b's existing data; only after that PASSES does any code work begin.

Load-bearing rules captured in the plan:

- No pretrained-model imports (no CLIP/ImageBind transfers).
- No central hand-curated (sensor, word) lexicon.
- Contamination guard is a CI test, not a convention.
- JEPA adds a projection on TOP of the existing encoders; SensorEncoder + LinguisticEncoder are unchanged.
- use_projection=False default; opt-in flag on EC.
- Earliest possible landing is 1.2.

README also updated with the 1.1-track triangle:

- roy_5_encoder_alignment_disambiguator.md → Stage 1 SHIPPED
- cross_modal_substrate_binding.md → CANCELLED by Roy-4
- jepa_cross_modal_alignment.md → DRAFT

Total ~1,250 LOC over 8-12 weeks once unblocked. On par with the 1.0 cleanup wave's scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
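The dimensionality gap can be made concrete with a small sketch. Everything here is illustrative only: the 384/768 dims come from the PR, but the shared dimension (256), the random weight matrices (standing in for trained projection weights), and all function names are invented for this example, not the plan's API.

```python
import math
import random

random.seed(0)

sensor_vec = [random.gauss(0, 1) for _ in range(384)]      # SensorEncoder output (384-dim)
linguistic_vec = [random.gauss(0, 1) for _ in range(768)]  # LinguisticEncoder output (768-dim)

def dot(a, b):
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Cross-modal cosine on the raw embeddings is undefined:
try:
    cosine(sensor_vec, linguistic_vec)
except ValueError as e:
    print("raw cross-modal cosine fails:", e)

# A learned projection maps both modalities into one shared space,
# making cosine well-defined. Random weights stand in for trained ones;
# SHARED_DIM = 256 is a hypothetical choice, not from the plan.
SHARED_DIM = 256
W_sensor = [[random.gauss(0, 0.05) for _ in range(384)] for _ in range(SHARED_DIM)]
W_ling = [[random.gauss(0, 0.05) for _ in range(768)] for _ in range(SHARED_DIM)]

def project(W, v):
    return [dot(row, v) for row in W]

sim = cosine(project(W_sensor, sensor_vec), project(W_ling, linguistic_vec))
print(f"projected cross-modal cosine: {sim:.3f}")
```

With untrained weights the similarity is meaningless noise; the point is only that it becomes a well-defined scalar once both modalities live in one space, which is the structural gap the plan exists to close.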
Summary
Plan-only PR. Adds docs/plans/jepa_cross_modal_alignment.md as a 1.2+ research direction, plus a README slot for the 1.1-track triangle (roy_5 / cross_modal_binding / JEPA). No code changes.

The plan stays DRAFT until roy_5_encoder_alignment_disambiguator.md Stage 3 (cradle-arc redesign) ships and produces sufficient paired training data. Stage 0 of the JEPA plan is a ~50 LOC data audit on Roy-5b's existing data; only after that PASSES does any code work begin.

Why this plan exists
Roy-5a-substrate-on (PR #251) surfaced a structural finding the plan it was meant to resolve did not model: SensorEncoder (384-dim) and LinguisticEncoder (768-dim) embed in different-dimensional spaces. Cross-modality cosine is mathematically undefined; any cosine-based cross-modal alignment is structurally impossible without a learned projection layer.

Three previously-disjoint sketches were each working around this gap without acknowledging each other:
- cross_modal_substrate_binding.md Stage 4a
- roy_5_encoder_alignment_disambiguator.md Stage 4b
- grounded_language_acquisition.md Phase 2

Writing this plan now consolidates all three into one design without committing to implementation. The risk of NOT writing it is that three months from now someone could plausibly start the Phase 2 MLP without seeing it's the same problem cross_modal_substrate_binding Stage 4a couldn't solve and roy_5's Stage 4b deferred. Worst case: three half-implementations of the same alignment learner.

Load-bearing rules baked into the plan
- No pretrained-model imports (no CLIP/ImageBind transfers).
- No central hand-curated (sensor, word) lexicon, per feedback_interim_contamination.md.
- Contamination guard is a CI test, not a convention: tests/unit/test_jepa_no_contamination.py (planned) constructs synthetic curated pairs and FAILS the build if the training loader accepts pairs without substrate-provenance tags.
- use_projection=False default. EC's pattern_complete_or_separate gains an opt-in flag; existing call sites are unaffected.
- Nothing in v1_refinement.md should depend on JEPA; earliest possible landing is 1.2.

Plan structure
7 stages (~1,250 LOC, 8-12 weeks once unblocked):
- … (embed_projected)
- … use_projection parameter
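The opt-in surface the stages describe might look like the following sketch. The names pattern_complete_or_separate and use_projection come from the plan text, but the signature, types, and body are invented here purely to illustrate why a default-False keyword leaves existing call sites untouched.

```python
from typing import Sequence

def pattern_complete_or_separate(
    pattern: Sequence[float],
    use_projection: bool = False,  # opt-in flag from the plan; default preserves current behavior
):
    """Sketch only: the real EC signature and behavior are not shown in this PR."""
    if use_projection:
        # Hypothetical 1.2+ path: would route through the JEPA projection
        # head before pattern completion/separation.
        raise NotImplementedError("projection path lands no earlier than 1.2")
    # Default path: identical to today's behavior, so call sites that
    # never pass use_projection are unaffected by the new keyword.
    return list(pattern)

# An existing call site, unchanged and unaffected:
result = pattern_complete_or_separate([0.1, 0.2, 0.3])
```

The design point being sketched: adding a keyword argument with a behavior-preserving default is backward compatible, which is what lets the flag ship ahead of any trained projection.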
Test plan
ruff check / pytest are trivially clean.

🤖 Generated with Claude Code
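The planned contamination guard (tests/unit/test_jepa_no_contamination.py) could take roughly this shape. The loader name, pair schema, and substrate_provenance field are hypothetical stand-ins; only the required behavior, failing the build when the training loader accepts pairs without substrate-provenance tags, comes from the plan.

```python
# Sketch of the planned CI guard; load_training_pairs and the pair
# schema are assumptions, not the repo's actual API.

class ContaminationError(ValueError):
    """Raised when a (sensor, word) pair lacks a substrate-provenance tag."""

def load_training_pairs(pairs):
    """Accept pairs only if each carries a substrate-provenance tag."""
    for pair in pairs:
        if not pair.get("substrate_provenance"):
            raise ContaminationError(f"hand-curated pair rejected: {pair!r}")
    return list(pairs)

def test_loader_rejects_untagged_pairs():
    # Synthetic curated pair with no provenance tag: must fail the build.
    curated = [{"sensor": [0.0] * 384, "word": "red", "substrate_provenance": None}]
    try:
        load_training_pairs(curated)
    except ContaminationError:
        return  # guard fired, as the plan requires
    raise AssertionError("contamination guard failed to fire")

def test_loader_accepts_tagged_pairs():
    tagged = [{"sensor": [0.0] * 384, "word": "red",
               "substrate_provenance": "cradle-arc/run-007"}]  # hypothetical tag
    assert len(load_training_pairs(tagged)) == 1

test_loader_rejects_untagged_pairs()
test_loader_accepts_tagged_pairs()
```

Making the guard a test rather than a convention means any future loader that silently drops the provenance check turns CI red instead of quietly reintroducing a curated lexicon.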