feat(0.9.1): Wire-A — cluster-bias annotation prompt section by dennys246 · Pull Request #253 · dennys246/Maxim

dennys246 · 2026-05-15T18:40:43Z

Summary

Stage 2 of release_0_9_1.md. The interim signal-surfacing fix for the Roy-2c finding: NAc's _cluster_reward_bias map has the right tool keys (tool:sense_food_source) but the wrong cluster keys (priming clusters are structurally disjoint from test-fixture clusters under the LinguisticEncoder gap). The llm-primary proposer never reads the cluster-keyed bias map otherwise.

Wire-A renders agent-wide aggregated NAc reward bias per tool into the LLM prompt at IMPORTANT priority. The aggregation is deliberately agent-wide (NOT active-cluster-restricted) per the plan's 2026-05-13 revision — restricting to the active-cluster intersection reproduces exactly the bug Wire-A exists to fix.
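The agent-wide aggregation can be sketched as follows. This is a minimal illustration, not the code in `nac.py`: the `(agent_id, cluster_id, tool_key)` key layout, the free-function form, the alphabetical tie-breaker, and the `tool:touch_hazard` key are all assumptions made for the example. Only the "max |bias| per tool, sorted by |bias| descending, top_n=5, empty agent_id rejected" behaviour comes from this PR.

```python
def get_agent_tool_biases(cluster_reward_bias, agent_id, top_n=5):
    """Hypothetical stand-in for NAc.get_agent_tool_biases.

    Assumes the bias map is keyed by (agent_id, cluster_id, tool_key);
    the real key layout is not shown in this PR.
    """
    if not agent_id:
        raise ValueError("agent_id must be non-empty")

    strongest = {}
    for (owner, _cluster, tool), bias in cluster_reward_bias.items():
        if owner != agent_id:
            continue  # per-agent isolation
        prev = strongest.get(tool)
        # Keep the signed value with the largest magnitude, ignoring
        # which cluster it came from (deliberately NOT active-cluster-restricted).
        if prev is None or abs(bias) > abs(prev):
            strongest[tool] = bias

    # Sort by |bias| descending; tool name is an assumed deterministic tie-breaker.
    ranked = sorted(strongest.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return ranked[:top_n]


bias_map = {
    ("agent_a", "priming_cluster_1", "tool:sense_food_source"): 0.6,
    ("agent_a", "priming_cluster_2", "tool:sense_food_source"): 0.2,
    ("agent_a", "priming_cluster_2", "tool:touch_hazard"): -0.4,  # hypothetical tool
    ("agent_b", "priming_cluster_1", "tool:sense_food_source"): 0.9,
}
result = get_agent_tool_biases(bias_map, "agent_a")
print(result)  # [('tool:sense_food_source', 0.6), ('tool:touch_hazard', -0.4)]
```

Because no active-cluster filter is applied, the result is the same whether or not the currently active clusters overlap with the priming clusters, which is the property the Roy-2c regression guard pins.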

What ships

| File | LOC | Purpose |
| --- | --- | --- |
| `src/maxim/decisions/nac.py` | +89 | `NAc.get_agent_tool_biases(agent_id=, top_n=5)` + `NAc.decay_cluster_reward_biases()` (per-tick decay, added in fold) |
| `src/maxim/prompts/cluster_bias_annotation.py` | +176 (new) | Composer + 5-band bias-to-text mapper + shared `TRUTHY_DISABLE_VALUES` constant + `annotation_disabled_via_env` parser |
| `src/maxim/agents/bus.py` | +10 | `StructuredContext.cluster_bias_annotations: list[tuple[str, float]] \| None` field |
| `src/maxim/agents/prompt_builder.py` | +41 | `PromptBuilder._add_cluster_bias_annotation_section` at IMPORTANT priority, between `entity_context` and `guidance` |
| `src/maxim/runtime/agent_loop.py` | +60 | Producer site reads `MAXIM_DISABLE_CLUSTER_BIAS_ANNOTATION`, calls `nac.get_agent_tool_biases`. Per-tick `decay_cluster_reward_biases()` (fold). Narrow `except ValueError` (fold): loud-fail on misconfigured per-agent stash |
| `tests/conftest.py` | +23 | Autouse scrub for `MAXIM_DISABLE_CLUSTER_BIAS_ANNOTATION` |
| `tests/unit/test_wire_a_cluster_bias_annotation.py` | +464 (new) | 47 tests, 6 layers |
| `CLAUDE.md` | +3 | Env-var table entry |

Total: +906 / -1, 8 files.
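A minimal sketch of the shared env-var gate, for orientation. The truthy set (1/true/t/yes/y/on, case-insensitive) is taken from the commit message; the optional `environ` parameter, the whitespace stripping, and the `ENV_VAR` name are assumptions added so the example is testable without touching the real process environment.

```python
import os

ENV_VAR = "MAXIM_DISABLE_CLUSTER_BIAS_ANNOTATION"

# Values that switch the annotation OFF; anything else (or unset) leaves it ON.
TRUTHY_DISABLE_VALUES = frozenset({"1", "true", "t", "yes", "y", "on"})


def annotation_disabled_via_env(environ=None):
    """Return True when the disable flag is set to a truthy value.

    `environ` is a hypothetical injection point for tests; the real
    parser's signature is not shown in this PR.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR, "")
    return raw.strip().lower() in TRUTHY_DISABLE_VALUES


print(annotation_disabled_via_env({}))                # False (annotation ON)
print(annotation_disabled_via_env({ENV_VAR: "YES"}))  # True  (annotation OFF)
print(annotation_disabled_via_env({ENV_VAR: "0"}))    # False (annotation ON)
```

Keeping the constant and the parser in one module means the producer, the conftest scrub, and the tests cannot drift apart on what counts as "disabled".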

Two-lens pre-merge review

Per feedback_review_before_ship.md, spawned parallel architecture + bio-fidelity reviews before opening this PR. No critical blockers from either lens. Folded one Critical + four Important findings into commit bee42ca:

| Lens | Finding | Severity | Fix |
| --- | --- | --- | --- |
| Bio | `_cluster_reward_bias` had no per-tick decay → Wire-A becomes a permanent fossil | Critical | `NAc.decay_cluster_reward_biases()` + wired into per-tick block alongside the other decay calls. 4 new tests. |
| Arch | `except Exception` swallows misconfigured-agent-id errors that should surface for Roy-3 integrity | Important | Narrowed to `except ValueError` with WARNING log. |
| Arch (cross-confirmed) | Truthy-set duplicated between producer + test | Important | Extracted `TRUTHY_DISABLE_VALUES` + `annotation_disabled_via_env()` shared parser. |
| Bio | 5-band thresholds are hand-coded; needs explicit bio-defensible-bands audit trail | Important | Added "Why hand-coded bias bands are bio-defensible" section to module docstring. |
| Arch | Producer-site integration was untested end-to-end | Important | Added `TestProducerSiteSemantics` (4 tests). |
| Polish | `max([])` defensive `default=0` | Nice-to-have | Done. |
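The Critical fix can be illustrated with a small decay sketch. The commit message describes the real method only as bidirectional with an abs-value prune below 0.001; the multiplicative form, the `decay=0.99` rate, and the plain-dict map used here are assumptions for illustration.

```python
def decay_cluster_reward_biases(bias_map, decay=0.99, prune_below=0.001):
    """Shrink every bias toward zero and drop entries that become negligible.

    `decay` is a hypothetical per-tick factor; only the 0.001 prune
    threshold is stated in this PR.
    """
    for key in list(bias_map):  # copy the keys so entries can be deleted safely
        decayed = bias_map[key] * decay  # works for positive and negative biases
        if abs(decayed) < prune_below:
            del bias_map[key]
        else:
            bias_map[key] = decayed


biases = {"rewarding": 0.5, "aversive": -0.5, "faint": 0.001}
decay_cluster_reward_biases(biases)
print(sorted(biases))  # ['aversive', 'rewarding']
```

Without a step like this, a bias written once would keep the same magnitude forever, so the annotation would describe the agent's entire history rather than its recent experience.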

Deferred (subjective / out-of-scope):

Annotation-tone refinement ("Substrate associations" vs "Felt familiarity") — subjective; user override later.
Budget-eviction regression test — natural surface is Roy-3, not a unit test.
Section name singular/plural mismatch — cosmetic.

Frozen contract impact

None. _cluster_reward_bias already persists; we're adding a read path + per-tick decay path. StructuredContext is a plain @dataclass (not frozen). No new persisted dataclass fields, no _format_version bumps, no signature changes on frozen types.

Stage 0b/0c dependency

Wire-A reads _cluster_reward_bias which is populated by the existing update_cluster_reward write path. Stages 0b/0c are Roy-3 MEASUREMENT telemetry only — no runtime dependency. Shipping Wire-A first is safe; 0b/0c can land in a separate PR before Roy-3 begins.

Test plan

python -m pytest tests/unit/test_wire_a_cluster_bias_annotation.py -v — 47 passed (8 NAc aggregation + 8 bias-band boundary + 5 composer + 3 PromptBuilder helper + 15 env-var gate + 4 decay + 4 producer-site).
python -m pytest tests/unit/test_wire_a_cluster_bias_annotation.py tests/unit/test_nac*.py tests/unit/test_prompt_builder*.py -q — 172 passed (no regression).
python -m pytest tests/ -x -q -m "not slow" --ignore=tests/integration/test_memory_hub.py — 6607 passed (full fast suite before fold; fold added pure test additions).
ruff check + ruff format clean on touched files.
Roy-2c regression guard (test_disjoint_cluster_regression_guard): a sim where active EC clusters are DISJOINT from priming clusters MUST still produce the annotation block. Pinned.
Roy-3 validation iteration — runs AFTER Wire-A + Wires 1+2+3 ship, per release_0_9_1.md Stage 5. NOT this PR.

What's next in 0.9.1

Per release_0_9_1.md:

✅ Stage 0a (Roy-2c probe) — shipped earlier
⏳ Stages 0b/0c (telemetry) — separate follow-up
⏳ Stage 1 (Wire 3: embodiment-state → tool filter)
✅ Stage 2 (Wire-A) — this PR
⏳ Stage 3 (Wire 2: Pavlovian percept aversion)
⏳ Stage 4 (Wire 1: risk-sensitive action annotation)
⏳ Stage 5 (Roy-3 validation)

🤖 Generated with Claude Code

Stage 2 of release_0_9_1.md. The interim signal-surfacing fix for the Roy-2c finding: NAc has the right tool keys (sense_food_source) but wrong cluster keys (priming clusters are structurally disjoint from test-fixture clusters under the LinguisticEncoder gap). The llm-primary proposer never sees the cluster-keyed bias map otherwise.

Wire-A renders agent-wide aggregated NAc reward bias per tool into the LLM prompt at IMPORTANT priority. The aggregation is deliberately agent-wide (not active-cluster-restricted) because Roy-2c proved active clusters disjoint from priming clusters; restricting Wire-A to the active-cluster intersection reproduces exactly the bug it exists to fix.

Per the plan's revised 2026-05-13 design:

- NAc.get_agent_tool_biases(agent_id=, top_n=5): aggregates across all clusters per agent, keeping max(|bias|) per tool, returning sorted by |bias| descending with stable tie-breaker.
- compose_cluster_bias_annotation_section(): renders top-N as a structured block with 5 bands (strongly rewarding / mildly rewarding / neutral / mildly aversive / strongly aversive). All-neutral blocks are skipped (no signal); mixed blocks preserve neutral entries so the ordering itself is informative.
- StructuredContext.cluster_bias_annotations field: additive, populated upstream at the LLM-submission site in agent_loop.py.
- PromptBuilder._add_cluster_bias_annotation_section: reads the context field, renders via the composer, adds to budgeter at IMPORTANT priority (between entity_context and guidance sections).
- Producer site at agent_loop.py:~2761: reads MAXIM_DISABLE_CLUSTER_BIAS_ANNOTATION env var (default OFF = annotation ON). Truthy values (1/true/t/yes/y/on, case-insensitive) disable the annotation for the Roy-3 ablation iteration.
- Conftest autouse scrub per CLAUDE.md "opt-in env vars in hot startup paths need autouse scrubs" rule.

Frozen contract impact: none. _cluster_reward_bias already persists; we're adding a read path. No new dataclass fields on frozen types (StructuredContext is mutable).

Test surface (34 tests, 4 layers):

- Layer 1 (NAc.get_agent_tool_biases): 8 tests including the Roy-2c regression guard (disjoint-cluster scenario must still produce annotation), max(|bias|) aggregation, per-agent isolation, empty agent_id rejection.
- Layer 2 (bias_to_band + composer): 10 tests pinning all 5 band boundaries (0.5 / 0.1 / -0.1 / -0.5 inclusive/exclusive semantics), tool: prefix stripping, mixed-neutral preservation.
- Layer 3 (PromptBuilder helper): 3 tests covering None/empty short-circuit + IMPORTANT priority.
- Layer 4 (env-var gate semantics): 12 parametric tests covering the truthy-value match table the producer reads.

All 34 Wire-A tests pass. Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
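The band mapper and composer described in this commit can be sketched roughly as below. The thresholds (0.5 / 0.1 / -0.1 / -0.5) and band labels come from the commit message; which side of each boundary is inclusive, the header wording, the line layout, and returning `None` for a skipped block are assumptions, since the PR does not spell them out.

```python
def bias_to_band(bias):
    """Map a signed bias to one of five display labels.

    Assumption: a value exactly on a boundary belongs to the stronger band.
    """
    if bias >= 0.5:
        return "strongly rewarding"
    if bias >= 0.1:
        return "mildly rewarding"
    if bias > -0.1:
        return "neutral"
    if bias > -0.5:
        return "mildly aversive"
    return "strongly aversive"


def compose_cluster_bias_annotation_section(annotations):
    """Render (tool_key, bias) pairs as a text block, or None if there is no signal."""
    if not annotations:
        return None
    rendered = []
    for tool, bias in annotations:
        name = tool[len("tool:"):] if tool.startswith("tool:") else tool
        rendered.append((name, bias_to_band(bias)))
    # An all-neutral block carries no signal, so it is skipped entirely.
    if all(band == "neutral" for _, band in rendered):
        return None
    # default=0 keeps max() from raising if a future filter empties the list.
    width = max((len(name) for name, _ in rendered), default=0)
    lines = ["Substrate associations (from prior experience):"]  # hypothetical header
    lines += [f"- {name.ljust(width)} : {band}" for name, band in rendered]
    return "\n".join(lines)


section = compose_cluster_bias_annotation_section(
    [("tool:sense_food_source", 0.6), ("tool:idle", 0.0)]  # "idle" is hypothetical
)
print(section)
```

In a mixed block the neutral entry is kept, so the reader of the prompt still sees where it ranks relative to the non-neutral tools.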

Two-lens pre-merge review of feat/0-9-1-wire-a landed one Critical bio-fidelity finding and four cross-confirmed Important findings. No blockers from either lens; folding before opening the PR per feedback_review_before_ship.md.

Critical (bio-fidelity):

- _cluster_reward_bias had no per-tick decay. Without it, Wire-A's annotation becomes a permanent fossil of every reward the substrate ever saw, claiming "from prior experience" while actually being "from forever ago." The bio reviewer correctly flagged this as a by-accretion contamination of the substrate-voice thesis. Fix: NAc.decay_cluster_reward_biases() mirrors decay_goal_reward_biases (bidirectional, abs-value prune below 0.001), wired into the per-tick decay block in agent_loop.py alongside decay_reward_biases() and decay_goal_reward_biases(). 4 new tests in TestClusterBiasDecay covering shrinkage, prune threshold, negative-bias preservation, empty-map no-op.

Important (architecture):

- except Exception in the producer was a band-aid. Per CLAUDE.md "loud-failure recurring bugs can stay on helper-discipline; silent-failure bugs in correctness-critical paths jump to structural enforcement." Narrowed to except ValueError so an invalid agent_id surfaces as WARNING, not silent annotation-off. Other exceptions propagate: the producer is on the LLM-submission hot path and a real bug here is Roy-3 evidence-integrity-critical.
- Truthy-set definition was duplicated between agent_loop.py and the test. Extracted to TRUTHY_DISABLE_VALUES + annotation_disabled_via_env() in prompts/cluster_bias_annotation.py. The conftest scrub, the producer, and the test suite now all consult the same constant.

Important (cross-confirmed):

- The 34-test suite verified the truthy-set mathematically but did not exercise the producer-block end-to-end. New TestProducerSiteSemantics covers: producer-populates-context, empty-agent-id-raises-ValueError, env-var-disable-short-circuits, no-NAc-skips-quietly.

Important (bio-fidelity):

- The 5-band bias-to-text mapping is hand-coded thresholds. Added a "Why hand-coded bias bands are bio-defensible" section to the module docstring: bias values are substrate-earned via reward_bias_alpha accumulation; band labels are display-layer translation; no band-derived signal flows back into encoders / EC / NAc / any substrate write path. Closes the contamination audit trail per feedback_interim_contamination.md.

Nice-to-have:

- compose_cluster_bias_annotation_section's max() now uses default=0 so a future-caller predicate filtering rendered to empty doesn't raise.
- Producer docstring clarifies the deliberation-cycle reuse pattern: Wire-A's read fires once per build_context, not per submission, so cycles 2-N see the snapshot at first submission. Bias change within a 1-tick deliberation cycle is bounded by alpha=0.15 per call; the stale-by-one-tick read is acceptable and avoids per-cycle NAc lock contention.

Total test count: 47 (up from 34). All passing.

Deferred to follow-up:

- Annotation-tone refinement ("substrate associations" vs "felt familiarity"): subjective rendering decision; defer to user.
- Budget-eviction regression test under WMS-at-capacity: natural surface test is Roy-3, not a unit test.
- Section name singular/plural mismatch (cosmetic).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
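The narrowed error handling at the producer site might look roughly like this. It is a sketch under stated assumptions: `populate_cluster_bias_annotations`, `FakeNAc`, `BrokenNAc`, and the logger name are hypothetical, and the real producer lives inline in agent_loop.py. Only the behaviours named in the commit message are modelled: populate the context, log a WARNING on `ValueError`, short-circuit when disabled, skip quietly with no NAc, and let every other exception propagate.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional, Tuple

logger = logging.getLogger("maxim.sketch")  # hypothetical logger name


@dataclass
class StructuredContext:
    """Trimmed stand-in: only the field this PR adds."""
    cluster_bias_annotations: Optional[List[Tuple[str, float]]] = None


def populate_cluster_bias_annotations(context, nac, agent_id, disabled):
    """Hypothetical helper mirroring the producer block's semantics."""
    if disabled or nac is None:
        return  # env-var gate or missing NAc: leave the field as None
    try:
        context.cluster_bias_annotations = nac.get_agent_tool_biases(
            agent_id=agent_id, top_n=5
        )
    except ValueError as exc:
        # Misconfigured agent_id: surface loudly, but keep the tick alive.
        logger.warning("cluster-bias annotation skipped: %s", exc)
    # Any other exception is deliberately NOT caught.


class FakeNAc:
    def get_agent_tool_biases(self, agent_id, top_n=5):
        if not agent_id:
            raise ValueError("agent_id must be non-empty")
        return [("tool:sense_food_source", 0.6)][:top_n]


class BrokenNAc:
    def get_agent_tool_biases(self, agent_id, top_n=5):
        raise RuntimeError("simulated real bug")


ctx = StructuredContext()
populate_cluster_bias_annotations(ctx, FakeNAc(), "agent_a", disabled=False)
print(ctx.cluster_bias_annotations)  # [('tool:sense_food_source', 0.6)]
```

The design choice is that a bad `agent_id` is a configuration problem worth a warning, whereas anything else on this path is treated as a genuine bug that should stop the run rather than silently turn the annotation off.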

dennys246 and others added 2 commits May 15, 2026 09:57

dennys246 mentioned this pull request May 16, 2026

feat(0.9.1): Stages 0b + 0c — action JSONL telemetry + recommend_action emission #254

Merged

5 tasks

Merge branch 'main' into feat/0-9-1-wire-a

e1c0382

dennys246 merged commit 629745a into main May 16, 2026
5 checks passed

dennys246 deleted the feat/0-9-1-wire-a branch May 16, 2026 15:14

dennys246 merged 3 commits into main from feat/0-9-1-wire-a
