feat(0.9.1): Wire 3 — embodiment-state → tool filter by dennys246 · Pull Request #255 · dennys246/Maxim

dennys246 · 2026-05-17T03:47:57Z

Summary

Stage 1 of release_0_9_1.md (lifted from bio_emergent_persona_foundations.md § Wire 3). The cleanest "emergent trait" demonstration in the foundations plan: an agent with a damaged body component visibly stops attempting affordances routed through it. Physical history shapes behavior without prompt-injection scaffolding.

What ships

| Component | Where | LOC |
| --- | --- | --- |
| `Embodiment.get_disabled_affordances` (filtered from prompt) | `src/maxim/embodiment/body.py` | +51 |
| `Embodiment.get_degraded_affordances` (annotated in prompt) | same | +24 |
| `Embodiment.integrity_to_felt_phrase` (felt-sensation mapper) | same | +20 |
| Agent_loop filter + annotate hook (with `WIRE_3_FILTER` JSONL emission) | `src/maxim/runtime/agent_loop.py` | +118 |
| Tests (41 tests, 9 layers) | `tests/unit/test_wire_3_embodiment_filter.py` | +725 (new file) |

Total: ~440 LOC source + 725 LOC test = 1,165. 41 unit tests + 163 module regression passing. Full fast suite (pre-fold): 6682 passed.

Band semantics

| Integrity | State | LLM-visible |
| --- | --- | --- |
| `integrity < 0.3` | disabled (filtered from `available_tools`) | tool absent from prompt |
| `0.3 ≤ integrity < 0.45` | degraded, low | `(feels weakened, prone to failing)` |
| `0.45 ≤ integrity < 0.6` | degraded, high | `(feels strained)` |
| `integrity ≥ 0.6` | healthy | unchanged |

The bands partition [0, 1] cleanly — no overlap, no gap. Pinned by TestBandPartitioning (8 parametric tests).
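
The band table can be read as two small pure functions. The sketch below is illustrative only: `classify_band` and `felt_phrase` are hypothetical names, not the actual `Embodiment` methods, and the constants are taken from the table above rather than from the source.

```python
from typing import Optional

# Thresholds as listed in the band table (assumed to match the defaults).
DISABLE_THRESHOLD = 0.3
FELT_SPLIT = 0.45
DEGRADE_THRESHOLD = 0.6


def classify_band(integrity: float) -> str:
    """Map an integrity value to exactly one of the three bands."""
    if integrity < DISABLE_THRESHOLD:
        return "disabled"   # strict: 0.3 itself is NOT disabled
    if integrity < DEGRADE_THRESHOLD:
        return "degraded"   # half-open band [0.3, 0.6)
    return "healthy"


def felt_phrase(integrity: float) -> Optional[str]:
    """Return the LLM-visible phrase for degraded integrity, else None."""
    if classify_band(integrity) != "degraded":
        return None
    if integrity < FELT_SPLIT:
        return "(feels weakened, prone to failing)"
    return "(feels strained)"
```

Because each comparison is a strict `<` against an ascending threshold, every value falls through to exactly one branch, which is the no-overlap, no-gap property the parametric tests pin.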

Two-lens pre-merge review

Per feedback_review_before_ship.md. Both reviews ran in parallel before opening this PR; 0 Critical from either lens. Folded 8 Important findings into commit 8352106:

Bio-fidelity findings (3 Important):

| Finding | Fix |
| --- | --- |
| B2 (highest signal): pre-filter silently bypasses the natural failure → pain → NAc learning chain for disabled tools; Roy-3 can't disambiguate "Wire 3 hid the tool" from "substrate learned avoidance" | Emit `sim_log("WIRE_3_FILTER", ...)` per LLM submission with disabled tools list + degraded integrities. Tick-aligned with Stages 0c/0d. |
| B3: annotation reads as system-voice metric badge (`[DAMAGED: integrity 0.X]`) rather than substrate's proprioceptive percept | Felt-sensation phrasing via `integrity_to_felt_phrase()`. Numeric integrity stays in JSONL event for Roy-3; LLM sees `(feels strained)` / `(feels weakened, prone to failing)` only. |
| B5: `compute_integrity` raise silently swallowed → "unknown state" conflated with "healthy state" | WARNING log with entity + modulator name, loop stability preserved (still fail-open to 1.0). |
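
The B5 fix (fail open, but log loudly) could look roughly like the following. This is a minimal sketch under assumptions: `safe_integrity` and the logger name are hypothetical, and the real guard lives inside `_iter_modulator_affordance_pairs`, whose exact shape is not shown here.

```python
import logging

logger = logging.getLogger("wire3_sketch")  # hypothetical logger name


def safe_integrity(entity_name: str, modulator: object) -> float:
    """Return a modulator's integrity, failing open to 1.0.

    A modulator without compute_integrity is treated as healthy (1.0).
    A modulator whose compute_integrity raises is also treated as 1.0,
    but the failure is logged at WARNING so it is not silently lost.
    """
    compute = getattr(modulator, "compute_integrity", None)
    if compute is None:
        return 1.0
    try:
        return float(compute())
    except Exception as exc:  # broad on purpose: loop stability comes first
        modulator_name = getattr(modulator, "name", type(modulator).__name__)
        logger.warning(
            "compute_integrity raised for %s/%s (%r); treating as 1.0",
            entity_name, modulator_name, exc,
        )
        return 1.0
```

The return value is the same in the "unknown" and "healthy" cases, so the loop behaves as before; only the log distinguishes them for an operator.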

Architecture findings (5 Important):

| Finding | Fix |
| --- | --- |
| A1: idempotency guard fails when integrity drifts across band boundaries (NAc + modulator repair can recover integrity multi-tick) | Regex strip `_WIRE3_PHRASE_RE` before re-applying current-tick phrase. Pinned by `test_annotation_idempotent_under_integrity_drift`. |
| A4: broad `except Exception` at DEBUG level swallows method-shape mismatches | Narrowed to `(AttributeError, TypeError)` at WARNING level so Roy-3 validation surfaces shape bugs. |
| A5: no test pins that annotation actually reaches the LLM-visible prompt string | `TestLlmRenderingRoundTrip` (2 tests) constructs `LLMRequest` with post-Wire-3 `tool_descriptions`, asserts `build_tools_section_filtered` produces a prompt containing the felt phrase. |
| A3: plan-language ambiguity (`integrity < 0.6 annotates`) | Docstring + inline-comment pin on `Embodiment` class. |
| B1: I/O-boundary audit trail | Top-of-section comment explicitly states Wire 3 thresholds gate the LLM proposer's tool surface, NOT substrate encoding. |
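
To illustrate the A1 fix, here is a sketch of strip-then-append annotation. The pattern and the `annotate` helper are assumptions for illustration; the actual `_WIRE3_PHRASE_RE` is not reproduced in this PR text and may differ.

```python
import re

# The two felt phrases from the band table.
_PHRASES = ("(feels strained)", "(feels weakened, prone to failing)")

# Hypothetical stand-in for _WIRE3_PHRASE_RE: matches either phrase plus
# any whitespace that precedes it.
_PHRASE_RE = re.compile(
    r"\s*(?:" + "|".join(re.escape(p) for p in _PHRASES) + r")"
)


def annotate(description: str, phrase: str) -> str:
    """Strip any earlier felt phrase, then append the current one."""
    base = _PHRASE_RE.sub("", description)
    return f"{base} {phrase}"
```

A substring guard only blocks re-adding the same phrase; stripping first means a description that carried "(feels strained)" last tick ends up with only the current-tick phrase after a band change.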

Deferred (3 nice-to-haves): structured annotation surface for budgeter awareness, registry-walk tool-name lookup, capability-composition dedup. None blocking — Roy/cradle/Reachy single-body topologies don't hit any of them.

Frozen contract impact

None. No new persisted state, no _format_version bumps, no dataclass changes. Pure read-side wiring per the plan.

Behavioral signal + Roy-3 measurement

The behavioral signal is what the plan calls out: a damaged-arm agent stops calling arm-routed affordances. Roy-3 measurement is now disambiguable thanks to the bio-fidelity fold:

Pre-Wire-3 baseline: damaged-arm agent keeps calling arm-routed affordances, fails via the SEM requires precondition, learns avoidance via NAc reward bias over many failures.
Post-Wire-3: damaged-arm agent never calls disabled arm-routed affordances; degraded-arm agent reads felt-sensation phrasing and adjusts. WIRE_3_FILTER JSONL event lets Roy-3 count exactly which tools were filtered each tick.
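
As a rough idea of how the Roy-3 count could be computed from the log, the sketch below tallies disabled tools across `WIRE_3_FILTER` events. The record schema (`event`, `disabled`) is an assumption made for this example; the PR states only that the event carries the disabled tools list and degraded integrities, not the field names.

```python
import json
from collections import Counter
from typing import Iterable


def count_filtered_tools(lines: Iterable[str]) -> Counter:
    """Count how often each tool appears as disabled in WIRE_3_FILTER events.

    Assumes one JSON object per line with hypothetical keys:
      "event":    event name, e.g. "WIRE_3_FILTER"
      "disabled": list of tool names filtered on that submission
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        if record.get("event") != "WIRE_3_FILTER":
            continue  # ignore other event types
        counts.update(record.get("disabled", []))
    return counts
```

Comparing these counts against failed-call counts from the pre-Wire-3 baseline is one way to separate "the filter hid the tool" from "the substrate learned avoidance".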

Test plan

python -m pytest tests/unit/test_wire_3_embodiment_filter.py -v — 41 passed (9 layers: get_disabled / get_degraded / band partitioning / tool-name pattern / hook shape / degenerate cases / felt phrases / WIRE_3_FILTER emission / LLM round-trip).
python -m pytest tests/unit/test_wire_3_embodiment_filter.py tests/unit/test_embodiment_failures.py tests/unit/test_embodiment_sem.py tests/unit/test_prompt_builder.py -q — 163 passed (no regression).
python -m pytest tests/ -x -q -m "not slow" --ignore=tests/integration/test_memory_hub.py — 6682 passed (full fast suite, pre-fold; fold added pure test additions).
ruff check + ruff format clean on touched files.
Roy-2c regression guard — Wire 3 is downstream of substrate encoding (per the bio-fidelity B1 audit-trail docstring), so the Roy-2c finding fix in Wire-A is preserved.
Next: Wire 2 (Stage 3), Wire 1 (Stage 4), then Roy-3 validation (Stage 5).

What's next in 0.9.1

Per release_0_9_1.md:

✅ Stage 0a (Roy-2c probe)
✅ Stages 0b + 0c (telemetry)
✅ Stage 1 (Wire 3) — this PR
✅ Stage 2 (Wire-A)
⏳ Stage 3 (Wire 2: Pavlovian percept aversion)
⏳ Stage 4 (Wire 1: risk-sensitive action annotation)
⏳ Stage 5 (Roy-3 validation)

🤖 Generated with Claude Code

Stage 1 of release_0_9_1.md, lifted from bio_emergent_persona_foundations.md § Wire 3. The smallest of the four wires by LOC, highest behavioral signal per unit work: an agent with a damaged arm visibly stops attempting arm-routed affordances without any prompt-injection scaffolding. The cleanest emergent "trait" demonstration in the foundations plan.

Implementation:

- Embodiment.get_disabled_affordances(*, threshold=None) → set[str]
  Walks the entity tree, computes each modulator's compute_integrity(), and returns base tool names ({entity.name}_{affordance_name}) for modulators strictly below the disable threshold (default 0.3). Modulators that don't expose compute_integrity (capability-only, decorator-style) default to integrity=1.0 per the backward-compat convention SpecModulator.compute_integrity already uses on empty vital_metrics. A buggy modulator whose compute_integrity raises is treated as integrity=1.0 (fail-open).
- Embodiment.get_degraded_affordances(*, disable_threshold=None, degrade_threshold=None) → dict[str, float]
  Same walk, returns {base_tool_name: integrity} for modulators in [disable_threshold, degrade_threshold) (default [0.3, 0.6)). The bands partition [0, 1] cleanly: every affordance lands in exactly one of {disabled, degraded, healthy}, never both.
- agent_loop.py hook between mode_info.get_available_tools(...) and the tool_descriptions build loop:
  1. Filter disabled affordances out of available_tools.
  2. After per-tool description build, append "[DAMAGED: integrity 0.X]" to each degraded affordance's description.
  Fail-open: no embodiment, missing compute_integrity, raising modulator → the hook is a no-op. Description annotation is idempotent (the "if annotation not in base_desc" guard prevents the suffix accumulating across ticks). Copy-on-write: TOOL_DESCRIPTIONS is a shared module-level dict; mutation would poison future calls and other agents.

Tool-name pattern (the load-bearing assumption):

- tool_bridge.generate_tools_for_entity registers ModulatorAffordanceTools as {entity.name}_{affordance_name} unless _resolve_tool_name collision-prefixes an ancestor name.
- Roy / cradle / Reachy use single-body topologies, so there are no collisions in practice. Wire 3's base names match the registered tool names cleanly. Under a hypothetical collision, the filter fails open (the tool stays available; no silent mis-gating).
- TestToolNamePattern.test_base_name_matches_generated_tool_name pins this contract: it constructs a real ToolRegistry, calls generate_tools_for_entity, and asserts every predicted-disabled name is in the registry.

Test surface (30 tests, 6 layers):

- Layer 1 (get_disabled_affordances, 7 tests): healthy/critical/boundary/per-modulator-isolation/threshold-override.
- Layer 2 (get_degraded_affordances, 7 tests): boundary semantics for both ends of the [0.3, 0.6) band; threshold overrides.
- Layer 3 (band partitioning, 8 parametric tests): no integrity value lands in both disabled and degraded sets.
- Layer 4 (tool-name pattern, 1 test): live ToolRegistry round-trip via generate_tools_for_entity.
- Layer 5 (agent_loop hook shape, 4 tests): filter/annotate/idempotency/no-embodiment-no-op.
- Layer 6 (degenerate cases, 3 tests): empty modulators, empty affordances, raising compute_integrity.

All 30 tests pass. Ruff clean.

Frozen contract impact: none. No new persisted state, no dataclass changes. Pure read-side wiring per the plan.

Behavioral signal: a damaged-arm agent stops calling arm-routed affordances. Roy-3 validation can measure this once Wires 1-A all ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
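
The two-step hook described in this commit (filter, then annotate on a copy) might be sketched as follows. This mirrors the commit's original `[DAMAGED: integrity 0.X]` suffix and substring guard, which the follow-up commit replaces with felt phrases. `apply_wire3_hook`, the one-decimal formatting, and the sample tool names are hypothetical; only `TOOL_DESCRIPTIONS` as a shared module-level dict comes from the text.

```python
from typing import Dict, List, Set, Tuple

# Stand-in for the shared module-level dict; contents are made up.
TOOL_DESCRIPTIONS: Dict[str, str] = {
    "roy_grasp": "Close the gripper.",
    "roy_wave": "Wave the arm.",
    "roy_speak": "Say a phrase.",
}


def apply_wire3_hook(
    available_tools: List[str],
    descriptions: Dict[str, str],
    disabled: Set[str],
    degraded: Dict[str, float],
) -> Tuple[List[str], Dict[str, str]]:
    """Filter disabled tools and annotate degraded ones without mutation."""
    # Step 1: drop disabled affordances from the tool surface.
    kept = [tool for tool in available_tools if tool not in disabled]

    # Step 2: copy-on-write, so the shared dict is never modified.
    annotated = dict(descriptions)
    for name, integrity in degraded.items():
        if name not in annotated:
            continue  # unknown name: fail open, leave everything as-is
        annotation = f"[DAMAGED: integrity {integrity:.1f}]"
        base_desc = annotated[name]
        if annotation not in base_desc:  # guard against repeated suffixes
            annotated[name] = f"{base_desc} {annotation}"
    return kept, annotated
```

With empty `disabled` and `degraded` inputs the function returns the tool list and an equal copy of the descriptions, which corresponds to the no-op path when no embodiment is present.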

Two-lens pre-merge review of feat/0-9-1-wire-3-embodiment-tool-filter surfaced 0 Critical and 8 Important findings (3 bio, 5 arch). All folded before opening the PR per feedback_review_before_ship.md. 41 tests passing (up from 30; 11 new fold-regression guards across 3 new test layers).

Bio-fidelity findings folded:

- **B2 (highest signal): emit `sim_log("WIRE_3_FILTER", ...)` per tick.** The pre-filter silently bypasses the natural substrate-learning chain (failure → pain → NAc) for disabled tools. Without observability, Roy-3 can't disambiguate "Wire 3 hid the tool" from "substrate learned avoidance." The emission lists disabled tools + degraded integrities per LLM submission, gated on disabled OR degraded being non-empty. Tick aligned with Stages 0c/0d (int(time.time() - sim_logger._sim_start)). Fail-soft on ImportError for non-sim runtime paths. Pinned by TestWire3FilterEmission.test_emission_lists_disabled_and_degraded.
- **B3: felt-sensation phrasing instead of metric badge.** The pre-fold annotation read `[DAMAGED: integrity 0.X]`, a system-voice metric badge. Post-fold uses `Embodiment.integrity_to_felt_phrase()` to map degraded-band integrity to proprioceptive percept ("feels strained" / "feels weakened, prone to failing"). The numeric integrity stays in the WIRE_3_FILTER JSONL event for Roy-3 analysis; the LLM sees the qualitative phrase only. Two bands within the degraded range: [0.45, 0.6) → "feels strained"; [0.3, 0.45) → "feels weakened, prone to failing". Pinned by TestIntegrityToFeltPhrase (6 tests).
- **B5: WARNING log on compute_integrity raise.** Pre-fold the inner try/except in `_iter_modulator_affordance_pairs` silently swallowed every exception and treated the modulator as healthy (1.0). Per the no-band-aid rule (CLAUDE.md), this conflates "unknown state" with "healthy state": a body whose self-monitoring is broken is itself in trouble. Post-fold logs a WARNING with the entity + modulator name so the broken modulator surfaces during Roy-3 / operator review. Loop stability is preserved (still fail-open to 1.0). Pinned by the updated test_compute_integrity_raises_treated_as_healthy with a caplog WARNING assertion.
- **B1: I/O-boundary docstring pin.** Added a top-of-section comment on the Wire 3 threshold constants explicitly stating the thresholds gate the LLM-proposer's tool surface, NOT substrate encoding (EC/ATL/NAc are upstream). Mirrors the Wire-A bias-band bio-defensible-bands audit trail.

Architecture findings folded:

- **A1: regex-strip felt annotation before re-append.** Pre-fold idempotency guard was `if annotation not in base_desc`, but if integrity drifts across ticks (NAc reward learning + modulator repair can recover integrity over multi-tick windows: 0.55 → 0.40 → 0.55), the felt phrase shifts band and both suffixes accumulate. Post-fold uses `_WIRE3_PHRASE_RE.sub("", base_desc)` before re-applying the current-tick phrase, leaving exactly one suffix on the description regardless of drift history. Pinned by test_annotation_idempotent_under_integrity_drift.
- **A4: narrow except Exception to (AttributeError, TypeError) + WARNING log.** Pre-fold the hook caught broad Exception at DEBUG level. The body.py inner guard already swallows compute_integrity raises; the outer surface failures here are method-shape mismatches (non-Embodiment object plugged into executor.embodiment). Post-fold narrows to (AttributeError, TypeError) at WARNING level so the failure surfaces during Roy-3 validation runs.
- **A3: docstring pins for band semantics.** Added inline comment block on the Wire 3 threshold constants documenting "integrity < 0.3 disables (strict); 0.3 <= integrity < 0.6 degrades (inclusive)". This closes the architecture-lens nit about the plan's ambiguous wording ("integrity < 0.6 annotates" naturally reads as ALSO including the disabled range, but only the [0.3, 0.6) band reaches the annotation path).
- **A5: LLM-rendering round-trip test.** New TestLlmRenderingRoundTrip class with 2 tests. First test constructs an LLMRequest with the post-Wire-3 tool_descriptions dict and calls build_tools_section_filtered, then asserts the felt-sensation phrase reaches the LLM-visible prompt string. Second test pins that disabled tools (filtered out of available_tools by Wire 3) DO NOT appear in the rendered tool section. Without these tests, a future refactor of the prompt-section renderer could silently drop Wire 3's signal.

Deferred (architecture nice-to-haves N1/N2/N3):

- Structured annotation surface for budgeter awareness (separate `damage_annotation` field): single agent / small prompts in 0.9.1 don't hit budget pressure.
- Registry-walk tool-name lookup (instead of `{entity.name}_{affordance_name}` reconstruction): single-body topologies don't trigger collisions; structural fix deferred to first multi-body sim that hits one.
- `_iter_modulator_affordance_pairs` dict-collision dedup: capability composition isn't a 0.9.1 topology.

Total +1 file changed (agent_loop.py +66/-12), +1 file changed (body.py +57/-1), +1 file changed (test file +269/-30). 41 tests passing. Ruff clean on touched files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dennys246 and others added 2 commits May 16, 2026 21:22

dennys246 merged commit 51dfd38 into main May 17, 2026
5 checks passed

dennys246 deleted the feat/0-9-1-wire-3-embodiment-tool-filter branch May 17, 2026 04:05

dennys246 mentioned this pull request May 17, 2026

feat(0.9.1): Wire 1 — risk-sensitive action annotation (Stage 4) #257

Merged

5 tasks

dennys246 merged 2 commits into main from feat/0-9-1-wire-3-embodiment-tool-filter
