
test(vcr-ra): shadow allocator vs recovery ladder — spurious vs genuine spill (#242) #479

Merged
avrabe merged 1 commit into main from vcr-ra/shadow-vs-ladder on Jun 25, 2026

@avrabe avrabe commented Jun 24, 2026


Measure-only (SYNTH_SHADOW_ALLOC, byte-identical — frozen gate green). Answers the VCR-RA acceptance question empirically: do the recovery ladder's spills survive under a verified (virtual-register) allocator?

Running the graph-colouring shadow allocator (the VCR-RA prototype) on the ladder-firing functions splits them:

| function | recovery rung | shadow peak | verdict |
|---|---|---|---|
| control_step.wasm | spill | 8 ≤ 9 | spurious (VCR-RA subsumes) |
| high_pressure_i32 | spill | 10 > 9 | genuine (VCR-RA spills too) |
| promotion_exhaustion_fallback | spill | 10 > 9 | genuine |
| high_pressure_i64, msgq_put_359 | param-back / spill | declined | i64/calls (model scope gap) |

Key result: control_step — a shipped frozen fixture (0x00210A55) — spills only as a physical-register artifact (its 8 live values fit the 9-wide pool once allocated virtually), exactly the case a verified allocator removes. The genuine floor is the peak-10 functions, where VCR-RA's bar is "spill no worse," not "no spill." The i64/call decline is the shadow model's TODO, not an allocation result.
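The spurious/genuine split can be illustrated with a small sketch. This is not the project's shadow allocator: it assumes straight-line code described by hypothetical live intervals, and uses peak simultaneous liveness as the register demand (for interval graphs, a colouring needs exactly that many colours). Only `POOL_WIDTH = 9` comes from this PR; the interval data and the helper names `peak_pressure` and `verdict` are invented for illustration.

```python
from typing import List, Tuple

POOL_WIDTH = 9  # allocatable pool width quoted in this PR

# Half-open [start, end) live range of one virtual register (hypothetical data).
Interval = Tuple[int, int]


def peak_pressure(intervals: List[Interval]) -> int:
    """Return the maximum number of simultaneously live values."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # value becomes live
        events.append((end, -1))    # value dies
    # At the same program point, process deaths before births so that
    # [a, b) and [b, c) are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def verdict(peak: int, pool: int = POOL_WIDTH) -> str:
    """A spill is spurious if the values fit the pool once allocated virtually."""
    return "spurious" if peak <= pool else "genuine"


# Hypothetical functions: 8 and 10 values live across the same region.
eight_live = [(i, 20) for i in range(8)]
ten_live = [(i, 20) for i in range(10)]

print(peak_pressure(eight_live), verdict(peak_pressure(eight_live)))
print(peak_pressure(ten_live), verdict(peak_pressure(ten_live)))
```

Under these assumptions the 8-value case fits the pool and the 10-value case does not, mirroring the control_step and peak-10 rows above. The real model also has to handle i64 values and calls, which this sketch ignores.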

Adds the measured comparison to register_exhaustion_recovery_ladder.md + the CI test shadow_alloc_spurious_vs_genuine_spill_242 (pins the ≤9 / >9 threshold, not exact peaks). No production code changed.
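Pinning the threshold rather than exact peaks might look roughly like the following sketch. It is an assumption about the test's shape, not the actual `shadow_alloc_spurious_vs_genuine_spill_242` test: the peaks are the ones measured in the table above, and `check_thresholds` is a hypothetical helper.

```python
POOL_WIDTH = 9  # pool width quoted in this PR

# Peaks as reported in the PR table; in the real test these would be
# produced by the shadow allocator rather than hard-coded.
MEASURED_PEAKS = {
    "control_step.wasm": 8,
    "high_pressure_i32": 10,
    "promotion_exhaustion_fallback": 10,
}

# The pinned expectation is only which side of the threshold each function is on.
EXPECTED_FITS_POOL = {
    "control_step.wasm": True,
    "high_pressure_i32": False,
    "promotion_exhaustion_fallback": False,
}


def check_thresholds(measured, expected_fits, pool=POOL_WIDTH):
    """Return the functions whose side of the threshold differs from the expectation."""
    return [
        name
        for name, fits in expected_fits.items()
        if (measured[name] <= pool) != fits
    ]


# Passes with the measured peaks.
print(check_thresholds(MEASURED_PEAKS, EXPECTED_FITS_POOL))

# Codegen drift from 10 to 11 stays on the same side, so nothing is reported.
drifted = dict(MEASURED_PEAKS, high_pressure_i32=11)
print(check_thresholds(drifted, EXPECTED_FITS_POOL))

# A change that pushes control_step over the pool width is reported.
regressed = dict(MEASURED_PEAKS, **{"control_step.wasm": 10})
print(check_thresholds(regressed, EXPECTED_FITS_POOL))
```

The design choice is that a peak moving within its side of the threshold is noise, while a peak crossing the threshold changes the verdict and should fail.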

Refs #242

🤖 Generated with Claude Code

…ne spill (#242)

Measure-only (SYNTH_SHADOW_ALLOC, byte-identical) — answers the VCR-RA acceptance
question empirically: do the recovery ladder's spills survive under a verified
(virtual-register) allocator? Running the graph-colouring shadow allocator on the
ladder-firing functions splits them:
  control_step.wasm              spill -> shadow peak 8 <= 9  SPURIOUS (VCR-RA subsumes)
  high_pressure_i32              spill -> shadow peak 10 > 9  GENUINE  (VCR-RA spills too)
  promotion_exhaustion_fallback  spill -> shadow peak 10 > 9  GENUINE
  high_pressure_i64 / msgq_put   -> shadow declines (i64/calls — model scope gap)

Key result: control_step — a SHIPPED frozen fixture — spills only as a
physical-register artifact (its 8 live values fit the 9-wide pool once allocated
virtually), exactly the case a verified allocator removes. The genuine floor is the
peak-10 functions, where VCR-RA's bar is "spill no worse," not "no spill."

Adds the measured comparison to the recovery-ladder map + the CI test
shadow_alloc_spurious_vs_genuine_spill_242 (pins the ≤9 / >9 threshold, not exact
peaks, so codegen drift won't flap it). No production code changed; frozen gate green.

Refs #242

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

codecov Bot commented Jun 24, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


@avrabe avrabe merged commit 342d8e5 into main Jun 25, 2026
15 checks passed
@avrabe avrabe deleted the vcr-ra/shadow-vs-ladder branch June 25, 2026 00:20