feat(vcr-ra): ship immediate-shift folding default-on — v0.15.0 (#390, #242) #467
Merged
…#242)

Flips the immediate-shift folding peephole default-on (PR #463 landed it flag-off). A constant shift amount the stack selector materialized into a scratch register (`movw rM,#C; lsl rD,rN,rM`) now folds to the immediate form (`lsl rD,rN,#C`), removing the dead `movw`: −1 instruction, −1 live register per folded shift.

Flip: arm_backend.rs default-ON with opt-out SYNTH_NO_IMM_SHIFT_FOLD=1. Re-froze the ARM goldens (control_step 316→304, flight_seam 866→774, flight_seam_flat 1006→910 = −200 B; signed_div_const unchanged, as it has no register-shift folds). RV32 gate untouched (ARM-only peephole).

Results preserved across the byte change: control_step 0x00210A55 (differential 13/13), flat+inlined flight_algo 0x07FDF307 (MATCH); opt-out restores the v0.14.0 bytes; full workspace suite green. Validated bit-identical + a net cycle win on the dissolved hot path (−2 cyc/call, .text 100→90 B on gust_mix).

Cumulative dissolved hot-path: 64.0 → 58.0 (cmp→select) → 50.0 (local promotion) → 48.0 cyc/call.

Pin-swept 0.14.0→0.15.0; CHANGELOG added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codecov Report: ✅ All modified and coverable lines are covered by tests.
What
Flips the immediate-shift folding peephole default-on (PR #463 landed it flag-off; the gate is now cleared). A constant shift amount that the stack selector materialized into a scratch register (`movw rM,#C; lsl rD,rN,rM`) folds to the immediate form (`lsl rD,rN,#C`), dropping the dead `movw`: −1 instruction and −1 live register per folded shift.
Re-froze ARM goldens
control_step 316→304, flight_seam 866→774, flight_seam_flat 1006→910 (−200 B); signed_div_const unchanged (no register-shift folds). RV32 gate untouched (ARM-only peephole).
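The −200 B total can be checked from the per-golden sizes. The sketch below only redoes that arithmetic; the constant and function names are illustrative and do not come from the repository.

```rust
/// (golden name, bytes before, bytes after), copied from the sizes listed above.
const GOLDENS: [(&str, u32, u32); 3] = [
    ("control_step", 316, 304),
    ("flight_seam", 866, 774),
    ("flight_seam_flat", 1006, 910),
];

/// Sums the per-golden size reductions in bytes.
fn total_saved(goldens: &[(&str, u32, u32)]) -> u32 {
    goldens
        .iter()
        .map(|&(_, before, after)| before - after)
        .sum()
}
```

The individual reductions are 12 B, 92 B, and 96 B, which sum to the reported 200 B.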
Validation (results preserved)
- control_step 0x00210A55, differential 13/13; flat+inlined flight_algo 0x07FDF307 MATCH.
- Opt-out (SYNTH_NO_IMM_SHIFT_FOLD=1) restores the v0.14.0 bytes; full workspace suite green; fmt/clippy/pin-sweep clean.
- Validated bit-identical, plus a net cycle win on the dissolved hot path (−2 cyc/call, .text 100→90 B on gust_mix).

Cumulative dissolved hot-path: 64.0 → 58.0 (cmp→select) → 50.0 (local promotion) → 48.0 cyc/call.
Pin-swept 0.14.0→0.15.0; CHANGELOG added.
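For readers unfamiliar with the fold, here is a minimal sketch of the rewrite described under What. It is not the code in arm_backend.rs: the `Inst` enum, `fold_imm_shifts`, `fold_enabled`, and the `dead_after` liveness callback are hypothetical stand-ins, and only the adjacent `movw`/`lsl` pair is handled. The sketch conservatively restricts the fold to shift amounts 1 to 31 and assumes that only the value `1` opts out.

```rust
/// Hypothetical mini-IR; the real backend's instruction types are not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// movw rd, #imm
    Movw { rd: u8, imm: u16 },
    /// lsl rd, rn, rm  (shift amount held in a register)
    LslReg { rd: u8, rn: u8, rm: u8 },
    /// lsl rd, rn, #imm
    LslImm { rd: u8, rn: u8, imm: u8 },
}

/// Default-on gate with an opt-out (assumption: only the value "1" disables the fold).
fn fold_enabled(opt_out: Option<&str>) -> bool {
    opt_out != Some("1")
}

/// Rewrites adjacent `movw rM,#C; lsl rD,rN,rM` pairs into `lsl rD,rN,#C`.
/// `dead_after(i, r)` must return true when register `r` is not read after
/// instruction index `i`; the caller supplies this liveness information.
fn fold_imm_shifts(code: &[Inst], dead_after: impl Fn(usize, u8) -> bool) -> Vec<Inst> {
    let mut out = Vec::with_capacity(code.len());
    let mut i = 0;
    while i < code.len() {
        if let (Some(Inst::Movw { rd: m, imm }), Some(Inst::LslReg { rd, rn, rm })) =
            (code.get(i), code.get(i + 1))
        {
            let foldable = m == rm                 // the movw feeds the shift amount
                && rn != rm                        // the movw must not clobber the shifted value
                && (1..=31).contains(imm)          // conservative immediate range
                && (rd == rm || dead_after(i + 1, *rm)); // scratch is overwritten or dead
            if foldable {
                out.push(Inst::LslImm { rd: *rd, rn: *rn, imm: *imm as u8 });
                i += 2; // both instructions are consumed; the movw is dropped
                continue;
            }
        }
        out.push(code[i].clone());
        i += 1;
    }
    out
}
```

A caller could wire the gate to the environment with `fold_enabled(std::env::var("SYNTH_NO_IMM_SHIFT_FOLD").ok().as_deref())` and skip `fold_imm_shifts` when it returns false. Taking the value as a parameter keeps the gate testable without touching process state.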
🤖 Generated with Claude Code