Skip to content

feat: implement vm execution instance that uses rvr#2747

Open
gdmlcjs wants to merge 6 commits intochore/rvr-migratefrom
feat/rvr-instance
Open

feat: implement vm execution instance that uses rvr#2747
gdmlcjs wants to merge 6 commits intochore/rvr-migratefrom
feat/rvr-instance

Conversation

@gdmlcjs
Copy link
Copy Markdown

@gdmlcjs gdmlcjs commented Apr 23, 2026

  • Vm execution instance is made to use rvr execution, depending on the feature. Helper functions to convert between the existing VmState and the rvr state are also added.
  • The VmConfig macro now has a create_rvr_extensions method implementation, but instead of defining a new VmRvrConfig trait, the create_rvr_extensions method piggybacks on the existing VmExecutionConfig. This is to avoid complex feature-gated trait bounds.

closes INT-6810, INT-7476

@gdmlcjs gdmlcjs self-assigned this Apr 23, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 23, 2026

Claude finished @gdmlcjs's task in 2m 38s —— View job


Code Review: PR #2747 — feat: implement vm execution instance that uses rvr

  • Gather context and read changed files
  • Review VM core changes (crates/vm/)
  • Review SDK and config changes
  • Review test utilities and extension changes
  • Review derive macro changes
  • Post comprehensive review

Summary

This PR introduces three new RVR-backed execution instance types (RvrPureInstance, RvrMeteredInstance, RvrMeteredCostInstance) that mirror the existing interpreter-based instances but use the RVR backend. It also moves create_rvr_extensions from a standalone VmRvrExtension trait into the VmExecutionConfig trait, simplifying the derive macro.

Overall the architecture direction is sound — having RVR instances that wrap the existing compile/execute helpers and conform to the same interface is the right approach. Below are the issues I found, ordered by severity.


Issues

1. aot + rvr features are mutually exclusive but not enforced (High)

Both aot and rvr define their own instance(), interpreter(), metered_interpreter(), and metered_cost_interpreter() methods under separate #[cfg(feature = "...")] gates. If both features are enabled simultaneously, this produces duplicate method definitions and a compilation error. Neither Cargo.toml nor the code enforces mutual exclusion.

crates/vm/src/arch/vm.rs:439-454 — both define instance():

#[cfg(feature = "aot")]
pub fn instance(...) -> Result<AotInstance<F, ExecutionCtx>, ...>

#[cfg(feature = "rvr")]
pub fn instance(..., executor_idx_to_air_idx: &[usize]) -> Result<RvrPureInstance<F>, ...>

Consider either:

  • Adding a compile_error! when both are enabled
  • Using #[cfg(all(feature = "rvr", not(feature = "aot")))] guards

Fix this →

2. RvrPureInstance::execute discards hint_stream and deferrals (Medium)

crates/vm/src/arch/rvr/pure.rs:40-41 — The fresh-start execute path extracts only input_stream and drops hint_stream and deferrals:

let inputs = inputs.into();
let input_stream = inputs.input_stream;
// hint_stream and deferrals are silently dropped

The execute_from_state counterpart (line 87) properly extracts all three via streams_to_io_seed. If a caller passes a Streams with a non-empty hint_stream, it will be silently ignored. This matches the existing execute() helper, so it may be intentional, but it's worth adding a debug assertion or documenting the invariant.

3. Breaking API change on instance() signature (Medium)

crates/vm/src/arch/vm.rs:447-454 — The RVR instance() takes an extra executor_idx_to_air_idx parameter that the non-RVR and aot versions don't have:

pub fn instance(&self, exe: &VmExe<F>, executor_idx_to_air_idx: &[usize])

This means downstream code that calls .instance(exe) will break when the rvr feature is enabled. The test harness (rvr-openvm-test-utils/src/lib.rs:240) had to be updated to pass the extra arg. The SDK workaround (crates/sdk/src/lib.rs:319-338) is also notably complex — routing through app_prover just to get the AIR index map.

Consider whether instance() could internally compute executor_idx_to_air_idx (similar to how VirtualMachine::interpreter() does at line 785), keeping the API consistent across backends.

Fix this →

4. Recompilation on every execute_metered_from_state / execute_metered_cost_from_state call (Medium)

crates/vm/src/arch/rvr/metered.rs:827-833 and crates/vm/src/arch/rvr/metered_cost.rs:283-290 — Both _from_state methods re-run compile_metered / compile_metered_cost on every invocation:

let compiled_metered = if self.extensions.is_empty() {
    compile_metered(self.exe.as_ref(), &chips)
} else {
    compile_metered_with_extensions(self.exe.as_ref(), &self.extensions, &chips)
}

Compilation is expensive (JIT). If these methods are called in a loop (e.g., per-segment execution), this will be a significant performance bottleneck. The non-from-state execute_metered path has the same pattern but is presumably called once. Consider caching the compiled artifact in the instance struct if the config doesn't change between calls.

5. copy_guest_memory_to_rvr_memory uses as_ptr for both read and write (Low)

crates/vm/src/arch/vm.rs:258:

let dst = unsafe { std::slice::from_raw_parts_mut(rvr_memory.as_ptr(), rvr_memory.size()) };

as_ptr() returns a *const u8 (immutable pointer). Casting it to a mutable slice via from_raw_parts_mut without going through as_mut_ptr() (if GuardedMemory provides one) is technically UB per Rust's aliasing rules. If GuardedMemory has an as_mut_ptr() method, use it here instead.

6. Repeated boilerplate across _from_state methods (Low)

The execute_from_state, execute_metered_from_state, and execute_metered_cost_from_state methods share nearly identical patterns:

  • Extract pc, guest_memory, streams, rng, metrics from VmState
  • Allocate GuardedMemory, init rvr state, copy memory
  • Build io_state, set hint/pv/rng
  • Execute, check outcome, write back

This is ~40 lines of setup code repeated 3 times. Consider extracting common helpers (e.g., prepare_rvr_from_state / finalize_rvr_to_state).

7. Duplicate create_rvr_extensions on SdkVmConfig (Nit)

crates/sdk-config/src/lib.rs:325-331 and crates/sdk-config/src/lib.rs:344-353create_rvr_extensions is implemented twice: once as the VmExecutionConfig trait method, and again as an inherent method on the impl SdkVmConfig block. Both delegate to self.to_inner().create_rvr_extensions(air_idx). The inherent method appears to be leftover from the old VmRvrExtension trait pattern and should be removed.

Fix this →

8. Error variant reuse for unrelated errors (Nit)

crates/vm/src/arch/vm.rs:160-173 — Both map_rvr_compile_error and map_rvr_execute_error map to StaticProgramError::FailToGenerateDynamicLibrary. This was originally for AOT dynamic library errors. Using it for RVR compile/execute failures conflates different error sources. Consider adding an RVR-specific error variant.


Positive Notes

  • The derive macro simplification (removing VmRvrExtension trait and folding into VmExecutionConfig) is a good cleanup — reduces generated code and trait-bound complexity.
  • The ensure_rvr_outcome helper centralizes termination/suspension checking nicely.
  • The SegmentationCtx getter additions are clean and well-scoped.
  • Proper #[cfg(feature = "metrics")] handling throughout.

@github-actions

This comment has been minimized.

@gdmlcjs gdmlcjs force-pushed the feat/rvr-instance branch from dace11f to 8902898 Compare April 24, 2026 06:40
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 24, 2026

Claude finished @gdmlcjs's task in 1m 48s —— View job


Code Review: PR #2747 — feat: implement vm execution instance that uses rvr

  • Gather context and read changed files
  • Review VM core changes (crates/vm/)
  • Review RVR instance implementations (pure, metered, metered_cost)
  • Review SDK and config changes
  • Review derive macro changes
  • Review test utilities and extension changes
  • Post comprehensive review

Summary

This PR introduces three RVR-backed execution instance types (RvrPureInstance, RvrMeteredInstance, RvrMeteredCostInstance) that mirror the existing interpreter-based instances. It also consolidates create_rvr_extensions into the VmExecutionConfig trait (removing the standalone VmRvrExtension trait), fixes the GuardedMemory::as_ptr mutability, and adds hint_stream support to all RVR execute functions.

The overall architecture is clean — the derive macro simplification is a welcome cleanup, and the new instances properly implement _from_state variants for continuation support. Below are the issues I found.


Issues

1. aot + rvr features produce duplicate methods — no enforcement of mutual exclusion (High)

crates/vm/src/arch/vm.rs defines instance(), metered_instance(), metered_cost_instance(), interpreter(), metered_interpreter(), and metered_cost_interpreter() under separate #[cfg(feature = "aot")] / #[cfg(feature = "rvr")] gates. If both features are enabled simultaneously, the compiler sees duplicate method definitions with the same name.

The #[cfg] gates don't use not(feature = ...) exclusion (unlike the no-backend fallbacks which use #[cfg(all(not(feature = "aot"), not(feature = "rvr")))]). Consider adding to crates/vm/src/lib.rs:

#[cfg(all(feature = "aot", feature = "rvr"))]
compile_error!("features `aot` and `rvr` are mutually exclusive");

Fix this →

2. SDK execute() needs full keygen for pure execution when rvr is enabled (Medium)

crates/sdk/src/lib.rs:319-329 — The RVR path routes through self.app_prover(app_exe)? just to get the executor_idx_to_air_idx mapping. app_prover performs a full keygen (creating a VirtualMachine with proving key), which is expensive for a pure execution call that doesn't need a proof. This is a regression from the non-rvr path which just calls self.executor.instance(&exe).

Consider caching the AIR index mapping at Sdk construction time or providing a lighter-weight path.

3. Recompilation on every _from_state call (Medium)

crates/vm/src/arch/rvr/metered.rs:833-839 and crates/vm/src/arch/rvr/metered_cost.rs:287-292 — Both execute_metered_from_state and execute_metered_cost_from_state call compile_metered / compile_metered_cost on each invocation. These are JIT compilation steps that produce shared libraries. In a multi-segment execution loop, this means one JIT compilation per segment.

The non-from-state variants have the same pattern but are presumably called once. Consider caching the RvrCompiled in the instance struct if the config doesn't change between calls, or compiling once in the constructor.

4. RvrPureInstance::executehint_stream field conversion may lose data (Low)

crates/vm/src/arch/rvr/pure.rs:42-46 — The hint stream is converted via f.as_canonical_u32() as u8, which truncates any value above 255. The same pattern exists in streams_to_io_seed at crates/vm/src/arch/vm.rs:284-288. This is consistent across the codebase but worth documenting that RVR hint streams only support byte-valued field elements.

5. Error variant reuse for unrelated errors (Low)

crates/vm/src/arch/vm.rs:160-164map_rvr_compile_error maps to StaticProgramError::FailToGenerateDynamicLibrary, which was originally for the AOT backend's dynamic library generation. While the description fits, it conflates two different backends' error sources. The new ExecutionError::RvrExecution variant added at crates/vm/src/arch/execution.rs:84-86 is good — consider adding a corresponding StaticProgramError::RvrCompilation variant for compile-time errors.

6. Large #[cfg] conditional method explosion in vm.rs (Low / Maintainability)

crates/vm/src/arch/vm.rs:417-611 and 746-944 — There are now 6 separate #[cfg] blocks defining instance(), metered_instance(), metered_cost_instance(), plus 6 more for the VirtualMachine wrapper methods (interpreter(), metered_interpreter(), metered_cost_interpreter()). Each backend (none/aot/rvr) adds its own copy.

This is getting unwieldy. Consider a trait-based approach or an internal dispatch method that the public API delegates to, reducing the cfg proliferation. Not blocking, but the current pattern won't scale well if additional backends are added.

7. Repeated boilerplate in _from_state methods (Nit)

The execute_from_state, execute_metered_from_state, and execute_metered_cost_from_state methods share ~40 lines of identical setup: extract streams, allocate GuardedMemory, init state, copy memory, build IO state, set hint/pv/rng. Extracting helpers like prepare_rvr_from_state / finalize_rvr_to_state would reduce duplication.


Positive Notes

  • Derive macro cleanup: Removing VmRvrExtension trait and folding create_rvr_extensions into VmExecutionConfig is a good simplification — fewer trait bounds, less generated code, and a cleaner extension interface.
  • GuardedMemory::as_ptr / as_mut_ptr split (in crates/rvr/rvr-state/src/memory.rs): Properly separating read-only and mutable pointer access fixes the aliasing concern. The previous review's issue chore: create afs-chips crate #5 about UB is now resolved.
  • hint_stream support: Adding the hint_stream parameter to all execute_* functions is needed for correct execution of programs that use hint intrinsics.
  • ensure_rvr_outcome helper centralizes termination/suspension checking.
  • New SegmentationCtx getters (widths(), interactions(), config(), segment_check_insns()) are well-scoped additions that enable the RVR metered instance to access segmentation config without duplicating it.

@github-actions
Copy link
Copy Markdown

group app.proof_time_ms app.cycles leaf.proof_time_ms
fibonacci 3,825 12,000,265 952
keccak 18,593 18,655,329 3,311
sha2_bench 9,003 14,793,960 1,408
regex 1,419 4,137,067 377
ecrecover 652 123,583 273
pairing 902 1,745,757 284
kitchen_sink 2,086 2,579,903 437

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 8902898

Benchmark Workflow

@gdmlcjs gdmlcjs requested a review from shuklaayush April 24, 2026 14:38
@gdmlcjs gdmlcjs marked this pull request as ready for review April 24, 2026 14:38
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 24, 2026

Claude finished @gdmlcjs's task in 2m 26s —— View job


Code Review: PR #2747 — feat: implement vm execution instance that uses rvr

  • Gather context and read changed files
  • Review VM core changes (crates/vm/)
  • Review RVR instance implementations (pure, metered, metered_cost)
  • Review derive macro changes
  • Review SDK and config changes
  • Review test utilities and extension changes
  • Post comprehensive review

Summary

This PR introduces three RVR-backed execution instance types (RvrPureInstance, RvrMeteredInstance, RvrMeteredCostInstance) that mirror the existing interpreter-based instances but use the RVR JIT backend. It also consolidates create_rvr_extensions into VmExecutionConfig (removing a standalone VmRvrExtension trait), simplifies the derive macro accordingly, and adds _from_state continuation support for all instance types.

The architecture is sound — the new instances properly wrap the RVR compile/execute pipeline and conform to the same state interface as the interpreted path. Below are the issues found, ordered by severity.


Issues

1. state_from_rvr discards streams, rng, and metrics — stale state on execute_with_limit (High)

crates/vm/src/arch/vm.rs:176-188state_from_rvr always constructs a VmState with Streams::default() and rng=0:

VmState::new_with_defaults(pc, guest_memory, Streams::default(), 0)

This is called from RvrPureInstance::execute (line 58 for the limited path, line 77 for the unlimited path), RvrMeteredInstance::execute_metered (line 787), and RvrMeteredCostInstance::execute_metered_cost (line 254).

For the unlimited execute() path this is likely fine — the program ran to completion, so streams are consumed and rng doesn't matter. But for the limited path (execute_with_limit, line 49-66 in pure.rs), execution may be suspended mid-program. The returned VmState will have empty streams and a zeroed rng, meaning any subsequent execute_from_state call using this result will start with no input/hint data and a reset random seed. This is a correctness issue for callers that use execute with num_insns=Some(...) and then resume execution.

The _from_state methods correctly preserve streams/rng (e.g., pure.rs:137-144), so the fix for the limited path in execute() would be to either:

  • Pass the remaining io_state and rng through state_from_rvr, or
  • Build the return VmState manually (like execute_from_state does) instead of calling state_from_rvr.

Fix this →

2. SDK execute() routes through full keygen when rvr is enabled (Medium)

crates/sdk/src/lib.rs:319-329 — The RVR path calls self.app_prover(app_exe)? solely to obtain executor_idx_to_air_idx from the VirtualMachine. app_prover performs a full keygen (creating the proving key), which is expensive for a pure execution call that doesn't need a proof.

#[cfg(feature = "rvr")]
let instance = {
    let app_prover = self.app_prover(app_exe)?;
    let exe = app_prover.exe();
    app_prover.vm().interpreter(&exe).map_err(VirtualMachineError::from)?
};

The non-rvr path just calls self.executor.instance(&exe) — no keygen needed. Consider caching the AIR index mapping or computing it without a full keygen pass.

3. JIT recompilation on every _from_state call (Medium)

crates/vm/src/arch/rvr/metered.rs:833-839 and crates/vm/src/arch/rvr/metered_cost.rs:287-292 — Both execute_metered_from_state and execute_metered_cost_from_state call compile_metered / compile_metered_cost on each invocation. This is JIT compilation that produces shared libraries. In a multi-segment execution loop, this means one JIT compilation per segment.

let compiled_metered = if self.extensions.is_empty() {
    compile_metered(self.exe.as_ref(), &chips)
} else {
    compile_metered_with_extensions(self.exe.as_ref(), &self.extensions, &chips)
}

The non-_from_state variants have the same pattern, but those are called once. Consider caching the RvrCompiled in the instance struct if the config doesn't change between calls, similar to how RvrPureInstance stores its compiled field.

4. RvrPureInstance::execute limited path returns state_from_rvr which re-creates memory from init image (Medium)

crates/vm/src/arch/rvr/pure.rs:58-66 — After execute_with_limit, the code calls state_from_rvr which re-creates guest memory from the EXE init image via create_memory_image (vm.rs:184) and then overwrites it with the RVR memory. This double-initializes memory — first from the init image, then from the actual rvr state — which is wasteful. More importantly, the init image write may set cells that the RVR execution modified, then the write_rvr_memory_to_guest_memory call overwrites them with the correct values. This works but is fragile and unnecessarily expensive.

The execute_from_state methods don't have this issue because they operate on an existing GuestMemory.

5. Error variant reuse — StaticProgramError::FailToGenerateDynamicLibrary used for RVR compile errors (Low)

crates/vm/src/arch/vm.rs:160-164map_rvr_compile_error maps to StaticProgramError::FailToGenerateDynamicLibrary, originally named for the AOT backend. Using it for RVR compile errors conflates different backends. The new ExecutionError::RvrExecution variant (line 86) for runtime errors is good — consider adding a parallel StaticProgramError::RvrCompilation variant for compile-time errors to improve error diagnostics.

6. Repeated boilerplate across _from_state methods (Low)

execute_from_state (pure.rs), execute_metered_from_state (metered.rs), and execute_metered_cost_from_state (metered_cost.rs) share ~40 lines of identical setup:

  • Extract pc, guest_memory, streams, rng, metrics from VmState
  • Allocate GuardedMemory, init state, copy memory
  • Build io_state, set hint/pv/rng
  • Execute, check outcome, write back

Consider extracting common helpers (e.g., prepare_rvr_from_state / finalize_rvr_to_state) to reduce duplication and lower the chance of them drifting out of sync.

7. metered.rs should_segment interaction count formula differs from SegmentationCtx (Nit)

crates/vm/src/arch/rvr/metered.rs:525-530 vs segment_ctx.rs:288-295 — The RVR version computes total interactions as:

(h + 1) as usize * i  // h = raw height (not padded)

while the canonical SegmentationCtx::calculate_total_interactions at segment_ctx.rs:288-295 also uses (height + 1) * interactions. However, the should_segment in segment_ctx.rs:327-351 iterates using padded_height (next_power_of_two) for the main/interaction count, but then separately calls calculate_total_interactions which uses raw heights. The RVR version's should_segment uses raw heights for interactions too, which appears consistent, but the main_cnt loop at line 507 uses padded_height — this matches.

One subtle difference: the RVR should_segment computes memory at line 512 but doesn't call ceil_weighted_bytes — it uses .ceil() as usize directly. This should produce the same result but is slightly different code that could diverge over time. Consider reusing the shared helper or at least adding a note that these must stay in sync.


Positive Notes

  • compile_error! guard (crates/vm/src/lib.rs:11-12) for aot+rvr mutual exclusion is already in place — the previous reviews flagged this but it was already resolved.
  • GuardedMemory::as_mut_ptr (rvr-state/src/memory.rs:93-95) properly separates read/mutable pointer access, resolving the previous review's UB concern about as_ptr being used for writes.
  • Derive macro cleanup: Removing VmRvrExtension trait and folding create_rvr_extensions into VmExecutionConfig is a good simplification — fewer traits, less generated code, and cleaner extension composition.
  • ensure_rvr_outcome helper centralizes termination/suspension checking across all three instance types.
  • SegmentationCtx getter additions (widths(), interactions(), config(), segment_check_insns()) are clean, well-scoped additions that give the RVR metered instance access to segmentation config without field duplication.
  • Hint stream support: Properly threading hint_stream through all execute functions and _from_state variants.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant