Skip to content

feat: enable rvr execution of benchmarks#2753

Open
gdmlcjs wants to merge 1 commit intofeat/rvr-instancefrom
feat/rvr-benchmarks
Open

feat: enable rvr execution of benchmarks#2753
gdmlcjs wants to merge 1 commit intofeat/rvr-instancefrom
feat/rvr-benchmarks

Conversation

@gdmlcjs
Copy link
Copy Markdown

@gdmlcjs gdmlcjs commented Apr 24, 2026

Enables running benchmarks through rvr extension. Benchmark tests do not check execution correctness and currently execution involving extensions other than RV32IM fail because VmRvrExtension trait implementation is not properly wired.

closes INT-7480

@gdmlcjs gdmlcjs requested a review from shuklaayush April 24, 2026 22:04
@gdmlcjs gdmlcjs self-assigned this Apr 24, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 24, 2026

Claude finished @gdmlcjs's task in 3m 3s —— View job


Code Review: PR #2753 — feat: enable rvr execution of benchmarks

  • Gather context and read changed files
  • Review benchmark execution logic
  • Review RVR metered changes
  • Post review feedback

Overall this is a clean, well-structured change that follows the existing patterns (AOT caching, feature-gated branches) closely. A few observations:

Feature flag structure looks correct

The priority chain aot > rvr > interpreted is consistent across benchmark_execute and benchmark_execute_metered. For benchmark_execute_metered_cost, the simpler rvr / not(rvr) split is fine because the underlying metered_cost_instance() API in vm.rs is itself gated #[cfg(not(feature = "rvr"))] (covering both aot and default modes). No issue here.

The rvr feature in Cargo.toml correctly propagates to all extension circuit crates — this was the missing piece that the PR description mentions.

Nits / suggestions

  1. rvr-openvm-lift dependency may be unnecessary (Cargo.toml:30, Cargo.toml:68)
    The dep:rvr-openvm-lift is listed in the rvr feature and added as an optional dependency, but it's never directly imported anywhere in the benchmark crate (no use rvr_openvm_lift::... in any .rs file). It appears to be purely transitive through openvm-circuit/rvr. If it's only needed to ensure the dependency is pulled in, the sub-crate feature flags (openvm-circuit/rvr, etc.) should already handle that. Consider removing it unless there's a reason it needs to be a direct dependency.

  2. Duplicated VM setup boilerplate (execute.rs:386-394 vs execute.rs:414-423 vs execute.rs:443-446)
    The three create_*_rvr_instance functions each re-derive executor_idx_to_air_idx by constructing a VirtualMachine. create_rvr_instance and create_metered_rvr_instance create a full VirtualMachine just for this mapping, while create_metered_cost_rvr_instance pulls it from metered_cost_setup(). Consider extracting the executor_idx_to_air_idx computation into a shared OnceLock (similar to how metered_cost_setup already caches it) so you don't spin up redundant VMs. This follows the existing pattern — metered_cost_setup() already caches (MeteredCostCtx, Vec<usize>) in a OnceLock.

  3. TODO comments in core crate are a nice callout (metered.rs:759-760, metered_cost.rs:231-232)
    The TODOs about hoisting compilation to instance construction are good breadcrumbs. These are performance-relevant for benchmarks since compile_metered / compile_metered_cost are called on every execution rather than once at instance creation time. Worth tracking as a follow-up if this becomes a bottleneck in benchmark measurements.

  4. Minor: benchmark_execute_metered_cost is ignore = true (execute.rs:552)
    This benchmark is annotated with ignore = true in the #[divan::bench] attribute, meaning the newly added RVR path won't actually run unless explicitly un-ignored. This is fine if intentional (the original was also ignored), but worth noting that the new RVR path hasn't been exercised via CI benchmarks yet.

No issues found

  • Feature gate logic is sound and consistent with the existing AOT pattern
  • The #[cfg] gates don't create dead code or unreachable paths
  • Cache + Arc pattern for instance reuse across benchmark iterations is correct
  • The MeteredCtx.clone() in bench closures matches how the AOT variant handles it

Looks good to merge.

@github-actions
Copy link
Copy Markdown

group app.proof_time_ms app.cycles leaf.proof_time_ms
fibonacci 3,870 12,000,265 960
keccak 18,580 18,655,329 3,307
sha2_bench 9,029 14,793,960 1,397
regex 1,433 4,137,067 376
ecrecover 642 123,583 271
pairing 902 1,745,757 285
kitchen_sink 2,090 2,579,903 436

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 0c3ad8c

Benchmark Workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant