test(DON'T MERGE): develop-v2.1.0 rv64#2719
test(DON'T MERGE): develop-v2.1.0 rv64#2719shuklaayush wants to merge 48 commits intodevelop-v2.0.0-rc.1from
Conversation
|
Claude finished @shuklaayush's task in 5m 13s —— View job Code Review: RV64 Support PR
OverviewThis PR adds RV64IM support to OpenVM across 302 files. The major changes are:
Bugs1.
|
bc66c89 to
7672360
Compare
|
Claude finished @shuklaayush's task in 4m 48s —— View job Code Review: RV64 Support PR (Re-review after sync)
OverviewRe-reviewing after the latest Previous Bug #2 (hint stream length encoding) was fixed — now correctly uses Bugs (Still Open)1.
|
| Priority | Issue | Impact |
|---|---|---|
| High | #1: config.rv32i → config.rv64i |
Blocks compilation under cuda/aot features |
| High | #2-5: AOT layer still assumes 32-bit registers | Entire AOT code path is broken for RV64 |
| Medium | #4: CUDA extension not ported | Blocks GPU proving (likely tracked separately) |
| Low | #6: RV32_*_AS naming in execution code |
Cosmetic, functionally correct |
| Low | #7: Inner guest/guest/ directory |
Cleanup needed, no runtime impact |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The main execution, circuit, and transpiler changes are well-structured. The primary risk is the AOT layer, which appears to still assume 4-limb (32-bit) registers throughout.
|
Claude finished @shuklaayush's task in 4m 47s —— View job Code Review: RV64 Support PR (Re-review after sync)
OverviewRe-reviewing after the latest sync. The PR is at 313 files changed, 15544 insertions, 6823 deletions across 15 commits. The new commit Previously fixed issues:
Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
299 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
419 |
All are behind #[cfg(feature = "aot")] or #[cfg(feature = "cuda")], so they don't break default builds, but will fail with those features enabled.
2. AOT load_sign_extend impl is dead code — wrong const generic
load_sign_extend/aot.rs:20-21 implements AotExecutor for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs), but the actual type alias Rv64LoadSignExtendExecutor uses RV64_REGISTER_NUM_LIMBS (8 limbs) — see load_sign_extend/mod.rs:24-25. The AOT impls (lines 20 and 41) will never be dispatched for any RV64 LoadSignExtendExecutor.
Same pattern applies to the metered impl at line 40-41.
3. AOT register stride hardcodes /4 instead of /8
The entire AOT x86 code generation layer maps RISC-V register addresses to x86 XMM registers using a 4-limb stride (a / 4). With RV64's 8-limb registers, these produce wrong XMM indices:
load_sign_extend/aot.rs:115-116—a / 4,b / 4loadstore/aot.rs:150-151—a / 4,b / 4branch_lt/aot.rs:40-41—a / 4,b / 4auipc/execution.rs:106—a / 4branch_eq/execution.rs:131-132—a / 4,b / 4jal_lui/execution.rs:114—a / 4divrem_w/execution.rs:121-123—a / 4,b / 4,c / 4- All
*_Wexecution files (base_alu_w,shift_w,mul_w) use same pattern
The helper gpr_to_rv32_register (common/mod.rs:25-31) and xmm_to_gpr (common/mod.rs:61-85) also assume 2 registers per XMM (the RV32 4-limb layout). For RV64, a register like x5 (byte offset 40 = 5×8) would map to 40/4 = 10 → xmm5, instead of the correct 40/8 = 5 → xmm2.
This is a systemic issue — the AOT layer needs a fundamental redesign for 64-bit register values since XMM elements are 32-bit (pinsrd/pextrd). The CI workflow commenting (chore(ci): comment out aot workflows) suggests this is a known deferral.
4. AOT metered code uses RV32_REGISTER_AS in RV64 *_W chips
The metered path in *_W execution files calls update_adapter_heights_asm(config, RV32_REGISTER_AS):
divrem_w/execution.rs:292-296base_alu_w/execution.rs:251-256shift_w/execution.rs:250-255mul_w/execution.rs:207-211
While RV32_REGISTER_AS == RV64_REGISTER_AS == 1 (functionally identical), update_adapter_heights_asm in common/mod.rs:107 also prints error messages containing "RV32_REGISTER_AS", which is misleading for RV64.
5. CUDA extension references old Rv32* types
extension/cuda.rs:11-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, Rv32JalLuiAir, Rv32HintStoreAir, etc. — all of which were renamed to Rv64* in this PR. Every GPU chip construction (lines 39-113) uses the wrong types. Also missing: new *_W chip registrations for GPU backend.
This is clearly a known deferral for this development branch.
Code Quality Issues
6. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ still uses:
cfg(target_os = "zkvm")guard (old) vs the outer'scfg(openvm_intrinsics)(new)hint_buffer_u32!(RV32) vs the outer'shint_buffer_u64!(RV64)RV32_ALU_OPCODEvs the outer'sRV64_ALU_OPCODE
The workspace points to the outer directory. The inner one is a confusing leftover and should be removed.
7. RV32_CELL_BITS in auipc/execution.rs:100
The AUIPC AOT code computes rd = pc.wrapping_add((c as u32) << RV32_CELL_BITS). Since RV32_CELL_BITS == RV64_CELL_BITS == 8, this is functionally correct but the naming is misleading in an RV64 codebase.
Architecture Observations (Positive)
- Decoder crate is well-structured with correct RV64 shift amount handling (6-bit
shamtviafunct6for 64-bit shifts, 5-bit for W-suffix withshamt < 32validation) - Phantom sub-executors properly updated:
read_rv64_register, 8-byte hint stream alignment,u64::to_le_bytes()for hint lengths - Toolchain migration is thorough: serializer/deserializer use
WORD_SIZE=8, platform memory layout uses 8-byte alignment,read_u64()andread_vec_by_len()correctly handle dword boundaries *_Wadapter/core pattern is well-designed: adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension of 32-bit results back to 64-bit registers- Non-AOT execution paths (base_alu, shift, less_than, branch_eq, branch_lt, loadstore, hintstore) correctly use
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS, andvm_read::<u8, 8> - Test coverage for new W-chips is thorough with proper sign extension edge cases and negative tests for constraint violations
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot features |
Open |
| High | #2-3: AOT layer assumes 32-bit registers | Entire AOT code path broken for RV64 | Known deferral (CI disabled) |
| Medium | #5: CUDA extension not ported | GPU proving broken | Known deferral |
| Low | #4: RV32_REGISTER_AS in *_W metered paths |
Functionally correct, misleading naming | Open |
| Low | #6: Inner guest/guest/ directory |
Confusing leftover, no runtime impact | Open |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The execution, circuit, transpiler, and toolchain changes are well-structured and consistent. The *_W chip design with split-width adapters is clean. The remaining issues are confined to the AOT/CUDA layers, which are acknowledged as not yet ported.
Recommended before merge:
- Fix
config.rv32i→config.rv64iin the 5 locations (trivial) - Remove the inner
guest/guest/directory - Track AOT and CUDA port as separate work items
|
Claude finished @shuklaayush's task in 5m 46s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously fixed issues (confirmed):
Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
299 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
419 |
All are behind #[cfg(feature = "cuda")] or #[cfg(feature = "aot")], so they don't break default builds but will fail with those features.
2. AOT load_sign_extend impl is dead code — wrong const generic
load_sign_extend/aot.rs:20-21 implements traits for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs), but the actual type alias Rv64LoadSignExtendExecutor uses RV64_REGISTER_NUM_LIMBS (8 limbs) — see load_sign_extend/mod.rs:24-25. Same for metered impl at line 40-41. These impls will never be dispatched for any RV64 type.
3. AOT layer assumes 32-bit registers throughout
The AOT x86 code generation layer is fundamentally broken for RV64:
- Register stride:
a / 4hardcoded inload_sign_extend/aot.rs:115,loadstore/aot.rs:150,branch_lt/aot.rs:40, and all*_Wexecution files. With RV64's 8-byte registers, these produce wrong XMM register indices. - XMM helpers:
gpr_to_rv32_registerandxmm_to_gprincommon/mod.rs:25-85assume 2 RISC-V registers per XMM (32-bit × 2 = 64-bit XMM), but RV64 registers are 64 bits each — need 1:1 mapping or use different SSE/AVX storage. *_Wexecution AOT impls:base_alu_w/execution.rs,shift_w/execution.rs,mul_w/execution.rs,divrem_w/execution.rs— all referenceRV32_REGISTER_AS,RV32_IMM_AS,RV32_REGISTER_NUM_LIMBSin their AOT trait implementations.update_adapter_heights_asmincommon/mod.rs:107-117— error messages say "RV32_REGISTER_AS" and "RV32_MEMORY_AS".
The CI disabling (chore(ci): comment out aot workflows) acknowledges this. The entire AOT layer needs a redesign for 64-bit values since the x86 XMM register mapping is fundamentally 32-bit.
4. CUDA extension references old Rv32* types
extension/cuda.rs:11-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, Rv32JalLuiAir, Rv32HintStoreAir, etc. — all renamed to Rv64*. All GPU chip files (*/cuda.rs) also use RV32_REGISTER_NUM_LIMBS and RV32_CELL_BITS. Also missing: new *_W chip registrations for GPU backend.
This is clearly a known deferral for this development branch.
Code Quality Issues
5. Serializer Vec::with_capacity over-allocates
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. A 64-byte value allocates capacity for 64 u64 words (512 bytes) instead of 8 u64 words (64 bytes). This was the same pattern with Vec<u32> in RV32 (4× over-allocation → now 8× over-allocation). Functionally correct but wastes memory.
Fix: Vec::with_capacity(core::mem::size_of_val(value) / WORD_SIZE) or just Vec::new().
6. Hardcoded 8 instead of WORD_SIZE in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;Should use WORD_SIZE constant for consistency, as done elsewhere in the codebase.
7. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code:
- Uses
hint_buffer_u32!(RV32) vs the outer'shint_buffer_u64!(RV64) - Uses
cfg(target_os = "zkvm")(old) vscfg(openvm_intrinsics)(new) - Uses
RV32_ALU_OPCODEvsRV64_ALU_OPCODE
The workspace points to the outer directory. The inner one should be removed.
8. cargo doc disabled in CI
lints.yml comments out the cargo doc step, suggesting docs may not build cleanly. This should be tracked and re-enabled.
Architecture Observations (Positive)
-
Decoder crate is well-structured. RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32) are correctly implemented.ITypeShamtproperly masks to 6 bits (0x3f). -
ELF loader correctly validates
Class::ELF64, reads instructions in 4-byte chunks (RISC-V instructions are always 32-bit), and constrains the address space to u32. -
Adapter/core pattern is well maintained. The
*_Wadapters (Rv64BaseAluWAdapter,Rv64MultWAdapter) correctly read full 8-byte registers, expose lower 4 bytes to cores, and handle sign extension via XOR-based sign bit extraction. -
Address space encoding is correct:
rs2_asconstrained to boolean maps directly toRV64_IMM_AS=0/RV64_REGISTER_AS=1. -
Sign extension is correctly implemented across all W-suffix chips using the
(result as i32 as i64 as u64)chain. -
Transpiler utility functions (
from_r_type,from_i_type, etc.) correctly useRV64_REGISTER_NUM_LIMBSfor register address scaling. -
Phantom sub-executors properly use
read_rv64_register(), 8-byte hint stream alignment, andu64::to_le_bytes(). -
Test coverage for new W-chips is thorough with sign extension edge cases, negative tests for constraint violations, and comprehensive boundary value testing.
-
Non-AOT, non-CUDA execution paths (base_alu, shift, less_than, branch_eq, branch_lt, loadstore, hintstore) correctly use
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS, andvm_read::<u8, 8>.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2-3: AOT layer assumes 32-bit registers | Entire AOT code path broken | Known deferral (CI disabled) |
| Medium | #4: CUDA extension not ported | GPU proving broken | Known deferral |
| Low | #5: Serializer over-allocates capacity | Wasted memory | Minor |
| Low | #6: Hardcoded 8 vs WORD_SIZE |
Style inconsistency | Minor |
| Low | #7: Inner guest/guest/ directory |
Confusing leftover | Cleanup |
| Low | #8: cargo doc disabled |
Docs may not build | Track |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The execution, circuit, transpiler, and toolchain changes are well-structured and consistent. The *_W chip design with split-width adapters is clean. The decoder crate properly handles all RV64 instruction encoding differences.
Recommended before merge:
- Fix
config.rv32i→config.rv64iin the 5 locations (trivial, blocks feature-gated compilation) - Remove the inner
guest/guest/directory (confusing leftover) - Track AOT and CUDA port as separate work items with issues
|
Claude finished @shuklaayush's task in 4m 41s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously fixed issues (confirmed):
Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
Additionally, test files behind #[cfg(feature = "cuda")] or #[cfg(feature = "aot")]:
| File | Line |
|---|---|
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
2. AOT impls reference non-existent Rv32* executor types (Compilation error under aot feature)
Several AotExecutor and AotMeteredExecutor impls implement traits for types that were renamed to Rv64* but the AOT blocks still use the old names:
| File | Line | Has Rv32... |
Should be Rv64... |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
These are dead code — the impls won't be found for the actual Rv64* types. They'll fail to compile under aot feature since the Rv32* structs don't exist.
3. AOT load_sign_extend impl uses wrong const generic — dead code
load_sign_extend/aot.rs:20-21 implements traits for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs), but the actual type alias Rv64LoadSignExtendExecutor uses RV64_REGISTER_NUM_LIMBS (8 limbs) — see load_sign_extend/mod.rs:24-25. Same for metered impl at line 41. These impls will never be dispatched for any RV64 type.
4. AOT register stride hardcodes /4 instead of /8
The AOT x86 code generation layer maps RISC-V register addresses to XMM registers using a 4-limb stride. With RV64's 8-byte registers, all register index calculations produce wrong results:
jal_lui/execution.rs:109—a / 4auipc/execution.rs:106—a / 4jalr/execution.rs:114—a / 4,b / 4load_sign_extend/aot.rs:115-116—a / 4,b / 4loadstore/aot.rs:150-151—a / 4,b / 4branch_lt/aot.rs:40-41—a / 4,b / 4- All
*_Wexecution files (base_alu_w,shift_w,mul_w,divrem_w)
The helper gpr_to_rv32_register (common/mod.rs:25-31) and xmm_to_gpr (common/mod.rs:61-85) also assume the RV32 2-registers-per-XMM layout. The entire AOT layer needs a fundamental redesign for 64-bit register values. The CI disabling (chore(ci): comment out aot workflows) acknowledges this.
5. CUDA extension references old Rv32* types
extension/cuda.rs:11-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, Rv32JalLuiAir, Rv32HintStoreAir, etc. — all renamed to Rv64*. All GPU chip files (*/cuda.rs) also use RV32_REGISTER_NUM_LIMBS and RV32_CELL_BITS. Also missing: new *_W chip registrations. This is clearly a known deferral.
6. *_W AOT impls use RV32_REGISTER_NUM_LIMBS for generic param
The AOT AotExecutor and AotMeteredExecutor impls for the W-suffix chips use RV32_REGISTER_NUM_LIMBS (4) in their generic parameter:
| File | Lines |
|---|---|
base_alu_w/execution.rs |
166, 234 |
shift_w/execution.rs |
(same pattern) |
mul_w/execution.rs |
(same pattern) |
divrem_w/execution.rs |
108, 273 |
These implement BaseAluExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> but the actual types use RV64_REGISTER_NUM_LIMBS. These impls will never be dispatched.
Code Quality Issues
7. Serializer Vec::with_capacity over-allocates
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. A 64-byte value allocates capacity for 64 u64 words (512 bytes) instead of 8 (64 bytes). This is an 8x over-allocation. Functionally correct but wasteful.
Fix: Vec::with_capacity(core::mem::size_of_val(value) / 8) or just Vec::new().
8. Hardcoded 8 instead of WORD_SIZE in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;Should use a named constant for consistency with the rest of the codebase.
9. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code:
- Uses
hint_buffer_u32!(RV32) vs the outer'shint_buffer_u64!(RV64) - Uses
cfg(target_os = "zkvm")(old) vscfg(openvm_intrinsics)(new)
The workspace points to the outer directory. The inner one should be removed.
10. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {Leftover temp function with debug marker — should be cleaned up or properly named before merge.
11. 202 remaining RV32_* constant references across 50 files
The extensions/riscv/circuit/src/ directory still has 202 occurrences of RV32_REGISTER_AS, RV32_MEMORY_AS, RV32_REGISTER_NUM_LIMBS, RV32_CELL_BITS, and RV32_IMM_AS spread across 50 files. Most are in AOT (*/aot.rs), CUDA (*/cuda.rs), and test code behind feature gates. While RV32_REGISTER_AS == RV64_REGISTER_AS == 1 and RV32_CELL_BITS == RV64_CELL_BITS == 8 (functionally identical), RV32_REGISTER_NUM_LIMBS (4) ≠ RV64_REGISTER_NUM_LIMBS (8) — so any reference to RV32_REGISTER_NUM_LIMBS in non-feature-gated code is a potential logic bug.
12. auipc/tests.rs:419 uses RV32_REGISTER_NUM_LIMBS (value bug)
let a = gen_pointer(&mut rng, RV32_REGISTER_NUM_LIMBS);This generates pointers with stride 4 instead of 8, which produces incorrect register addresses for RV64 in the CUDA test path.
Architecture Observations (Positive)
-
Decoder crate is well-structured. RV64 shift amounts (6-bit
shamtviafunct6, mask0x3f) and W-suffix validation (shamt < 32) are correctly implemented.ITypeShamtproperly handles both 5-bit and 6-bit shift amounts. -
ELF loader correctly validates
Class::ELF64, usestry_into()for u64→u32 conversions with proper error handling, and constrains addresses to u32. -
Non-AOT execution paths are solid —
base_alu,shift,less_than,branch_eq,branch_lt,loadstore,hintstore,jalr,jal_lui,auipcall correctly useRV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS, andvm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64).to_le_bytes(), and constrain upper bytes through XOR-based sign bit extraction. -
imm_to_bytesandimm_to_u64(adapters/mod.rs:70,86) correctly handle 12-bit signed immediate sign extension to 64 bits via the 24-bit encoding. -
abstract_compose(adapters/mod.rs:292) was correctly fixed to usefrom_u64(1u64 << ...)to avoid overflow for RV64 register widths. -
Transpiler utility functions (
from_r_type,from_i_type, etc.) correctly useRV64_REGISTER_NUM_LIMBSfor register address scaling. -
Test coverage for the new chips is thorough — comprehensive sign extension edge cases, negative tests for constraint violations, boundary value testing.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2: AOT impls reference Rv32*Executor types (8 impls across 4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #3-4,6: AOT layer assumes 32-bit registers | Entire AOT code path broken | Known deferral (CI disabled) |
| Medium | #5: CUDA extension not ported | GPU proving broken | Known deferral |
| Low | #7: Serializer 8x over-allocation | Wasted memory | Minor |
| Low | #8: Hardcoded 8 vs named constant |
Style | Minor |
| Low | #9: Inner guest/guest/ directory |
Confusing leftover | Cleanup |
| Low | #10: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #11-12: RV32_* naming in 50 files |
Misleading, few are value bugs | Tracked |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The execution, circuit, transpiler, and toolchain changes are well-structured and consistent. The latest commit successfully fixes the abstract_compose overflow, updates algebra extension register reads, and applies proper clippy/lint fixes.
Recommended before merge:
- Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin AOT impls across 4 files (trivial, 8 occurrences) - Remove the inner
guest/guest/directory - Remove
TEMP[jpw]marker / renametmp_convert_to_u8s - Track AOT and CUDA port as separate work items
|
Claude finished @shuklaayush's task in 5m 27s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously fixed issues (confirmed):
Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
All are behind #[cfg(feature = "cuda")] or #[cfg(feature = "aot")], so they don't break default builds but will fail with those features enabled.
2. AOT impls reference non-existent Rv32*Executor types (Compilation error under aot feature)
Several AotExecutor and AotMeteredExecutor impls are written for Rv32* struct names, but the actual struct types were renamed to Rv64*. These are dead code — the impls apply to types that don't exist:
| File | Lines | Has Rv32... |
Should be Rv64... |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor<A> |
Rv64JalLuiExecutor<A> |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor<A> |
Rv64AuipcExecutor<A> |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor<A> |
Rv64JalrExecutor<A> |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
3. AOT load_sign_extend impl uses wrong const generic — dead code
load_sign_extend/aot.rs:20-21 implements traits for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs), but the actual type alias Rv64LoadSignExtendExecutor uses RV64_REGISTER_NUM_LIMBS (8 limbs) — see load_sign_extend/mod.rs:24-25. Same for metered impl at line 40-41. These impls will never be dispatched for any RV64 type. Unlike the *_W chips where RV32_REGISTER_NUM_LIMBS == RV64_WORD_NUM_LIMBS == 4, here the actual executor uses 8 limbs.
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but only imports MulWOpcode from openvm_riscv_transpiler (line 14). MulOpcode is not re-exported through crate::common::*. This will fail to compile under the aot feature.
5. CUDA extension references old Rv32* types (Compilation error under cuda)
extension/cuda.rs:11-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, Rv32JalLuiAir, Rv32HintStoreAir, etc. — all renamed to Rv64*. Also missing: new *_W chip registrations for GPU backend. This is clearly a known deferral (CI disabled).
Code Quality Issues
6. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. A 64-byte value allocates capacity for 64 u64 words (512 bytes) instead of 8 (64 bytes). Functionally correct but 8× over-allocation.
7. Hardcoded 8 instead of named constant in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;Should use a named constant (e.g., WORD_SIZE) for consistency with the rest of the codebase.
8. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code (hint_buffer_u32!, RV32_ALU_OPCODE, old cfg(target_os = "zkvm") guard). The workspace points to the outer directory. The inner one should be removed.
9. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {Leftover temp function with debug marker — should be cleaned up or properly named before merge.
10. RV32_* constant naming in *_W and execution AOT code
The *_W execution files (base_alu_w, shift_w, mul_w, divrem_w) use RV32_REGISTER_AS and RV32_IMM_AS in their AOT trait implementations. While these constants are numerically equal to their RV64_* counterparts (1 and 0 respectively), it's misleading. Similarly, base_alu_w/execution.rs:166,234 uses RV32_REGISTER_NUM_LIMBS (4) where RV64_WORD_NUM_LIMBS (also 4) would be the semantically correct constant.
Architecture Observations (Positive)
-
Decoder crate is well-structured. RV64 shift amounts (6-bit
shamtviafunct6, mask0x3f) and W-suffix validation (shamt < 32) are correctly implemented.ITypeShamtproperly handles both 5-bit and 6-bit shift amounts. -
Range checker sizes (latest commit e12e402) are now correctly computed:
[1 << RV64_CELL_BITS, 2 * RV64_REGISTER_NUM_LIMBS * (1 << RV64_CELL_BITS)]=[256, 4096]. The carry bound formula2N × 2^Bfor N=8, B=8 is a standard conservative upper bound for N-limb multiplication intermediate values. -
ELF loader correctly validates
Class::ELF64, reads instructions in 4-byte chunks (RISC-V instructions are always 32-bit even for RV64), and constrains the address space to u32. -
Non-AOT execution paths are solid —
base_alu,shift,less_than,branch_eq,branch_lt,loadstore,hintstore,jalr,jal_lui,auipcall correctly useRV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS, andvm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64).to_le_bytes(), and constrain upper bytes through XOR-based sign bit extraction. -
Transpiler utility functions (
from_r_type,from_i_type, etc.) correctly useRV64_REGISTER_NUM_LIMBSfor register address scaling.imm_to_u64andimm_to_bytesproperly handle 12-bit signed immediate sign-extension to 64 bits via the 24-bit encoding. -
Phantom sub-executors properly use
read_rv64_register(), 8-byte hint stream alignment, andu64::to_le_bytes(). -
Test coverage for new chips is thorough — comprehensive sign extension edge cases, negative tests for constraint violations, and boundary value testing. New transpiler tests with hand-written RV64 assembly (
rv64im-stress.S,rv64im-intrin.S) cover all RV64IM instruction categories.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2: AOT impls reference Rv32*Executor types (8 impls across 4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #3: load_sign_extend/aot.rs wrong const generic |
Dead AOT impl | Part of AOT port |
| Medium | #4: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
Trivial fix |
| Medium | #5: CUDA extension not ported | GPU proving broken | Known deferral |
| Low | #6: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #7: Hardcoded 8 vs named constant |
Style | Minor |
| Low | #8: Inner guest/guest/ directory |
Confusing leftover | Cleanup |
| Low | #9: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #10: RV32_* naming in AOT paths |
Misleading, functionally correct | Tracked |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The execution, circuit, transpiler, and toolchain changes are well-structured and consistent. The latest commit correctly fixes the range checker sizes for RV64 multiplication. The *_W chip design with split-width adapters is clean.
Recommended before merge:
- Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin AOT impls across 4 files (trivial, 8 occurrences) - Remove the inner
guest/guest/directory - Remove
TEMP[jpw]marker / renametmp_convert_to_u8s - Track AOT and CUDA port as separate work items
|
Claude finished @shuklaayush's task in 5m 30s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously fixed issues (confirmed):
New in latest commit (2d0588c):
Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
All behind #[cfg(feature = "cuda")] or #[cfg(feature = "aot")] — won't break default builds but will fail with those features.
2. AOT impls reference non-existent Rv32*Executor types (Compilation error under aot feature)
The actual struct types were renamed to Rv64*, but AOT impl blocks still name Rv32*:
| File | Lines | Has | Should be |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor<A> |
Rv64JalLuiExecutor<A> |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor<A> |
Rv64AuipcExecutor<A> |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor<A> |
Rv64JalrExecutor<A> |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
These are dead code — the impls implement traits for types that don't exist. Will fail to compile under aot feature.
3. AOT load_sign_extend impl uses wrong const generic — dead code
load_sign_extend/aot.rs:20-21 implements traits for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs), but the actual type alias Rv64LoadSignExtendExecutor uses RV64_REGISTER_NUM_LIMBS (8 limbs) — see load_sign_extend/mod.rs:24-25. Same for metered impl at line 41. These impls will never be dispatched.
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but only imports MulWOpcode from openvm_riscv_transpiler (line 14). MulOpcode is not imported anywhere in the file or re-exported through crate::common::*. This will fail to compile under the aot feature.
5. AOT layer assumes 32-bit registers throughout (systemic)
The AOT x86 code generation layer is fundamentally broken for RV64:
- Register stride:
a / 4hardcoded injal_lui/execution.rs:109,auipc/execution.rs:106,jalr/execution.rs:114,load_sign_extend/aot.rs:115,loadstore/aot.rs:150,branch_lt/aot.rs:40, and all*_Wexecution files. With RV64's 8-byte registers, these produce wrong XMM register indices. - XMM helpers:
gpr_to_rv32_registerandxmm_to_gprincommon/mod.rs:25-85assume 2 RISC-V registers per XMM (32-bit × 2 = 64-bit XMM), but RV64 registers are 64 bits each. *_WAOT impls useRV32_REGISTER_NUM_LIMBS(4) in const generics where the actual executor types useRV64_REGISTER_NUM_LIMBS(8) — making these impls dead code.
The CI disabling (chore(ci): comment out aot workflows) acknowledges this. The entire AOT layer needs a redesign for 64-bit register values.
6. CUDA extension references old Rv32* types (Compilation error under cuda)
extension/cuda.rs:11-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, Rv32JalLuiAir, Rv32HintStoreAir, etc. — all renamed to Rv64*. Also missing: new *_W chip registrations for GPU backend. Clearly a known deferral.
New Observations on Latest Commit (2d0588c)
7. riscv64im-unknown-none-elf target requires nightly -Z build-std
The target riscv64im-unknown-none-elf is not a built-in Rust target — it requires -Z build-std (nightly-only). This is correctly configured (lines 268-273 of build/src/lib.rs) and the nightly toolchain is already required for guest compilation. The ISA string riscv64im correctly encodes base integer (I) + multiplication/division (M).
Note: The build-std removing std and proc_macro means guest programs can only use no_std. The #[ignore] annotations on test_rv64im_std and test_hashmap with TODO(rv64-std) comments correctly acknowledge this limitation.
8. openvm_intrinsics cfg flag is well-integrated
The migration from cfg(target_os = "zkvm") to cfg(openvm_intrinsics) is clean:
- Set in
encode_rust_flags(build/src/lib.rs:310-311) check-cfglint configured in bothopenvm/Cargo.tomlandguest/Cargo.toml- Consistently used across
openvm/src/lib.rs,io/mod.rs,io/read.rs,memory.rs - No stale
cfg(target_os = "zkvm")references remain in the outer guest or toolchain crates
Code Quality Issues
9. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code:
- Uses
cfg(target_os = "zkvm")(old) vs the outer'scfg(openvm_intrinsics)(new) - Uses
hint_buffer_u32!/RV32_ALU_OPCODE(RV32) vs the outer'shint_buffer_u64!/RV64_ALU_OPCODE(RV64) - Lacks the
check-cfglint configuration foropenvm_intrinsics
The workspace points to the outer directory. The inner one should be removed to avoid confusion.
10. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. An 8× over-allocation. Functionally correct.
11. Hardcoded 8 instead of WORD_SIZE in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;WORD_SIZE is defined as 8 in platform/src/lib.rs:23. Should use the constant for consistency.
12. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {Leftover temp function with debug marker.
Architecture Observations (Positive)
-
Toolchain migration is thorough: The
openvm_intrinsicscfg flag is a cleaner approach thantarget_os = "zkvm", decoupling intrinsic availability from the target triple. Theriscv64im-unknown-none-elftarget correctly uses-Z build-stdwithoutstd. -
Entry point (
lib.rs:122-137) correctly usesld sp, 0(sp)(load doubleword) andSTACK_TOP: u64. The fault handler usessd x0, 1(x0)(store doubleword to misaligned address). -
Host hint stream (
host.rs) properly usesu64::to_le_bytes()for length encoding, consistent with guest-sideread_u64(). -
ELF loader correctly validates
Class::ELF64, usestry_into()for allu64→u32address conversions with error handling, and keeps 4-byte instruction chunks (RISC-V instructions are always 32-bit even in RV64). -
Decoder crate properly handles RV64 shift amounts (6-bit
shamtviafunct6, mask0x3f) and W-suffix validation (shamt < 32). -
Non-AOT execution paths are solid —
base_alu,shift,less_than,branch_eq,branch_lt,loadstore,hintstore,jalr,jal_lui,auipcall correctly useRV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS, andvm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64).to_le_bytes(). -
Test coverage is thorough — hand-written RV64 assembly (
rv64im-stress.S,rv64im-intrin.S), comprehensive sign extension edge cases, negative tests for constraint violations. -
build-stdchange correctly removesstdandproc_macro, acknowledgingno_std-only guest programs with properly#[ignore]d std tests and clear TODO comments.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2: AOT impls reference Rv32*Executor types (8 impls across 4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #4: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
Trivial fix |
| High | #3,5: AOT layer assumes 32-bit registers (systemic) | Entire AOT code path broken | Known deferral (CI disabled) |
| Medium | #6: CUDA extension not ported | GPU proving broken | Known deferral |
| Low | #9: Inner guest/guest/ directory |
Confusing stale RV32 leftover | Cleanup |
| Low | #10: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #11: Hardcoded 8 vs WORD_SIZE |
Style inconsistency | Minor |
| Low | #12: TEMP[jpw] marker |
Cleanup | Minor |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The latest commit cleanly migrates the guest toolchain to riscv64im-unknown-none-elf with proper no_std build configuration and the new openvm_intrinsics cfg flag. The execution, circuit, transpiler, and toolchain changes are well-structured and consistent.
Recommended before merge:
- Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin AOT impls across 4 files (trivial, 8 occurrences) - Add missing
MulOpcodeimport inmul_w/execution.rs - Remove the inner
guest/guest/directory - Remove
TEMP[jpw]marker / renametmp_convert_to_u8s - Track AOT and CUDA port as separate work items
|
Claude finished @876pol's task in 6m 1s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously fixed issues (confirmed):
NEW: Critical Bugs Found1. Deferral extension uses 4-byte register arrays instead of 8-byte (Soundness bug)The deferral extension's AIR columns, trace records, and range checks all use AIR columns ( pub rd_val: [T; RV32_REGISTER_NUM_LIMBS], // 4 elements, should be 8
pub rs_val: [T; RV32_REGISTER_NUM_LIMBS],Trace records ( pub rd_val: [u8; RV32_REGISTER_NUM_LIMBS], // 4 bytes, should be 8
pub rs_val: [u8; RV32_REGISTER_NUM_LIMBS],Memory bridge reads ( Address range checks ( Impact: The memory bridge interactions won't match what the system memory chip expects for 8-byte register reads. The AIR constraints are unsound — they allow reading only half a register. Files affected: 2. Deferral transpiler uses wrong register stride (Address bug)
RV32_REGISTER_NUM_LIMBS * dec_insn.rd, // stride 4, should be 8
RV32_REGISTER_NUM_LIMBS * dec_insn.rs1,Register addresses are computed as 3. Algebra extension uses 4-byte register writes and wrong transpiler strideCircuit ( let mut write_data = [0u8; RV32_REGISTER_NUM_LIMBS]; // 4 bytes, should be 8Writes only 4 bytes to an 8-byte register, leaving upper bytes unwritten. Transpiler ( F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rd), // stride 4, should be 8
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rs1),Same wrong register address stride as the deferral transpiler. Tests ( 4. Bigint transpiler uses wrong register stride
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rs1), // stride 4, should be 8
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rs2),Previously Reported Bugs (Still Open)5.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
6. AOT impls reference non-existent Rv32*Executor types (Compilation error under aot)
8 impls across 4 files use old Rv32* struct names:
| File | Lines | Has | Should be |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
7. AOT load_sign_extend impl uses wrong const generic — dead code
load_sign_extend/aot.rs:20-21, 40-41 implements for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs) but the actual alias uses RV64_REGISTER_NUM_LIMBS (8).
8. AOT layer assumes 32-bit registers throughout (systemic)
Register stride /4, XMM helpers, *_W AOT impls all use 32-bit assumptions. CI is disabled (chore(ci): comment out aot workflows) — known deferral.
9. CUDA extension references old Rv32* types
extension/cuda.rs:11-18 imports non-existent Rv32* types. Known deferral.
New SHA2 Observations
The SHA2 RV64 migration is well-done overall. Constants correctly updated to RV64_*:
RV64_REGISTER_AS,RV64_MEMORY_AS,RV64_REGISTER_NUM_LIMBS,RV64_CELL_BITSall correct- Pointer handling properly constrains upper 4 bytes (limbs 4-7) to zero for 32-bit addressing
SHA2_READ_SIZE == SHA2_WRITE_SIZE == 8correctly handles 8-byte memory operations
Minor semantic issue in sha2_chips/main_chip/air.rs:345,350,355: The write loop uses STATE_READS and SHA2_READ_SIZE instead of STATE_WRITES and SHA2_WRITE_SIZE. Currently harmless because both pairs are equal (STATE_BYTES / 8), but semantically incorrect and would break if read/write sizes ever diverged.
Code Quality Issues
10. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {11. Serializer Vec::with_capacity 8× over-allocation
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes but with_capacity takes element count. Should be / 8.
12. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare the same package name. Inner directory uses stale RV32 code (hint_buffer_u32!, old cfg guards).
13. Deferral config field naming
extensions/deferral/circuit/src/extension/mod.rs:228-232 — Config fields named rv32i, rv32m but types are Rv64I, Rv64M, Rv64Io. GPU builder at extension/cuda.rs:136,141,146 references config.rv32i, config.rv32m — functionally correct but misleading.
Architecture Observations (Positive)
- SHA2 RV64 port is clean — pointer constraints properly zero-extend upper limbs, transpiler correctly uses
Rv64Sha2Opcode, and execution reads 8-byte registers. - Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation. - Non-AOT execution paths in the riscv extension are solid — consistent use of
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS,vm_read::<u8, 8>. *_Wadapter/core pattern is well-designed with proper sign extension via(result as i32 as i64 as u64).- Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment, guestopenvm_intrinsicscfg is clean.
Summary
| Priority | Issue | Impact |
|---|---|---|
| Critical | #1: Deferral AIR/trace uses 4-byte register arrays | Soundness: memory bridge mismatches, AIR constraints wrong |
| Critical | #2: Deferral transpiler wrong register stride | Wrong register addresses in transpiled instructions |
| Critical | #3: Algebra circuit 4-byte register writes + wrong transpiler stride | Partial register writes + wrong addresses |
| Critical | #4: Bigint transpiler wrong register stride | Wrong register addresses |
| High | #5: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
| High | #6: AOT Rv32*Executor → Rv64*Executor (4 files) |
Dead code + compilation failure |
| Medium | #7-9: AOT/CUDA systemic issues | Known deferral (CI disabled) |
| Low | #10-13: Code quality | Cleanup items |
The core riscv extension (non-AOT, non-CUDA) is solid, but the deferral, algebra, and bigint extensions have critical RV32_REGISTER_NUM_LIMBS → RV64_REGISTER_NUM_LIMBS migration gaps that affect both AIR soundness and transpiler correctness.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSin AIR columns, trace records, range checks, and transpiler (affects ~50 locations across 6 files) - Fix algebra extension — update
is_eq.rsregister write size, transpiler stride, and test constants - Fix bigint transpiler — update register stride
- Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin AOT impls (trivial) - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT and CUDA port as separate work items
|
Claude finished @876pol's task in 6m 31s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously fixed issues (confirmed):
New in latest commit (6761341):
New: riscv-adapters Changes (Latest Commit) — Well DoneThe
Zero Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
2. AOT impls reference non-existent Rv32*Executor types (Compilation error under aot)
8 impl blocks across 4 files use Rv32* struct names that were renamed to Rv64*:
| File | Lines | Has | Should be |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
3. Deferral extension uses 4-byte register arrays instead of 8-byte (Soundness bug)
The deferral extension's AIR columns use RV32_REGISTER_NUM_LIMBS (4) for register value arrays, but DEFAULT_BLOCK_SIZE = 8 — meaning the memory bus expects 8-byte register interactions.
AIR columns (output/air.rs:59-60, call/air.rs:230-231):
pub rd_val: [T; RV32_REGISTER_NUM_LIMBS], // 4 elements, should be 8
pub rs_val: [T; RV32_REGISTER_NUM_LIMBS],Trace records (output/trace.rs:72-73, call/trace.rs:234-235):
pub rd_val: [u8; RV32_REGISTER_NUM_LIMBS], // 4 bytes, should be 8
pub rs_val: [u8; RV32_REGISTER_NUM_LIMBS],Memory bridge reads (output/air.rs:260-276) read local.rd_val (4 elements) from the register address space. The memory system expects 8-byte register reads (DEFAULT_BLOCK_SIZE = 8). This makes the AIR constraints unsound.
Address range checks (output/air.rs:237-244, call/air.rs:306-313) use RV32_CELL_BITS * RV32_REGISTER_NUM_LIMBS (= 32 bits) — this happens to be correct since pointers are 32-bit, but should use RV64_* constants for clarity.
Impact: Memory bus interaction width mismatch will cause proof generation failure or unsound constraints.
4. Deferral transpiler uses wrong register stride
extensions/deferral/transpiler/src/lib.rs:85-86, 95-96:
RV32_REGISTER_NUM_LIMBS * dec_insn.rd, // stride 4, should be 8
RV32_REGISTER_NUM_LIMBS * dec_insn.rs1,Register x5 would be at address 20 instead of the correct 40 for RV64.
5. Algebra circuit + transpiler: wrong register width and stride
Circuit (algebra/circuit/src/modular_chip/is_eq.rs:726):
let mut write_data = [0u8; RV32_REGISTER_NUM_LIMBS]; // 4 bytes, should be 8Writes only 4 bytes to an 8-byte register.
Transpiler (algebra/transpiler/src/lib.rs:102-103, 126, 204-205):
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rd), // stride 4, should be 8Modular chip (algebra/circuit/src/modular_chip/mod.rs:42, 54, 69): Core AIR types parameterized with RV32_REGISTER_NUM_LIMBS (4) instead of RV64_REGISTER_NUM_LIMBS (8).
6. Bigint transpiler uses wrong register stride
extensions/bigint/transpiler/src/lib.rs:146-147:
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rs1), // stride 4, should be 8
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rs2),7. AOT layer assumes 32-bit registers throughout (systemic)
The entire AOT x86 code generation layer uses 4-limb register stride (a / 4), 32-bit XMM helpers, and RV32_REGISTER_NUM_LIMBS const generics. This is a known deferral — CI is disabled via chore(ci): comment out aot workflows.
8. CUDA extension references old Rv32* types
extension/cuda.rs:11-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, etc. — all renamed. Also SHA2 CUDA files (sha2_main.cu:56-74, columns.cuh:27-29) still use RV32_REGISTER_NUM_LIMBS. Known deferral.
Code Quality Issues
9. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {10. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ uses stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard). Should be removed.
11. Serializer Vec::with_capacity over-allocates
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes element count. Over-allocates by 8×.
12. Hardcoded 8 in io/mod.rs instead of named constant
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;Should use WORD_SIZE constant.
13. Keccak256 extension not yet ported
keccak256/circuit/src/keccakf_op/columns.rs:22 and xorin/columns.rs:25-29 use RV32_REGISTER_NUM_LIMBS (4) for pointer limb arrays. Not part of this PR but will need porting before keccak256 works with RV64.
Architecture Observations (Positive)
-
riscv-adapters port is clean and well-structured. The
pad_reg_val/compose_ptr/tracing_read_reg_ptrhelpers provide a clean abstraction for the 4-byte materialized pointer + 8-byte register value pattern. -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6, mask0x3f) and W-suffix validation (shamt < 32). -
Non-AOT execution paths in the riscv extension are solid — consistent use of
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS,vm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension correctly. -
Transpiler utility functions (
from_r_type,from_i_type, etc.) correctly useRV64_REGISTER_NUM_LIMBSfor register address scaling. -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment, guestopenvm_intrinsicscfg is clean. -
ELF loader correctly validates
Class::ELF64and constrains addresses to u32.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #3: Deferral AIR uses 4-byte register arrays | Memory bus mismatch, unsound constraints | Open |
| Critical | #4: Deferral transpiler wrong register stride | Wrong register addresses | Open |
| Critical | #5: Algebra circuit 4-byte writes + wrong transpiler stride | Partial register writes + wrong addresses | Open |
| Critical | #6: Bigint transpiler wrong register stride | Wrong register addresses | Open |
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2: AOT Rv32*Executor → Rv64*Executor (4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| Medium | #7-8: AOT/CUDA systemic issues | Known deferral (CI disabled) | Tracked |
| Low | #9-12: Code quality | Cleanup items | Open |
The core riscv extension (non-AOT, non-CUDA) and the new riscv-adapters are solid. The adapter port in the latest commit is well-designed with correct constant usage and a clean helper API.
The deferral, algebra, and bigint extensions still have critical RV32_REGISTER_NUM_LIMBS → RV64_REGISTER_NUM_LIMBS migration gaps that affect both AIR soundness and transpiler correctness.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSin AIR columns, trace records, range checks, and transpiler - Fix algebra extension — update
is_eq.rsregister write size,modular_chip/mod.rstype parameters, transpiler stride, and test constants - Fix bigint transpiler — update register stride
- Fix
config.rv32i→config.rv64iin 5 locations - Fix
Rv32*Executor→Rv64*Executorin AOT impls - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and keccak256 port as separate work items
|
Claude finished @876pol's task in 6m 6s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commits Previously fixed issues (confirmed):
New in latest commits:
NEW: riscv-adapters Changes — Well DoneThe
NEW: Bigint Extension — Mostly Good, AOT Dead CodeThe bigint transpiler is correctly updated — uses The bigint circuit core types are correctly updated — Bigint Bug: AOT impls reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
These are dead code — the impls implement traits for types that don't exist. Will fail to compile under aot feature.
Bigint CUDA/Test: RV32_CELL_BITS and Rv32*256Opcode references
bigint/circuit/src/cuda/mod.rs:15: ImportsRV32_CELL_BITS(should beRV64_CELL_BITS; same value but wrong name — used in 20+ locations in the file)bigint/circuit/src/tests.rs:655,716,782,847,905,971: UsesRv32*256Opcode::CLASS_OFFSET— should beRv64*256Opcodebigint/circuit/src/tests.rs:629,690,753,819,879,945: UsesRV32_CELL_BITS— should beRV64_CELL_BITSbigint/circuit/src/tests.rs:922,988: UsesRv32VecHeapBranchAdapterExecutor— should beRv64VecHeapBranchAdapterExecutorbigint/circuit/src/extension/cuda.rs:55-101: ReferencesRv32*256Airtypes — should beRv64*256Air
Bugs (Still Open from Previous Reviews)
1. config.rv32i should be config.rv64i (Compilation error under cuda/aot features)
Rv64ImConfig has field rv64i (line 101 of lib.rs), but 5 locations still reference config.rv32i:
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
2. RISCV AOT impls reference non-existent Rv32*Executor types (Compilation error under aot)
8 impl blocks across 4 files:
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
3. Deferral extension uses 4-byte register arrays instead of 8-byte (Soundness bug)
The deferral extension's AIR columns, trace records, and memory bridge interactions all use RV32_REGISTER_NUM_LIMBS (4) for register value arrays. With DEFAULT_BLOCK_SIZE = 8, the memory bus expects 8-byte register interactions. The memory_bridge.read() call passes local.rd_val (4 elements), creating a 4-byte read interaction that doesn't match the system's 8-byte register width.
AIR columns (deferral/circuit/src/output/air.rs:59-60, call/air.rs:230-231):
pub rd_val: [T; RV32_REGISTER_NUM_LIMBS], // 4 elements, should be 8
pub rs_val: [T; RV32_REGISTER_NUM_LIMBS],Trace records (output/trace.rs:72-73, call/trace.rs:234-235):
pub rd_val: [u8; RV32_REGISTER_NUM_LIMBS], // 4 bytes, should be 8
pub rs_val: [u8; RV32_REGISTER_NUM_LIMBS],Memory bridge reads (output/air.rs:260-276): Pass local.rd_val (4 elements) to memory_bridge.read(), creating a 4-byte memory interaction.
Address range checks (output/air.rs:237-244): Use RV32_CELL_BITS * RV32_REGISTER_NUM_LIMBS (= 32 bits) and local.rd_val[RV32_REGISTER_NUM_LIMBS - 1] — functionally correct for 32-bit pointer bounds but semantically wrong.
bytes_to_f (utils.rs:64-68): Uses RV32_CELL_BITS for composition shift — functionally correct (same value) but should reference RV64_CELL_BITS.
Impact: The memory bridge interactions create 4-byte reads for register values, but the system memory chip expects 8-byte register reads (matching DEFAULT_BLOCK_SIZE = 8). This will cause a memory bus argument mismatch during proof generation.
Note: The pointer values are correctly 32-bit (F_NUM_BYTES = 4), so the pointer arithmetic is fine. The issue is that the register read width must be 8 bytes for RV64, with only the lower 4 bytes used as the pointer value and upper 4 bytes constrained to zero.
4. Deferral transpiler uses wrong register stride
deferral/transpiler/src/lib.rs:85-86, 95-96:
RV32_REGISTER_NUM_LIMBS * dec_insn.rd, // stride 4, should be 8
RV32_REGISTER_NUM_LIMBS * dec_insn.rs1,Register x5 would be addressed at offset 20 instead of the correct 40 for RV64.
5. Algebra circuit + transpiler: wrong register width and stride
Execution code (algebra/circuit/src/execution.rs:395,408):
u32::from_le_bytes(exec_state.vm_read(RV32_REGISTER_AS, addr as u32))Reads 4 bytes from an 8-byte register — truncates register value. Since registers hold pointers (which are 32-bit), this produces correct values if upper bytes are zero, but the read width doesn't match the register width.
IsEq execution (algebra/circuit/src/modular_chip/is_eq.rs:726):
let mut write_data = [0u8; RV32_REGISTER_NUM_LIMBS]; // 4 bytes, should be 8Writes only 4 bytes to an 8-byte register.
Transpiler (algebra/transpiler/src/lib.rs:7,102-103,126,204-205):
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rd), // stride 4, should be 86. Deferral config/builder naming
deferral/circuit/src/extension/mod.rs:224-241: Config struct is Rv32DeferralConfig with fields rv32i: Rv64I, rv32m: Rv64M. Builder is Rv32DeferralCpuBuilder. These should be renamed to Rv64Deferral* or at least generic names.
7. AOT layer assumes 32-bit registers throughout (systemic — known deferral)
The AOT x86 code generation layer is fundamentally broken for RV64: register stride /4, XMM helpers, *_W AOT impls all use 32-bit assumptions. CI is disabled (chore(ci): comment out aot workflows).
8. CUDA extension references old Rv32* types (known deferral)
Both riscv/circuit/src/extension/cuda.rs and bigint/circuit/src/extension/cuda.rs reference old Rv32* types.
Code Quality Issues
9. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {10. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ uses stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard). Should be removed.
11. Serializer Vec::with_capacity over-allocates
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements — 8× over-allocation.
Architecture Observations (Positive)
- riscv-adapters port is clean and well-structured. The
pad_reg_val/compose_ptr/tracing_read_reg_ptrhelpers provide a clean abstraction for the 4-byte pointer + 8-byte register value pattern. - Bigint transpiler correctly uses
RV64_REGISTER_NUM_LIMBSfor register stride. - Bigint circuit core types correctly use
RV64_CELL_BITSandRv64VecHeapAdapter*. - Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation. - Non-AOT execution paths in the riscv extension are solid — consistent use of
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS,vm_read::<u8, 8>. *_Wadapter/core pattern is well-designed with proper sign extension via(result as i32 as i64 as u64).- Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment, guestopenvm_intrinsicscfg is clean.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #3: Deferral AIR uses 4-byte register arrays | Memory bus mismatch, unsound constraints | Open |
| Critical | #4: Deferral transpiler wrong register stride | Wrong register addresses | Open |
| Critical | #5: Algebra circuit 4-byte writes + wrong transpiler stride | Partial register writes + wrong addresses | Open |
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2: RISCV AOT Rv32*Executor → Rv64*Executor (4 files) |
Dead code + compilation failure | Trivial rename |
| High | NEW: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files) |
Dead code + compilation failure | Trivial rename |
| Medium | #6: Deferral config naming Rv32Deferral* |
Misleading naming | Cosmetic |
| Medium | #7-8: AOT/CUDA systemic issues | Known deferral (CI disabled) | Tracked |
| Low | #9-11: Code quality | Cleanup items | Open |
The core riscv extension (non-AOT, non-CUDA) and the new riscv-adapters are solid. The bigint transpiler and core circuit types are correctly updated.
The deferral and algebra extensions still have critical RV32_REGISTER_NUM_LIMBS → RV64_REGISTER_NUM_LIMBS migration gaps that affect both AIR soundness and transpiler correctness. These are the same issues from the previous review — they have not been addressed yet.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSin AIR columns, trace records, memory bridge interactions, and transpiler stride (~20 locations across 5 files) - Fix algebra extension — update
is_eq.rsregister write size,execution.rsregister read size, transpiler stride (~15 locations across 3 files) - Fix
config.rv32i→config.rv64iin riscv circuit (5 locations) - Fix
Rv32*Executor→Rv64*Executorin riscv AOT impls (4 files, 8 occurrences) - Fix
Rv32*256Executor→Rv64*256Executorin bigint AOT impls (6 files, 12 occurrences) - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and keccak256 port as separate work items
|
Claude finished @876pol's task in 7m 12s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commits. The PR is at 23 commits, 100 changed files, +16066/−7176 lines. Three new commits since the last review:
Previously fixed issues (confirmed):
New Commit Assessment
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
2. AOT impls reference non-existent Rv32*Executor types in riscv circuit (Compilation error under aot)
8 impl blocks across 4 files use Rv32* struct names that were renamed to Rv64*:
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
3. AOT load_sign_extend impl uses wrong const generic — dead code
load_sign_extend/aot.rs:20-21, 40-41 implements for LoadSignExtendExecutor<A, { RV32_REGISTER_NUM_LIMBS }, LIMB_BITS> (4 limbs) but the actual alias uses RV64_REGISTER_NUM_LIMBS (8). Dead code that will fail to compile under aot.
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but MulOpcode is not imported.
5. Deferral extension uses 4-byte register arrays instead of 8-byte (Soundness bug)
AIR columns and trace records in output/air.rs:59-60 and call/air.rs:230-231 use RV32_REGISTER_NUM_LIMBS (4) for register value arrays, but DEFAULT_BLOCK_SIZE = 8. The memory bridge reads/writes only 4 bytes from each register, creating an AIR constraint mismatch.
6. Deferral transpiler uses wrong register stride
deferral/transpiler/src/lib.rs:85-86, 95-96 computes register addresses as RV32_REGISTER_NUM_LIMBS * dec_insn.rd (stride 4 instead of 8). Register x5 would be at address 20 instead of the correct 40.
7. NEW: Bigint circuit AOT impls reference non-existent Rv32* types (Compilation error under aot)
12 impl blocks across 6 files implement traits for Rv32*256Executor types that don't exist (renamed to Rv64*256Executor):
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
8. NEW: Bigint CUDA tests reference non-existent types (Compilation error under cuda)
bigint/circuit/src/tests.rs (behind #[cfg(feature = "cuda")]):
- Lines 655, 716, 782, 847, 905, 971: Reference
Rv32*256Opcode::CLASS_OFFSET— should beRv64*256Opcode::CLASS_OFFSET - Lines 922, 988: Reference
Rv32VecHeapBranchAdapterExecutor— should beRv64VecHeapBranchAdapterExecutor
Also, bigint/circuit/src/cuda/mod.rs imports from the old openvm_rv32_adapters package which no longer exists, and uses Rv32VecHeapAdapterCols/Rv32VecHeapAdapterRecord types.
9. Algebra circuit + transpiler: wrong register width and stride
algebra/circuit/src/modular_chip/is_eq.rs:726:[0u8; RV32_REGISTER_NUM_LIMBS]— writes only 4 bytes to an 8-byte registeralgebra/transpiler/src/lib.rs:102-103, 126, 204-205:RV32_REGISTER_NUM_LIMBS * dec_insn.rd— wrong stride
10. CUDA extension references old Rv32* types (systemic)
riscv/circuit/src/extension/cuda.rs, bigint/circuit/src/extension/cuda.rs, deferral/circuit/src/extension/cuda.rs, and CUDA .cuh files all reference Rv32* types. This is a known deferral — CI is disabled for CUDA.
11. AOT layer assumes 32-bit registers throughout (systemic)
Register stride /4, XMM helpers, *_W AOT impls all use 32-bit assumptions. Known deferral — CI is disabled for AOT.
Code Quality Issues
12. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:302:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {13. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes but with_capacity takes element count. Should be / 8.
14. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard). Should be removed.
15. Hardcoded 8 instead of WORD_SIZE in io/mod.rs:60-61
let num_words = len.div_ceil(8);
let capacity = num_words * 8;Should use the WORD_SIZE constant for consistency.
Architecture Observations (Positive)
- riscv-adapters port is clean and well-structured. The
pad_reg_val/compose_ptr/tracing_read_reg_ptrhelpers provide a clean abstraction for the 4-byte pointer + 8-byte register pattern. - Memory address column optimization (commit 7b24bc6) is sound — zero-padding is enforced structurally at the memory bus level, not via explicit AIR assertions.
- Bigint transpiler is properly ported — all register strides use
RV64_REGISTER_NUM_LIMBS, opcode types useRv64*. - Non-AOT, non-CUDA execution paths across all extensions are solid — consistent
RV64_*constant usage. *_Wadapter/core pattern remains well-designed with proper sign extension.- Decoder crate correctly handles RV64 shift amounts and W-suffix validation.
- Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #5: Deferral AIR uses 4-byte register arrays | Soundness: memory bus mismatch | Open |
| Critical | #6: Deferral transpiler wrong register stride | Wrong register addresses | Open |
| Critical | #9: Algebra circuit 4-byte writes + wrong transpiler stride | Partial register writes + wrong addresses | Open |
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #2: Riscv AOT Rv32*Executor → Rv64*Executor (4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #7: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #8: Bigint CUDA tests reference Rv32* types |
Compilation failure under cuda |
Trivial rename |
| Medium | #3-4: load_sign_extend AOT + missing MulOpcode import | Compilation failure under aot |
Part of AOT port |
| Medium | #10-11: CUDA + AOT systemic issues | Known deferral (CI disabled) | Tracked |
| Low | #12: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #13: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #14: Inner guest/guest/ directory |
Confusing leftover | Cleanup |
| Low | #15: Hardcoded 8 vs WORD_SIZE |
Style | Minor |
The core RV64 architecture (non-AOT, non-CUDA) is solid. The latest riscv-adapters commit is well-structured, the bigint transpiler is correctly ported, and the memory column optimization is sound. The main remaining risks are:
- Deferral and algebra extensions still have critical
RV32_REGISTER_NUM_LIMBS→RV64_REGISTER_NUM_LIMBSmigration gaps affecting AIR soundness and transpiler correctness - AOT/CUDA paths are systematically broken (acknowledged via CI disable)
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSin AIR columns, trace records, and transpiler (~20 locations) - Fix algebra extension — update
is_eq.rsregister write size + transpiler stride - Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files) and bigint AOT (6 files) impls - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and keccak256 port as separate work items
|
Claude finished @Tuanlinh12312's task in 6m 55s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commits: Previously fixed issues (confirmed):
Bugs (Still Open)1.
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
All behind #[cfg(feature = "cuda")] or #[cfg(feature = "aot")] — blocks compilation with those features.
2. Deferral extension: register arrays and transpiler still use RV32 (Soundness bug)
The deferral extension has not been ported to RV64 — every register-related constant is still RV32_REGISTER_NUM_LIMBS (4 bytes instead of 8):
Register arrays (should be RV64_REGISTER_NUM_LIMBS = 8):
output/air.rs:59-60—rd_val: [T; RV32_REGISTER_NUM_LIMBS],rs_val: [T; RV32_REGISTER_NUM_LIMBS]output/trace.rs:72-73— same patterncall/air.rs:230-231— same patterncall/trace.rs:234-235— same pattern
Range checks (wrong array index, wrong bit width):
output/air.rs:237-251— accesses[RV32_REGISTER_NUM_LIMBS - 1]= index 3 (should be 7), multiplies by 4 (should be 8)output/trace.rs:337-346— samecall/air.rs:306-345— samecall/trace.rs:177-180, 407-412— same
Transpiler stride (should be ×8 not ×4):
transpiler/src/lib.rs:85-86, 95-96—RV32_REGISTER_NUM_LIMBS * dec_insn.rd→ register x5 maps to address 20 instead of correct 40
Impact: Memory bus interaction width mismatch with 8-byte registers. AIR constraints unsound. Transpiler generates wrong register addresses.
3. Algebra extension: wrong register write size, type parameters, and transpiler stride (Soundness bug)
Write size (modular_chip/is_eq.rs:726):
let mut write_data = [0u8; RV32_REGISTER_NUM_LIMBS]; // 4 bytes, should be 8Writes only 4 bytes to an 8-byte register — partial register write.
Type parameters (modular_chip/mod.rs:42, 54, 69):
ModularIsEqualCoreAir parameterized with RV32_REGISTER_NUM_LIMBS (4) as WRITE_LIMBS — should be RV64_REGISTER_NUM_LIMBS (8). Creates structural mismatch between circuit expectations and executor.
Transpiler stride (transpiler/src/lib.rs:102-103, 126, 204-205):
F::from_usize(RV32_REGISTER_NUM_LIMBS * dec_insn.rd), // stride 4, should be 8Register x1 maps to address 4 instead of correct 8.
4. Bigint AOT impls reference non-existent Rv32* executor types (Compilation error under aot)
12 AotExecutor/AotMeteredExecutor impl blocks across 6 files use old Rv32* struct names:
| File | Lines | Has | Should be |
|---|---|---|---|
base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
5. Bigint test files reference non-existent Rv32VecHeapBranchAdapterExecutor
extensions/bigint/circuit/src/tests.rs:922, 988 — Uses Rv32VecHeapBranchAdapterExecutor but the type was renamed to Rv64VecHeapBranchAdapterExecutor in the riscv-adapters crate. These are CUDA tests and will fail to compile under cuda feature.
6. Riscv circuit AOT impls use wrong const generic RV32_REGISTER_NUM_LIMBS (Dead code)
AOT AotExecutor/AotMeteredExecutor impls for non-W base chips implement for { RV32_REGISTER_NUM_LIMBS } (4 limbs) but the actual executor types use RV64_REGISTER_NUM_LIMBS (8 limbs) — making these impls dead code:
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
21, 41 |
base_alu_w/execution.rs |
166, 234 |
7. CUDA extension references old Rv32* types (known deferral)
extension/cuda.rs imports Rv32BaseAluAir, Rv32LoadStoreAir, etc. — all renamed to Rv64*. Bigint CUDA files (cuda/mod.rs, cuda/src/bigint.cu) extensively reference Rv32VecHeapAdapter* types. Keccak256 CUDA files use RV32_REGISTER_NUM_LIMBS for register arrays and width calculations. All CUDA code is broken for RV64 — acknowledged via chore(ci): comment out aot workflows.
Code Quality Issues
8. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:320-321:
// TEMP[jpw]
pub fn tmp_convert_to_u8s...9. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes but with_capacity takes element count. Should be / 8.
10. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". Inner directory uses stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard). Should be removed.
Architecture Observations (Positive)
-
Keccak256 RV64 port (latest commit) is solid — Rust code correctly uses
RV64_WORD_NUM_LIMBSfor pointer columns,expand_to_rv64_register()for zero-padded 8-byte bus interactions, andrv64_register_to_u32()for pointer extraction. Well-designed. -
Bigint RV64 port (latest commit) is correct in non-AOT paths — all execution code properly uses
RV64_REGISTER_NUM_LIMBS,RV64_REGISTER_AS,RV64_MEMORY_AS. Transpiler correctly usesRV64_REGISTER_NUM_LIMBSfor stride. Only AOT impls have stale naming. -
riscv-adapters port is clean and well-structured —
pad_reg_val,compose_ptr,tracing_read_reg_ptrhelpers correctly handle the 4-byte materialized pointer + 8-byte register pattern. ZeroRV32_*references in Rust code. -
Reduce-memory-address-columns changes are correct — optimizes column usage by materializing only significant limbs (
RV64_WORD_NUM_LIMBS = 4for pointers) while zero-padding to full register width in bus interactions. Applied cleanly to hintstore, loadstore, and SHA2 chips. -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6, mask0x3f) and W-suffix validation (shamt < 32). -
Non-AOT, non-CUDA execution paths across the riscv extension remain solid — consistent
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS,vm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed with proper sign extension via(result as i32 as i64 as u64).
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #2: Deferral extension fully unported | Unsound AIR constraints, wrong transpiler addresses | Open |
| Critical | #3: Algebra extension wrong register width + stride | Partial register writes, wrong addresses | Open |
| High | #1: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #4: Bigint AOT Rv32* → Rv64* (12 impls, 6 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #5: Bigint tests Rv32VecHeapBranch* → Rv64* |
Compilation failure under cuda |
Trivial rename |
| High | #6: Riscv AOT impls wrong const generic | Dead code under aot |
Part of AOT port |
| Medium | #7: CUDA extension not ported (all extensions) | GPU proving broken | Known deferral |
| Low | #8: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #9: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #10: Inner guest/guest/ directory |
Confusing leftover | Cleanup |
The core riscv extension, riscv-adapters, bigint (non-AOT), and keccak256 (Rust) are solid. The latest commits demonstrate a correct and consistent RV64 migration pattern with proper pointer materialization (4-byte columns) and zero-padded register bus interactions (8 bytes).
The deferral and algebra extensions are the primary remaining risks — they have critical RV32_REGISTER_NUM_LIMBS usage that produces wrong register array sizes, wrong transpiler addresses, and unsound AIR constraints.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSacross 4 circuit files and transpiler (~30 occurrences) - Fix algebra extension — update
is_eq.rswrite size,mod.rstype parameters, and transpiler stride - Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix bigint AOT
Rv32*→Rv64*in 6 files (trivial rename) - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and keccak256 CUDA port as separate work items
|
Claude finished @Tuanlinh12312's task in 7m 8s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest 5 commits: Previously fixed issues (confirmed):
New commits assessment:
Bugs (Still Open)1. Deferral extension uses 4-byte register arrays instead of 8-byte (Soundness bug)The deferral extension's AIR columns, trace records, and memory bridge interactions all use AIR columns ( pub rd_val: [T; RV32_REGISTER_NUM_LIMBS], // 4 elements, should be 8
pub rs_val: [T; RV32_REGISTER_NUM_LIMBS],Trace records ( pub rd_val: [u8; RV32_REGISTER_NUM_LIMBS], // 4 bytes, should be 8
pub rs_val: [u8; RV32_REGISTER_NUM_LIMBS],Memory bridge reads ( Address range checks ( Impact: Memory bus interaction width mismatch → unsound AIR constraints. Files affected: 2. Deferral transpiler uses wrong register stride (Address bug)
RV32_REGISTER_NUM_LIMBS * dec_insn.rd, // stride 4, should be 8
RV32_REGISTER_NUM_LIMBS * dec_insn.rs1,Register x5 at offset 20 instead of the correct 40 for RV64. 3. Deferral config field naming mismatch
pub rv32i: Rv64I, // Field named "rv32i" but type is Rv64I
pub rv32m: Rv64M, // Field named "rv32m" but type is Rv64MThis is cosmetic (functionally works) but will cause confusion and creates a landmine for anyone adding feature-gated code that checks field names. 4.
|
| File | Line | Feature gate |
|---|---|---|
extensions/riscv/circuit/src/lib.rs |
297 | cuda |
extensions/riscv/circuit/src/mul/tests.rs |
289 | cuda |
extensions/riscv/circuit/src/mulh/tests.rs |
530 | aot |
extensions/riscv/circuit/src/jalr/tests.rs |
586 | aot |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 | aot |
5. AOT impls reference non-existent Rv32*Executor types (riscv extension)
8 impl blocks across 4 files implement traits for struct names that no longer exist:
| File | Lines | Has | Should be |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
Dead code under aot feature — will fail to compile when AOT is enabled.
6. AOT impls in riscv extension use wrong const generics — dead code
Non-W chips (base_alu, less_than, divrem) define Rv64*Executor with RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4):
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
20, 41 |
These impls implement traits for types that never get instantiated in RV64.
7. Bigint AOT impls reference non-existent Rv32* types
Same pattern as riscv extension — actual types renamed to Rv64* but AOT impls still use Rv32*:
| File | Lines | Has | Should be |
|---|---|---|---|
base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
8. Bigint CUDA test code references old types
extensions/bigint/circuit/src/tests.rs — behind #[cfg(feature = "cuda")]:
- Lines 655, 716, 782, 847, 905, 971: Reference
Rv32*Opcode::CLASS_OFFSET(renamed toRv64*) - Lines 922, 988: Reference
Rv32VecHeapBranchAdapterExecutor(renamed toRv64*)
9. CUDA extensions not ported (systemic — known deferral)
All CUDA code across riscv, keccak256, sha2, bigint, and deferral extensions still references old Rv32* types and RV32_REGISTER_NUM_LIMBS. CI workflows are disabled for these. Tracked as separate work items.
10. AOT layer assumes 32-bit registers throughout (systemic — known deferral)
The entire AOT x86 code generation layer uses 4-limb register stride (a / 4), 32-bit XMM helpers, and RV32_REGISTER_NUM_LIMBS const generics. CI workflows are disabled. Needs fundamental redesign for 64-bit register values.
Code Quality Issues
11. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:320-321:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {12. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. Should be / 8 or use Vec::new().
13. Hardcoded 8 instead of WORD_SIZE in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;WORD_SIZE is defined as 8 in platform/src/lib.rs:23. Should use the constant.
14. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code:
- Uses
hint_buffer_u32!(RV32) vs the outer'shint_buffer_u64!(RV64) - Uses
cfg(target_os = "zkvm")(old) vscfg(openvm_intrinsics)(new) - Uses
RV32_ALU_OPCODEvsRV64_ALU_OPCODE
15. Bigint pointer truncation lacks explicit safety assertions
In extensions/bigint/circuit/src/base_alu.rs (and other bigint execution files), pointer values are extracted as:
u64::from_le_bytes(rs1_ptr) as u32 // Silent truncationUnlike the algebra extension which uses debug_assert!(val <= u32::MAX as u64), the bigint code silently truncates. While the VM invariants likely guarantee this is safe, adding explicit debug assertions would be consistent with the algebra extension's approach.
Architecture Observations (Positive)
-
riscv-adapters port is exemplary — zero
RV32_*references, clean helper API (pad_reg_val,compose_ptr,tracing_read_reg_ptr), proper 4-byte pointer storage with 8-byte padding in constraints. -
Algebra extension (feat: rv64 algebra #2727) is clean — all
RV32_REGISTER_NUM_LIMBSproperly migrated toRV64_REGISTER_NUM_LIMBS, explicit pointer validation withdebug_assert!. -
Keccak256 extension (feat: rv64 keccak256 #2724) is well-migrated —
rv64_register_to_u32helper correctly reads 8-byte registers and validates upper bytes are zero. -
Reduce memory columns (fix: rv64 reduce memory address columns #2733) is a good optimization — stores only the necessary 4 pointer bytes in columns, pads to 8 in AIR constraints, saving trace width.
-
Non-AOT execution paths in the riscv extension are solid — consistent use of
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS, andvm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension correctly. -
Decoder crate properly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg flag is cleanly integrated.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #1: Deferral AIR/trace uses 4-byte register arrays | Memory bus mismatch, unsound constraints | Open |
| Critical | #2: Deferral transpiler wrong register stride | Wrong register addresses | Open |
| High | #4: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #5: AOT Rv32*Executor → Rv64*Executor (riscv, 4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #6: AOT impls wrong const generics (riscv, 4 files) | Dead code under aot |
Part of AOT port |
| High | #7: Bigint AOT Rv32* → Rv64* (6 files) |
Dead code + compilation failure under aot |
Trivial rename |
| Medium | #3: Deferral config rv32i/rv32m field naming |
Misleading, functionally correct | Cosmetic fix |
| Medium | #8: Bigint CUDA test old types | Compilation failure under cuda |
Part of CUDA port |
| Medium | #9-10: AOT/CUDA systemic | Known deferral (CI disabled) | Tracked |
| Low | #11: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #12: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #13: Hardcoded 8 vs WORD_SIZE |
Style | Minor |
| Low | #14: Duplicate guest/guest/ directory |
Confusing leftover | Cleanup |
| Low | #15: Bigint silent pointer truncation | Missing safety assertion | Consistency |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid. The latest commits (algebra, keccak256, riscv-adapters, reduce memory columns) are well-structured with proper RV64 constant usage. The adapter pattern in riscv-adapters is particularly clean.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSin AIR columns, trace records, range checks, and transpiler (~30 locations across 5 files) - Fix
config.rv32i→config.rv64iin 5 riscv locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin AOT impls across riscv (4 files) and bigint (6 files) — trivial renames - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and deferral CUDA port as separate work items
|
Claude finished @Tuanlinh12312's task in 6m 0s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewReviewing after the latest commits including Previously fixed issues (confirmed):
Bugs (Still Open)1. Deferral extension uses 4-byte register arrays instead of 8-byte (Soundness bug — Critical)The deferral extension's AIR columns, trace records, and memory bridge interactions all use AIR columns — arrays sized at 4 instead of 8:
Trace records — arrays sized at 4 instead of 8:
Memory bridge reads — Address range checks — Note: The pointer values are correctly 32-bit. The approach used in 2. Deferral transpiler uses wrong register stride (Critical)
RV32_REGISTER_NUM_LIMBS * dec_insn.rd, // stride 4, should be 8
RV32_REGISTER_NUM_LIMBS * dec_insn.rs1,Register x5 would be addressed at offset 20 instead of the correct 40 for RV64. 3.
|
| File | Line |
|---|---|
riscv/circuit/src/lib.rs |
297 |
riscv/circuit/src/mul/tests.rs |
289 |
riscv/circuit/src/mulh/tests.rs |
530 |
riscv/circuit/src/jalr/tests.rs |
586 |
riscv/circuit/src/mul_w/tests.rs |
416 |
4. RISCV AOT impls reference non-existent Rv32*Executor types (Compilation error under aot)
8 impl blocks across 4 files use Rv32* struct names that were renamed to Rv64*:
| File | Lines | Has | Should be |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
5. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
6. Bigint CUDA references old Rv32*Air types (Compilation error under cuda)
extensions/bigint/circuit/src/extension/cuda.rs:55-101 references Rv32BaseAlu256Air, Rv32LessThan256Air, Rv32BranchEqual256Air, Rv32BranchLessThan256Air, Rv32Multiplication256Air, Rv32Shift256Air — all renamed to Rv64*.
7. RISCV CUDA extension references old Rv32* types (Compilation error under cuda)
extensions/riscv/circuit/src/extension/cuda.rs:12-18 imports Rv32BaseAluAir, Rv32LoadStoreAir, Rv32JalLuiAir, Rv32HintStoreAir, etc. — all renamed to Rv64*. Also missing: new *_W chip registrations for GPU backend.
8. AOT layer assumes 32-bit registers throughout (Systemic — known deferral)
The AOT x86 code generation layer is fundamentally broken for RV64: register stride a / 4, XMM helpers (gpr_to_rv32_register, xmm_to_gpr), and *_W AOT impls all use 32-bit assumptions. The CI disabling (chore(ci): comment out aot workflows) acknowledges this.
Code Quality Issues
9. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:320-321:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {Leftover temp function with debug marker — should be cleaned up or properly named.
10. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard, RV32_ALU_OPCODE). Should be removed.
11. Serializer Vec::with_capacity over-allocates 8x
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. A 64-byte value allocates capacity for 64 u64 words (512 bytes) instead of 8 (64 bytes). Functionally correct but 8x memory waste.
12. Deferral config naming (Rv32Deferral* → should be Rv64Deferral*)
extensions/deferral/circuit/src/extension/mod.rs:224-250:
Rv32DeferralConfigstruct with fieldsrv32i: Rv64I,rv32m: Rv64MRv32DeferralCpuBuilder,Rv32DeferralGpuBuilder
These names are misleading — they contain RV64 types but carry the RV32 prefix.
Architecture Observations (Positive)
-
Algebra extension is fully and correctly ported — all register operations use
RV64_REGISTER_NUM_LIMBS,RV64_REGISTER_AS, andu64::from_le_bytes(). Theis_eq.rswrite data correctly uses 8-byte arrays. Zero remainingRV32_REGISTER_NUM_LIMBSreferences. -
ECC extension is clean — no
RV32_references found. -
riscv-adapters port is solid. The
pad_reg_val/compose_ptr/tracing_read_reg_ptrhelpers inhelpers.rsprovide a clean abstraction for the 4-byte pointer + 8-byte register pattern. ZeroRV32_*references. -
Keccak256 non-CUDA code is correctly updated — uses
RV64_WORD_NUM_LIMBSthroughout. -
Bigint transpiler is correctly ported — uses
RV64_REGISTER_NUM_LIMBSfor register stride. -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6, mask0x3f) and W-suffix validation (shamt < 32). -
Non-AOT, non-CUDA execution paths across riscv, algebra, ECC, keccak256, sha2, and bigint extensions are solid — consistent
RV64_*constant usage. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg flag replaces oldtarget_os = "zkvm", ELF64 validation,riscv64im-unknown-none-elftarget. -
Memory address column optimization (commit
7b24bc6) is sound — zero-padding enforced structurally at the memory bus level.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #1: Deferral AIR uses 4-byte register arrays (8 array defs, ~20 index/range refs) | Memory bus mismatch, unsound constraints | Open |
| Critical | #2: Deferral transpiler wrong register stride (4 locations) | Wrong register addresses | Open |
| High | #3: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #4: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #5: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #6-7: Bigint + Riscv CUDA extensions reference old Rv32* types |
Compilation failure under cuda |
Known deferral |
| Medium | #8: AOT layer assumes 32-bit registers (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
| Low | #9: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #10: Inner guest/guest/ directory |
Confusing stale RV32 leftover | Cleanup |
| Low | #11: Serializer 8x over-allocation | Wasted memory | Minor |
| Low | #12: Deferral config naming Rv32Deferral* |
Misleading | Cosmetic |
The core RV64 architecture (non-AOT, non-CUDA paths) is solid across all ported extensions. The algebra, ECC, keccak256, riscv-adapters, and toolchain changes are well-structured with correct constant usage. The *_W chip design with split-width adapters is clean.
The deferral extension remains the primary critical risk — it still has RV32_REGISTER_NUM_LIMBS usage that causes AIR soundness issues and wrong transpiler register addresses.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSin AIR columns, trace records, memory bridge interactions, range checks, and transpiler stride (~30 locations across 5 files). Follow theriscv-adapters/helpers.rspattern for pointer handling. - Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files) and bigint AOT (6 files) impls (trivial renames) - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and deferral config naming as separate work items
|
Claude finished @Tuanlinh12312's task in 3m 10s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commits Previously fixed issues (confirmed):
NEW: ECC, Pairing, Algebra ChangesAlgebra Extension — ✅ Well DoneThe
ECC Extension — ✅ Well DoneThe
Pairing Extension —
|
| File | Line |
|---|---|
extensions/riscv/circuit/src/lib.rs |
297 |
extensions/riscv/circuit/src/mul/tests.rs |
289 |
extensions/riscv/circuit/src/mulh/tests.rs |
530 |
extensions/riscv/circuit/src/jalr/tests.rs |
586 |
extensions/riscv/circuit/src/mul_w/tests.rs |
416 |
6. RISCV AOT impls reference non-existent Rv32*Executor types (Compilation error under aot)
8 impl blocks across 4 files implement traits for Rv32* struct names that were renamed to Rv64*:
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
7. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
8. AOT layer assumes 32-bit registers throughout (systemic — known deferral)
The entire AOT x86 code generation layer uses 4-limb register stride (a / 4), 32-bit XMM helpers, and RV32_REGISTER_NUM_LIMBS const generics. CI is disabled via chore(ci): comment out aot workflows.
9. CUDA extensions not ported (systemic — known deferral)
riscv/circuit/src/extension/cuda.rs, bigint/circuit/src/extension/cuda.rs, deferral/circuit/src/extension/cuda.rs, pairing/circuit/src/cuda.rs all reference old Rv32* types.
Code Quality Issues
10. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:320-321:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {11. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ uses stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard, RV32_ALU_OPCODE). Should be removed.
12. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes element count. Should be / 8.
13. Deferral config/builder naming
Rv32DeferralConfig, Rv32DeferralCpuBuilder, Rv32DeferralGpuBuilder in deferral/circuit/src/extension/mod.rs:224-250 — these wrap Rv64I, Rv64M, Rv64Io types but carry the Rv32 prefix.
Architecture Observations (Positive)
-
Algebra, ECC, and pairing extension ports (latest 3 commits) are well-structured. The algebra extension correctly uses
RV64_REGISTER_NUM_LIMBSthroughout its circuit, transpiler, and execution code. ECC transpiler is clean. -
riscv-adapters port is clean with good helper abstractions:
pad_reg_val,compose_ptr,tracing_read_reg_ptrcorrectly handle the 4-byte materialized pointer + 8-byte register pattern. ZeroRV32_*references. -
Memory address column optimization (commit 7b24bc6) is sound — materializes only significant limbs (
RV64_WORD_NUM_LIMBS = 4for pointers) while zero-padding to full register width in bus interactions. -
Keccak256 and SHA2 Rust code is clean — no remaining
RV32_*references in non-CUDA files. -
Non-AOT, non-CUDA execution paths across the riscv extension remain solid — consistent use of
RV64_REGISTER_AS,RV64_REGISTER_NUM_LIMBS,vm_read::<u8, 8>. -
*_Wadapter/core pattern is well-designed with proper sign extension via(result as i32 as i64 as u64). -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment, guestopenvm_intrinsicscfg is clean.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #3: Deferral AIR uses 4-byte register arrays | Soundness: memory bus mismatch | Open |
| Critical | #4: Deferral transpiler wrong register stride | Wrong register addresses | Open |
| High | #1: Pairing transpiler wrong register stride | Wrong register addresses | New |
| High | #5: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Open |
| High | #6: Riscv AOT Rv32*Executor → Rv64*Executor (4 files) |
Dead code + compilation failure under aot |
Open |
| High | #7: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files) |
Dead code + compilation failure under aot |
Open |
| Medium | #2: Pairing CUDA builder Rv32* types |
Compilation failure under cuda |
New |
| Medium | #8-9: AOT/CUDA systemic issues | Known deferral (CI disabled) | Tracked |
| Low | #10: TEMP[jpw] marker |
Cleanup | Open |
| Low | #11: Inner guest/guest/ directory |
Confusing stale RV32 leftover | Open |
| Low | #12: Serializer 8× over-allocation | Wasted memory | Open |
| Low | #13: Deferral config naming Rv32Deferral* |
Misleading naming | Open |
The core RV64 architecture (non-AOT, non-CUDA) is solid across the riscv, keccak256, sha2, algebra, ECC, and bigint extensions. The latest commits (algebra, ECC, pairing) are well-structured and demonstrate correct RV64_* constant usage.
The deferral extension remains the primary risk — it has critical RV32_REGISTER_NUM_LIMBS usage that produces wrong register array sizes, wrong execution reads, wrong transpiler addresses, and unsound AIR constraints. The pairing transpiler also has a wrong register stride.
Recommended before merge:
- Fix deferral extension — update all
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSacross circuit AIR, trace, execution, and transpiler (~40 occurrences) - Fix pairing transpiler — change
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBS(3 occurrences) - Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files) and bigint AOT (6 files) impls - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA, and deferral rename as separate work items
|
Claude finished @Tuanlinh12312's task in 6m 41s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commits Previously critical bugs now FIXED:
Bugs (Still Open)1. Pairing transpiler uses
|
| File | Line | Feature gate |
|---|---|---|
riscv/circuit/src/lib.rs |
297 | cuda |
riscv/circuit/src/mul/tests.rs |
289 | cuda |
riscv/circuit/src/mulh/tests.rs |
530 | aot |
riscv/circuit/src/jalr/tests.rs |
586 | aot |
riscv/circuit/src/mul_w/tests.rs |
416 | aot |
3. RISCV AOT impls reference non-existent Rv32*Executor types (Compilation error under aot)
8 impl blocks across 4 files implement traits for struct names that were renamed to Rv64*:
| File | Lines | Has | Should be |
|---|---|---|---|
jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
jalr/execution.rs |
97, 185 | Rv32JalrExecutor |
Rv64JalrExecutor |
hintstore/execution.rs |
111, 157 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
4. RISCV AOT impls for non-W chips use wrong const generic — dead code
Non-W chips define Rv64*Executor with RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4) — making them dead code that never gets dispatched:
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
20, 41 |
Note: The *_W chips' AOT impls also use RV32_REGISTER_NUM_LIMBS (4) but this happens to equal RV64_WORD_NUM_LIMBS (4) — functionally correct but semantically misleading. Should use RV64_WORD_NUM_LIMBS instead.
5. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
6. CUDA extensions not ported (Known deferral — CI disabled)
All CUDA code across riscv, bigint, deferral, pairing, keccak256, and sha2 extensions still references old Rv32* types. Specific examples:
riscv/circuit/src/extension/cuda.rs— importsRv32BaseAluAir,Rv32LoadStoreAir, etc.bigint/circuit/src/extension/cuda.rs— referencesRv32*256Airtypesdeferral/circuit/src/output/cuda.rsandcall/cuda.rs— usesRV32_CELL_BITSinstead ofRV64_CELL_BITSpairing/circuit/src/cuda.rs— usesRv32PairingGpuBuilder,Rv32PairingConfig,Rv32ModularBuilderpairing/circuit/src/lib.rs:21—pub use cuda::Rv32PairingGpuBuilder as Rv64PairingBuilder(aliasing wrong name)
This is acknowledged via chore(ci): comment out aot workflows.
7. AOT layer assumes 32-bit registers throughout (Known deferral — CI disabled)
Register stride a / 4, XMM helpers (gpr_to_rv32_register, xmm_to_gpr), and AOT const generics all assume 32-bit register layout. Needs fundamental redesign for 64-bit register values.
Code Quality Issues
8. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:320-321:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {9. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. Should be / 8.
10. Duplicate openvm-riscv-guest directory
Both extensions/riscv/guest/Cargo.toml and extensions/riscv/guest/guest/Cargo.toml declare package.name = "openvm-riscv-guest". The inner guest/guest/ is stale RV32 code (hint_buffer_u32!, old cfg(target_os = "zkvm") guard, RV32_ALU_OPCODE). Should be removed.
11. Hardcoded 8 instead of WORD_SIZE in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;Should use the WORD_SIZE constant defined in platform/src/lib.rs:23.
Architecture Observations (Positive)
-
Deferral extension port (latest commit 01ac821) is well-done. Follows the established pattern:
RV64_WORD_NUM_LIMBSfor materialized pointer columns,expand_to_rv64_register()for 8-byte padding in memory bus interactions. Transpiler correctly usesRV64_REGISTER_NUM_LIMBSfor register stride. -
Pairing extension port (b7308ae) is clean in non-CUDA code. Config correctly uses
Rv64PairingConfigwith proper field names. -
riscv-adapters remain exemplary — zero
RV32_*references, cleanpad_reg_val/compose_ptr/tracing_read_reg_ptrhelpers. -
Non-AOT, non-CUDA execution paths across all extensions (riscv, algebra, ECC, keccak256, sha2, bigint, deferral) are solid — consistent
RV64_*constant usage. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension correctly. -
Decoder crate properly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg flag cleanly replaces oldtarget_os = "zkvm".
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| Critical | #1: Pairing transpiler wrong register stride | Wrong phantom register addresses | NEW — trivial fix |
| High | #2: config.rv32i → config.rv64i (5 locations) |
Compilation failure under cuda/aot |
Trivial fix |
| High | #3: Riscv AOT Rv32*Executor → Rv64*Executor (4 files) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #4: Riscv non-W AOT impls wrong const generic (4 files) | Dead code under aot |
Part of AOT port |
| High | #5: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files) |
Dead code + compilation failure under aot |
Trivial rename |
| Medium | #6: CUDA extensions not ported (all extensions) | GPU proving broken | Known deferral (CI disabled) |
| Medium | #7: AOT layer assumes 32-bit registers (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
| Low | #8: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #9: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #10: Inner guest/guest/ directory |
Confusing stale RV32 leftover | Cleanup |
| Low | #11: Hardcoded 8 vs WORD_SIZE |
Style inconsistency | Minor |
Progress Since Last Review
The deferral extension — previously the most critical open issue — has been fully and correctly ported to RV64. The new commit follows the established riscv-adapters pattern and eliminates all non-CUDA RV32_* references.
Recommended Before Merge
- Fix pairing transpiler — change
RV32_REGISTER_NUM_LIMBStoRV64_REGISTER_NUM_LIMBSon lines 3, 47, 48 (trivial, 1 file) - Fix
config.rv32i→config.rv64iin 5 locations (trivial) - Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files) and bigint AOT (6 files) — trivial renames - Remove inner
guest/guest/directory andTEMP[jpw]marker - Track AOT, CUDA port as separate work items
|
Claude encountered an error —— View job Code Review: RV64 Support PR (Re-review after latest sync)
|
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
|
Claude encountered an error —— View job I'll analyze this and get back to you. |
dada909 to
e4aab20
Compare
Resolves INT-6198, INT-6199, INT-6200. Migrates the `openvm`, `openvm-platform` crates from RV32 to RV64. **Scope note:** This PR only covers the guest-side compilation path (crate definitions, assembly, types, cfg gating). The actual execution logic for the I/O instructions (how the interpreter handles hint/reveal/print operations at runtime) is handled separately in the interpreter PR. Compilation and interpreting with the new openvm toolchain have been tested with guest programs from this repo: https://github.com/axiom-crypto/rv64-minimal-guest-test. All `#[cfg(target_os = "zkvm")]` / `#[cfg(not(target_os = "zkvm"))]` attributes replaced with `#[cfg(openvm_intrinsics)]` / `#[cfg(not(openvm_intrinsics))]`. This decouples guest-code gating from the risc0 custom target OS and instead uses a custom cfg flag injected via rustflags. = "riscv64"` Conditional dependency sections switched from `cfg(target_os = "zkvm")` to `cfg(target_arch = "riscv64")`, and the guest dependency changed from `openvm-rv32im-guest` to `openvm-rv64im-guest`. Added `check-cfg` for `openvm_intrinsics` to suppress warnings. - `WORD_SIZE`: 4 → 8 (`platform/src/lib.rs`) - `STACK_TOP`, `TEXT_START`, `is_guest_memory`: `u32` → `u64` (`platform/src/memory.rs`) - `STACK_TOP` static in entry point: `u32` → `u64` (`openvm/src/lib.rs`) - `sw x0, 1(x0)` → `sd x0, 1(x0)` (fault instruction, `openvm/src/lib.rs`) - `lw sp, 0(sp)` → `ld sp, 0(sp)` (entry point stack pointer load, `openvm/src/lib.rs`) Replaced with glibc implementations compiled for `riscv64 -march=rv64im` (source: [glibc 2.42](https://elixir.bootlin.com/glibc/glibc-2.42.9000/source/string)). Uses 64-bit `ld`/`sd` operations throughout. All hint/reveal/read functions updated to operate on 8-byte units: - `read_u32()` → `read_u64()` - Hint stream reads use `u64` length prefixes (8 bytes instead of 4) - Serde serializer/deserializer operate on `u64` chunks - `hint_buffer_u32!` equivalent updated for 64-bit **Note:** `reveal_bytes32` has **not** been changed — it still reveals 32 bytes (256 bits), but just reads them in 8-byte chunks instead of 4-byte chunks. Removed the PAL ABI stubs (`sys_halt`, `sys_read`, `sys_write`, `sys_rand`, etc.) that were required for `std` compatibility with the risc0 `riscv32im-risc0-zkvm-elf` target. No longer needed since rv64 uses a standard target (`riscv64im-unknown-none-elf`) without a custom OS ABI.
…64im (#2446) Transpiler PRs should be reviewed in this order: #2446, #2447, #2448. Summary of this PR: - Some files related to the decoding of RV32 are copied from https://github.com/GregAC/rrs/tree/main/rrs-lib/src to `crates/toolchain/decoder`. - Everything in `extensions/rv32im/transpiler` has been copied to `extensions/rv64im/transpiler`.
Resolves INT-6044. Transpiler PRs should be reviewed in this order: #2446, #2447, #2448. Summary of this PR: - Code that was hardcoded for RV32 has been changed to RV64. - `rrs-lib` was removed as a dependency, and all crates now use `crates/toolchain/decoder`. - Some tests have been added to test the RV64 transpiler, but these tests are still messy -- this will be fixed in the ticket involving the integration tests for the full toolchain and interpreter flow.
…fix compilation errors (#2457) Resolves INT-6318. Summary: - The `extensions/rv32im` has been moved to `extensions/riscv`. - The old `extensions/rv32im/transpiler` module has been entirely deleted and replaced with the transpiler module created in #2448. - Many objects prefixed with `Rv32` are renamed to start with `Rv64`. A list of these objects can be found in https://docs.google.com/document/d/1mZTHk7n2LY-N1WMrjzn_BZmSYtP4PGnflH97Xjrs9EY/edit?tab=t.0.
Towards INT-6319. Summary: - `base_alu`, `less_than`, and `shift` submodules of `riscv` have been ported from RV32 to RV64 - All types/structs renamed from `Rv32*` to `Rv64*` (e.g., `Rv32BaseAluAdapterAir` → `Rv64BaseAluAdapterAir`) - Operand width changed from `u32`/4 bytes to `u64`/8 bytes throughout (registers, ALU ops, compose/decompose helpers) - `Rv64BaseAluAdapter` (shared by all three submodules) updated for RV64 - New constraints added to enforce bytes 3–7 all equal the sign byte for correct 64-bit sign extension (previously only byte 3 was constrained for 32-bit) - Added tests for the above constraint in `base_alu`, `less_than`, `shift`. - Changed constant `DEFAULT_U8_BLOCK_SIZE` from 4 to 8 (matching RV64 register width) - Commented out unported code in `extension/mod.rs` and `lib.rs` (load/store, branch, jal/jalr, auipc, mul/div — all marked `TEMP: disabled until ported to RV64`) with manual empty-enum trait impls to keep the crate compiling - Updated `test_utils.rs` to use RV64 conventions (`generate_rv64_is_type_immediate`, `rv64_rand_write_register_or_imm`) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…mul_w, divrem_w (#2483) Towards INT-6340, INT-6341, INT-6342, INT-6343. This PR copies the RV32 `base_alu`, `shift`, `mul`, and `divrem` modules into new `base_alu_w`, `shift_w`, `mul_w`, and `divrem_w` modules. These copies will serve as the starting point for implementing the RV64 32-bit word instruction variants. --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Ayush Shukla <shuklaayush@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Resolves INT-7523. Should be merged after #2728. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves INT-7524. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- also update elfs
) Switches guest dependencies that need atomic CAS (`once_cell::race` in the algebra/ecc macros, `spin::Mutex` in `guest-libs/keccak256`) to `portable-atomic` so they compile on stock `riscv64im-unknown-none-elf`, fixing CI failures across the algebra/ecc/pairing/ff_derive/k256/p256/keccak256 guest tests. The stock target's spec declares `atomic-cas: false`, so rustc cfg-gates `AtomicUsize::compare_exchange` out at the language level; `-C passes=lower-atomic` can't help since it acts on LLVM IR after rustc has already rejected the call. rc.1 avoided this on `riscv32im-risc0-zkvm-elf`, which claims CAS support in its spec; we keep guest programs on the stock upstream triple and patch the dependency instead. Switches `once_cell` (and `spin::Mutex` in `guest-libs/keccak256`) to their `portable-atomic` backend, with `portable-atomic` configured to use `critical-section` as its CAS fallback. `critical-section` is a minimal no_std mutual-exclusion interface: exactly one crate in the tree registers an `Impl` via `critical_section::set_impl!`, and every consumer calls `critical_section::with(...)`; without an impl the call panics at runtime. Off-the-shelf impls target interrupt-driven hardware and disable interrupts via CSR reads/writes, which the openvm VM can't transpile (they map to `unimp` and blow up at runtime). `openvm-platform` now registers a no-op impl, sound because the openvm guest is single-threaded, so CAS reduces to a plain load+store with zero CSR emission. A `SAFETY:` comment flags the single-thread invariant. Guest vs host gating now follows one rule across both code and manifests: - `.rs` files: `cfg(openvm_intrinsics)` (set by the guest build rustflags) for openvm-specific code paths, `cfg(not(feature = "std"))` for std-availability (`no_std`), `cfg(target_os = "none")` for runtime-specific attributes (`no_main`, `openvm::entry!` registration). The `no_std` / `no_main` split matches their semantics: std availability is a package decision, entry-point registration is a runtime/target property. - `Cargo.toml` manifests: `cfg(not(target_os = "none"))` for host-only dep gates (Cargo target predicates only accept built-in cfgs, not `openvm_intrinsics`). The `portable-atomic` CAS polyfill keeps the narrower `cfg(all(target_arch = "riscv64", target_os = "none"))` since the `critical-section` backend is arch-specific. - `--check-cfg=cfg(openvm_intrinsics)` added to guest-build rustflags in `crates/toolchain/build/src/lib.rs` so user guest programs never see the unexpected-cfgs warning. - `[workspace.lints.rust]` at the workspace root declares `openvm_intrinsics`; openvm-* crates inherit via `[lints] workspace = true`. Standalone sub-workspaces (tests/programs, book examples) can't inherit, so they declare it explicitly. - `resolver = "2"` on all standalone `[workspace]` sub-workspaces (tests/programs, book examples, cli/tests). Bare `[workspace]` defaulted to v1, which unified proc-macro features into guest builds and dragged `rand_core/std` and `getrandom/std` into the rv64 target. - Cargo.toml host-only dep gates swapped from the stale-on-rv64 `cfg(not(target_os = "zkvm"))` to `cfg(not(target_os = "none"))`. Three cascading rv64-port bugs, only exercised by the `final_exp_hint` tests: - `extensions/pairing/transpiler/src/lib.rs` used `RV32_REGISTER_NUM_LIMBS` (= 4) to compute the phantom's `a`/`b` operand addresses. On rv64 registers live at 8-byte boundaries in `RV64_REGISTER_AS`, so the phantom read zero. Changed to `RV64_REGISTER_NUM_LIMBS`. - `guest-libs/pairing/src/{bn254,bls12_381}/pairing.rs` constructed `(P.as_ptr() as u32, P.len() as u32)` as a fat pointer. The host phantom reads at offsets `0` and `RV64_REGISTER_NUM_LIMBS` (= 8), but the `(u32, u32)` tuple laid fields at offsets 0 and 4. Changed to `(u64, u64)`. - The adjacent `hint_buffer_chunked(ptr, (N * 12 * 2) / 4)` used the rv32 word-size divisor. Changed to `/ openvm_riscv_guest::HINT_WORD_BYTES`. `hint_store_u32!` to `hint_store_u64!` in `algebra/moduli-macros`. New `openvm_riscv_guest::HINT_WORD_BYTES = 8` replaces `/ 4` and `/ 8` magic numbers. Host-side `SqrtHintSubEx` padding bumped from `.take(4)` to `.take(RV64_REGISTER_NUM_LIMBS)` to match the 8-byte guest read. - `rv32`/`riscv32` to `rv64`/`riscv64` across SDK: `Sdk::riscv32()` and `AppConfig::riscv32()` become `riscv64()`, struct fields `rv32i`/`rv32m` become `rv64i`/`rv64m`, `openvm_riscv32.toml` renamed to `openvm_riscv64.toml` with matching TOML keys. Historical `v1.4/` and `v1.5/` snapshots untouched. - `Rv32Modular*`, `Rv32Weierstrass*`, `Rv32Pairing*` imports in failing guest-lib integration tests renamed to `Rv64*`. - `crates/sdk/guest/fib/src/main.rs`: `reveal_u32` to `reveal_u64`, `fibonacci` return type and locals widened from `u32` to `u64`. All 7 book examples were designed as host-run programs with `openvm/std` force-enabled, which dragged std into rv64 guest builds. Per-example: removed `features = ["std"]` from the `openvm` dep, moved `std` behind an example-local `std = ["openvm/std", ...]` feature, added `default-features = false` to transitive std-leakers (`serde`, `num-bigint`, `alloy-primitives`, `openvm-ruint`, `tiny-keccak`, `hex-literal`), and added the `no_main`/`no_std`/`openvm::entry!` guest-program scaffolding. The `i256` example additionally gets `bytes` with the `extra-platforms` feature and `portable-atomic` with `critical-section` on the riscv64-none target, since `alloy-primitives` transitively pulls `bytes` which uses `compare_exchange` directly. The test was passing `["std"]` to enable ff's `BatchInvert` trait via a `#[cfg(feature = "std")]` block in the guest program. On rv64 stock, activating `feature = "std"` at guest level chains through `openvm/std` and tries to link std, which riscv64im-unknown-none-elf doesn't have. Dropped `["std"]` from the test harness call; the std-only inner block is compiled out (the two `BatchInverter::invert_with_*_scratch` paths still run). Generated init file renamed from `openvm_init_batch_inversion_std.rs` to `openvm_init_batch_inversion.rs` to match the new build call. - Deleted `extensions/rv32im/` (empty pre-rv64 leftover) and an untracked stale `extensions/riscv/guest/guest/` copy with the old rv32 hint-store API. - Removed dead duplicate target-cfg block in `extensions/pairing/circuit/Cargo.toml`. - Fixed stale `cfg(target_os = "zkvm")` gate in `examples/keccak/Cargo.toml`. - Removed duplicate `critical_section::set_impl!` in `crates/toolchain/platform/src/heap/embedded.rs`; the canonical no-op impl now lives in `crates/toolchain/platform/src/critical_section.rs`.
Resolves INT-7566. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves INT-7563.
…, mulh, divrem, jal_lui, jalr, auipc (#2741) Resolves INT-7550. --------- Co-authored-by: Tuanlinh12312 <136139181+Tuanlinh12312@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves INT-7551.
Resolves INT-7553. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves INT-7552. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves INT-7564, INT-7528.
Resolves INT-7554. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves INT-7530.
3bd85a1 to
56e4f86
Compare
|
Claude finished @shuklaayush's task in 7m 29s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after 16 new commits since the last review (which covered up to
Previously reported bugs now FIXED (6 of 10):
CUDA Port Assessment — Well DoneThe CUDA port across 8 commits is comprehensive and correct:
Bugs (Still Open)1. AOT impls in riscv circuit reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
98, 186 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
112, 158 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
2. AOT impls for non-W riscv chips use wrong const generic — dead code (under aot feature)
The actual executor types use RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4) — these impls will never be dispatched:
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
21, 41 |
Note: The *_W chips' AOT impls also use RV32_REGISTER_NUM_LIMBS (4), but this happens to equal RV64_WORD_NUM_LIMBS (4) — functionally correct but semantically misleading.
3. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot feature)
12 impl blocks across 6 files implement traits for Rv32*256Executor types that don't exist (renamed to Rv64*256Executor):
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot feature)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but MulOpcode is not imported. Line 14 imports MulWOpcode from openvm_riscv_transpiler, and crate::common::* (line 19) doesn't re-export MulOpcode. This will fail to compile under the aot feature.
Fix: Add use openvm_riscv_transpiler::MulOpcode; to the #[cfg(feature = "aot")] imports, or add it to the main import block.
5. AOT layer assumes 32-bit registers throughout (Systemic — known issue)
The entire AOT x86 code generation layer is fundamentally broken for RV64:
- Register stride
a / 4hardcoded (e.g.,branch_eq/execution.rs:131-132,load_sign_extend/aot.rs:115) - XMM helpers
gpr_to_rv32_registerandxmm_to_gprincommon/mod.rs:25-85assume 2 RISC-V registers per XMM - Error messages in
update_adapter_heights_asm(common/mod.rs:108-116) say "RV32_REGISTER_AS" and "RV32_MEMORY_AS"
The CI disabling (chore(ci): comment out aot workflows) acknowledges this. The AOT layer needs a fundamental redesign for 64-bit register values since XMM elements are 32-bit.
Code Quality Issues
6. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:354-357:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {
data.map(|x| x.as_canonical_u32() as u8)
}Leftover temp function with debug marker — should be cleaned up or properly named before merge.
7. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. A 64-byte value allocates capacity for 64 u64 words (512 bytes) instead of 8 (64 bytes). Functionally correct but 8× memory waste.
Fix: Vec::with_capacity(core::mem::size_of_val(value) / 8) or just use Vec::new().
Architecture Observations (Positive)
-
CUDA port is comprehensive and clean. All 8 CUDA commits correctly migrate to RV64 types and constants. The new W-suffix GPU chips are properly registered. Zero
Rv32*type references remain in any CUDA-related code (.rs,.cu,.cuh). -
Guest program migration is thorough.
cfg(openvm_intrinsics)cleanly replacescfg(target_os = "zkvm")across 119 uses. All benchmark guests properly useno_std. Guest lib functions userv64prefix. -
feat: rv64 more cleanup (#2752)centralizestracing_read_reg_ptrinadapters/mod.rs:235-251with properpointer_max_bitsparameter and debug assertions, eliminating duplicate helpers fromriscv-adapters/src/helpers.rs. -
Non-AOT execution paths across all extensions (riscv, algebra, ECC, keccak256, sha2, bigint, deferral, pairing) are solid — consistent
RV64_*constant usage. -
*_Wadapter/core pattern remains well-designed with proper sign extension via(result as i32 as i64 as u64). -
Constants header (
primitives/constants.h) correctly defines both RV32 and RV64 constants for multi-ISA CUDA support, withDEFAULT_BLOCK_SIZE = 8.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Compilation failure under aot |
Trivial rename |
| High | #2: Riscv non-W AOT impls wrong const generic (4 files) | Dead code under aot |
Part of AOT port |
| High | #3: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Compilation failure under aot |
Trivial rename |
| High | #4: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
Trivial fix |
| Medium | #5: AOT layer assumes 32-bit registers (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
| Low | #6: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #7: Serializer 8× over-allocation | Wasted memory | Minor |
Progress Since Last Review
Major improvements — 6 of 10 previously reported bugs have been fixed:
- ✅
config.rv32i→config.rv64i(5 locations) — Fixed - ✅ CUDA extension old
Rv32*types (all extensions) — Fully ported across 8 commits - ✅ Deferral extension soundness issues — Fixed
- ✅ Pairing transpiler wrong stride — Fixed
- ✅ Duplicate
guest/guest/directory — Removed - ✅ Guest lib
cfg(target_os = "zkvm")→cfg(openvm_intrinsics)— Fixed
The core RV64 architecture (non-AOT paths) is solid and complete. The CUDA port is comprehensive. The only remaining issues are in the AOT layer, which is acknowledged as deferred with CI disabled.
Recommended Before Merge
- Fix
Rv32*Executor→Rv64*Executorin riscv AOT impls (4 files, 8 occurrences) — trivial renames - Fix
Rv32*256Executor→Rv64*256Executorin bigint AOT impls (6 files, 12 occurrences) — trivial renames - Add missing
MulOpcodeimport inmul_w/execution.rs— 1 line - Remove
TEMP[jpw]marker / renametmp_convert_to_u8sinadapters/mod.rs:354 - Track systemic AOT redesign as a separate work item
|
Claude finished @shuklaayush's task in 6m 22s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after 15 new commits since the last successful review. The PR is at 44 commits, 100 changed files, +20882/−10689 lines. Major new work:
Previously critical bugs now FIXED:
Bugs (Still Open)1. AOT impls reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
98, 186 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
112, 158 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
These are dead code — the impls apply to types that don't exist. Will fail to compile when aot feature is enabled.
2. AOT impls for non-W chips use wrong const generic RV32_REGISTER_NUM_LIMBS (Dead code under aot)
The actual executor types use RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4) — making these impls dead code that never gets dispatched:
| File | Lines | Const generic |
|---|---|---|
base_alu/execution.rs |
177, 245 | { RV32_REGISTER_NUM_LIMBS } → should be { RV64_REGISTER_NUM_LIMBS } |
less_than/execution.rs |
128, 247 | same |
divrem/execution.rs |
108, 273 | same |
load_sign_extend/aot.rs |
21, 41 | same |
Additionally, base_alu_w/execution.rs lines 166, 234 use RV32_REGISTER_NUM_LIMBS (4) — for W-suffix chips, this happens to equal RV64_WORD_NUM_LIMBS (4), so it's functionally correct but should use RV64_WORD_NUM_LIMBS for semantic clarity.
3. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
4. AOT common/mod.rs helper functions assume 32-bit register layout (Systemic AOT issue)
extensions/riscv/circuit/src/common/mod.rs:16 imports RV32_MEMORY_AS, RV32_REGISTER_AS and the helper functions (gpr_to_rv32_register, xmm_to_gpr, update_adapter_heights_asm) are all designed for the 32-bit register-to-XMM mapping (2 registers per XMM via pinsrd/pextrd). For RV64, registers are 64 bits each and would need 1:1 XMM mapping or use pinsrq/pextrq.
The error messages at lines 107-117 also say "RV32_REGISTER_AS" and "RV32_MEMORY_AS".
Similarly, AOT code throughout the loadstore/aot.rs, branch_lt/aot.rs, branch_eq/execution.rs, and all *_W execution files use RV32_REGISTER_AS and RV32_IMM_AS — functionally correct (same numeric values as RV64_* equivalents) but semantically misleading.
This is a known deferral — AOT CI is disabled via TODO(aot): re-enable comments in .github/workflows/extension-tests.yml.
Code Quality Issues
5. TEMP[jpw] marker in production code
extensions/riscv/circuit/src/adapters/mod.rs:354:
// TEMP[jpw]
pub fn tmp_convert_to_u8s<F: PrimeField32, const N: usize>(data: [F; N]) -> [u8; N] {Leftover temp function with debug marker — should be cleaned up or properly named.
6. Serializer Vec::with_capacity over-allocates 8×
crates/toolchain/openvm/src/serde/serializer.rs:57:
let mut vec: Vec<u64> = Vec::with_capacity(core::mem::size_of_val(value));size_of_val returns bytes, but with_capacity takes number of u64 elements. A 64-byte value allocates capacity for 64 u64 words (512 bytes) instead of 8 (64 bytes). Functionally correct but 8× memory waste.
Fix: Vec::with_capacity(core::mem::size_of_val(value) / core::mem::size_of::<u64>()) or Vec::new().
7. Hardcoded 8 instead of WORD_SIZE in io/mod.rs
crates/toolchain/openvm/src/io/mod.rs:60-61:
let num_words = len.div_ceil(8);
let capacity = num_words * 8;WORD_SIZE is defined as 8 in platform/src/lib.rs. Should use the constant for consistency with the rest of the codebase.
8. Rv32ModularArithmeticOpcode enum not renamed
extensions/algebra/transpiler/src/lib.rs:21 defines pub enum Rv32ModularArithmeticOpcode — this is referenced in ~30 locations across algebra/circuit/src/ (preflight.rs, execution.rs, modular.rs, hybrid.rs, muldiv.rs, addsub.rs). Functionally correct since it's just an enum name, but inconsistent with the rest of the Rv32 → Rv64 rename.
9. Test error message strings still say "Rv32IM"
4 test files reference "failed to create Rv32IM executor" in their expect() strings:
mul/tests.rs:290,mulh/tests.rs:531,jalr/tests.rs:591,mul_w/tests.rs:417
Cosmetic only — these are behind aot/cuda feature gates.
CUDA Port Assessment (New)
The CUDA port across 8 commits is well done:
- ✅
extensions/riscv/circuit/src/extension/cuda.rs— correctly uses allRv64*types, noRv32*imports - ✅ All non-W chip CUDA files (
base_alu/cuda.rs,shift/cuda.rs,less_than/cuda.rs,mul/cuda.rs,mulh/cuda.rs,divrem/cuda.rs) — correctly useRV64_REGISTER_NUM_LIMBSandRV64_CELL_BITS - ✅ All W-suffix chip CUDA files (
base_alu_w/cuda.rs,shift_w/cuda.rs,mul_w/cuda.rs,divrem_w/cuda.rs) — correctly useRV64_WORD_NUM_LIMBSandRV64_CELL_BITS - ✅
bigint/circuit/src/extension/cuda.rs— correctly usesRv64*256Airtypes - ✅
crates/circuits/primitives/cuda/include/primitives/constants.h— correctly defines both RV32 and RV64 constants withRV64_REGISTER_NUM_LIMBS = 8,RV64_WORD_NUM_LIMBS = 4,RV64_CELL_BITS = 8 - ✅ SHA2, hintstore, loadstore, load_sign_extend CUDA files all correctly ported
- ✅ Keccak256 and SHA2 extensions — zero remaining
RV32_*references in Rust code
Architecture Observations (Positive)
-
Non-AOT, non-CUDA paths across ALL extensions (riscv, algebra, ECC, pairing, keccak256, sha2, bigint, deferral) are solid — consistent
RV64_*constant usage throughout. -
CUDA port follows the correct pattern for all chip types: non-W chips use
RV64_REGISTER_NUM_LIMBS(8), W-suffix chips useRV64_WORD_NUM_LIMBS(4). -
Guest program migration to
no_stdwithopenvm_intrinsicscfg and portable atomics is clean and thorough. -
*_Wadapter/core pattern remains well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension correctly. -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg flag cleanly replaces oldtarget_os = "zkvm".
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Dead code + compilation failure under aot |
Trivial rename |
| High | #2: Riscv AOT impls wrong const generic (4 files, 8 impls) | Dead code under aot |
Part of AOT port |
| High | #3: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Dead code + compilation failure under aot |
Trivial rename |
| Medium | #4: AOT common/mod.rs helpers assume 32-bit layout |
AOT broken for RV64 | Known deferral (CI disabled) |
| Low | #5: TEMP[jpw] marker |
Cleanup | Minor |
| Low | #6: Serializer 8× over-allocation | Wasted memory | Minor |
| Low | #7: Hardcoded 8 vs WORD_SIZE |
Style inconsistency | Minor |
| Low | #8: Rv32ModularArithmeticOpcode not renamed |
Cosmetic | Minor |
| Low | #9: Test error message strings say "Rv32IM" | Cosmetic | Minor |
Progress Since Previous Reviews
Major progress — 6 of the 10 previously tracked bugs are fixed, including all critical/high-severity ones:
- ✅ All critical soundness bugs (deferral AIR, deferral transpiler, pairing transpiler) — fixed
- ✅
config.rv32icompilation issue — fixed - ✅ CUDA extension port — complete (previously flagged as "known deferral")
- ✅ Duplicate
guest/guest/directory — removed - ✅ Deferral config naming — fixed
The remaining open items are all confined to the AOT layer, which is acknowledged as not yet ported (CI disabled with TODO(aot): re-enable).
Recommended Before Merge
- Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files, 8 impls) and bigint AOT (6 files, 12 impls) — trivial renames that prevent compilation errors even though AOT CI is disabled - Remove
TEMP[jpw]marker / renametmp_convert_to_u8s - Track AOT port as a separate work item (the systemic
common/mod.rshelpers + const generic issues need a coordinated fix)
# fix(v2.1-rv64): convert more rv32 references to rv64
Follow-up cleanup that gets the lint job and the `sha2_bench`
bench-prove run green on the rv64 branch, plus a few non-AOT polish
items called out in review.
## Lint fixes
- `extensions/riscv/circuit/src/auipc/tests.rs` — gate
`RV64_REGISTER_NUM_LIMBS` behind `#[cfg(feature = "cuda")]`; it's only
used inside the CUDA test, so the CPU lint job hit `unused_imports`.
- `crates/continuations/src/tests/e2e.rs` — rename stale
`Rv32DeferralBuilder` / `Rv32DeferralConfig` → `Rv64DeferralBuilder` /
`Rv64DeferralConfig` (3 sites). Fixes the CUDA lint job's `E0432`.
- `crates/sdk/src/tests.rs` — `SdkVmConfig::riscv32()` → `::riscv64()`
plus doc-comment.
- `benchmarks/prove/src/lib.rs` — drop unused `clap::command` import.
- `extensions/riscv/circuit/src/{mul,mul_w,jalr,mulh}/tests.rs` — error
string `"failed to create Rv32IM executor"` → `"Rv64IM executor"`.
## Rename
- `Rv32ModularArithmeticOpcode` → `Rv64ModularArithmeticOpcode` (81
references across `extensions/algebra/{transpiler,circuit}`). Pure
rename, no semantic change.
## Cleanup
- Remove `tmp_convert_to_u8s` and its `// TEMP[jpw]` marker from
`extensions/riscv/circuit/src/adapters/mod.rs` — `pub fn` with zero
in-tree callers.
- `crates/toolchain/openvm/src/io/mod.rs` — replace hardcoded `8` in
`read_vec_by_len` with `openvm_platform::WORD_SIZE`.
## Bug fix
- `crates/toolchain/openvm/src/serde/serializer.rs` — `to_vec` was
calling `Vec::<u64>::with_capacity(size_of_val(value))`, conflating
bytes with u64 elements and over-allocating 8×. Switched to
`size_of_val(value).div_ceil(size_of::<u64>())`. Heap waste only, no
correctness impact.
## Benchmark ELF rebuilds
Rebuilt 16 benchmark guest ELFs as 64-bit RISC-V (the committed binaries
were still 32-bit, which made `sha2_bench` fail with `Error: Not a
64-bit ELF` after the transpiler tightened its check): base64_json,
bincode, bubblesort, ecrecover, fibonacci, fibonacci_iterative,
fibonacci_recursive, keccak256, keccak256_iter, kitchen-sink, pairing,
quicksort, rkyv, sha2_bench, sha256, sha256_iter.
Not rebuilt — `regex`, `revm_transfer`, `revm_snailtracer` — their
dependency closures (`regex-automata`, `tracing-core 0.1.x`) use atomic
CAS methods that the `riscv64im-unknown-none-elf` target doesn't expose.
The existing `portable-atomic` polyfill in `openvm-platform` works for
crates that opt in via a feature flag (`bytes`, `spin`), but neither
failing crate exposes one. Tracking as a separate item.
## Verification
- `cargo +nightly fmt --all -- --check` clean.
- `cargo clippy --workspace --all-targets --tests --features "<CI
feature set>" -- -D warnings` clean.
- `cargo check --workspace --all-targets --tests` clean.
|
Claude finished @shuklaayush's task in 4m 42s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously reported bugs now FIXED in this commit:
Previously confirmed fixed (earlier commits):
Bugs (Still Open — All AOT-Only)All remaining bugs are confined to the AOT layer, which is acknowledged as not yet ported (CI disabled with 1. AOT impls reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
98, 186 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
112, 158 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
2. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot feature)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
3. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot feature)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but MulOpcode is not imported. Neither the direct imports (line 14 imports MulWOpcode only) nor crate::common::* (line 19) provide MulOpcode. Will fail to compile under aot feature.
Fix: Add use openvm_riscv_transpiler::MulOpcode; to the #[cfg(feature = "aot")] imports block.
4. AOT impls for non-W chips use wrong const generic — dead code (under aot feature)
The actual executor types use RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4):
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
21, 41 |
These impls implement traits for types that are never instantiated with RV32_REGISTER_NUM_LIMBS.
5. AOT layer assumes 32-bit register layout throughout (Systemic — known deferral)
The entire AOT x86 code generation layer uses 4-limb register stride (a / 4), 32-bit XMM helpers (gpr_to_rv32_register, xmm_to_gpr in common/mod.rs:25-85), and RV32_REGISTER_AS/RV32_IMM_AS constants. Error messages in update_adapter_heights_asm also say "RV32_REGISTER_AS". The AOT layer needs a fundamental redesign for 64-bit register values since XMM elements are 32-bit (pinsrd/pextrd).
CI is properly disabled via TODO(aot): re-enable in .github/workflows/extension-tests.yml.
Architecture Observations (Positive)
-
Latest commit (e9af982) is a thorough cleanup that addresses many previously reported issues — serializer over-allocation, hardcoded constants, opcode enum naming, test strings, SDK/continuations deferral references. Good work.
-
Non-AOT code paths are clean. Zero
RV32_REGISTER_NUM_LIMBSreferences remain outside of AOTimplblocks and theadapters/mod.rsimport line (needed for theRV32_CELL_BITS == RV64_CELL_BITSassertion). All extensions (riscv, algebra, ECC, pairing, keccak256, sha2, bigint, deferral) use correctRV64_*constants. -
CUDA port is complete. All chip types correctly reference
Rv64*AIR types and useRV64_REGISTER_NUM_LIMBS/RV64_WORD_NUM_LIMBSconsistently. -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE,io/mod.rsusesWORD_SIZE, guestopenvm_intrinsicscfg flag is cleanly integrated, CLI init templates useu64/reveal_u64. -
Test improvements in
loadstore/tests.rs— the CUDA test now usespointer_max_bits: 20with properly sized address spaces, andgen_pointeris replaced with explicit address alignment logic that respectstester.address_bits().
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Compilation failure under aot |
Trivial rename |
| High | #2: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Compilation failure under aot |
Trivial rename |
| Medium | #3: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
1-line fix |
| Medium | #4: Non-W AOT impls wrong const generic (4 files) | Dead code under aot |
Part of AOT port |
| Medium | #5: AOT layer assumes 32-bit register layout (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
Progress Since Previous Reviews
Excellent progress — all previously reported critical and high-severity bugs in the core execution paths have been fixed. The latest commit cleans up many of the remaining code quality issues. The only open items are confined to the AOT layer, which is explicitly marked as not yet ported with CI disabled.
Recommended Before Merge
- Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files, 8 impls) and bigint AOT (6 files, 12 impls) — trivial renames that prevent compilation errors if someone enables theaotfeature - Add missing
MulOpcodeimport inmul_w/execution.rs— 1 line - Track systemic AOT port as a separate work item
Resolves INT-6048. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Claude finished @shuklaayush's task in 6m 13s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously reported bugs now FIXED (confirmed):
New Commit Assessment: riscv-test-vectors migration (#2758) — ✅ CleanThe migration from
Bugs (Still Open — All AOT-Only)All remaining bugs are confined to the AOT layer, which is acknowledged as not yet ported (CI disabled with 1. AOT impls reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
98, 186 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
112, 158 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
2. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot feature)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
3. AOT impls for non-W riscv chips use wrong const generic — dead code (under aot feature)
The actual executor types use RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4) — making these impls dead code that never gets dispatched:
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
21, 41 |
Note: The *_W chips' AOT impls also use RV32_REGISTER_NUM_LIMBS (4), which happens to equal RV64_WORD_NUM_LIMBS (4) — functionally correct but semantically misleading.
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot feature)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but only imports MulWOpcode (line 14). MulOpcode is not available via crate::common::* either.
Fix: Add use openvm_riscv_transpiler::MulOpcode; to the #[cfg(feature = "aot")] imports.
5. AOT layer assumes 32-bit register layout throughout (Systemic — known deferral)
The AOT x86 code generation layer is fundamentally designed for 32-bit registers:
- Register stride
a / 4hardcoded across all AOT files - XMM helpers (
gpr_to_rv32_register,xmm_to_gprincommon/mod.rs) assume 2 RISC-V registers per 64-bit XMM - Error messages say "RV32_REGISTER_AS" and "RV32_MEMORY_AS"
CI is disabled via TODO(aot): re-enable in .github/workflows/extension-tests.yml. The AOT layer needs a fundamental redesign for 64-bit register values.
Code Quality Issues
6. Transpiler tests use 32-bit ELFs that will be rejected by the RV64 ELF loader
crates/toolchain/tests/tests/transpiler_tests.rs has 6 non-#[ignore] tests that load rv32im-* ELF files (confirmed 32-bit via file command), but the ELF loader at elf.rs:81 rejects any non-ELF64 input with "Not a 64-bit ELF":
| Test function | ELF file | Feature gate |
|---|---|---|
test_decode_elf |
rv32im-empty-program-elf |
None |
test_generate_program |
rv32im-fib-from-as, rv32im-intrin-from-as |
None |
test_rv32im_runtime |
rv32im-exp-from-as, rv32im-fib-from-as |
None |
test_intrinsic_runtime |
rv32im-intrin-from-as |
None |
test_terminate_prove |
rv32im-terminate-from-as |
None |
test_rv32im_aot_pure_runtime |
rv32im-exp-from-as |
aot |
These tests will fail at runtime if executed. They're not run in CI (the riscv workflow only runs the rv64im_riscv_vector_* tests), but they'll fail for anyone running cargo nextest run -p openvm-toolchain-tests. They should either be updated with equivalent rv64 ELFs, #[ignore]d with a TODO(rv64) comment, or removed.
7. Rv32ModularFp2Int256Config struct not renamed
crates/toolchain/tests/tests/transpiler_tests.rs:137 defines pub struct Rv32ModularFp2Int256Config with Rv64I, Rv64M, Rv64Io fields. Cosmetic but inconsistent with the rest of the migration.
8. CUDA library name still says rv32im
extensions/riscv/circuit/build.rs:19 — .library_name("tracegen_gpu_rv32im"). Functionally correct (it's just an internal linker name) but inconsistent with the rv32im → riscv rename.
Architecture Observations (Positive)
-
Non-AOT code paths across ALL extensions (riscv, algebra, ECC, pairing, keccak256, sha2, bigint, deferral) are solid — consistent
RV64_*constant usage. Zero non-AOTRV32_REGISTER_NUM_LIMBSreferences outside of the constants re-export. -
CUDA port is comprehensive and correct. All chip types reference
Rv64*AIR types. Theprimitives/constants.hheader correctly definesRV64_REGISTER_NUM_LIMBS = 8,RV64_WORD_NUM_LIMBS = 4,DEFAULT_BLOCK_SIZE = 8. -
riscv-test-vectors migration is clean — Makefile targets
rv64g, test header usesRVTEST_RV64U, shipped binary is confirmed 64-bit ELF, CI workflow correctly references new paths and test names. -
Guest program migration is thorough —
cfg(openvm_intrinsics)cleanly replacescfg(target_os = "zkvm"), all benchmarks useno_std, guest lib functions userv64prefix. -
ELF loader correctly validates
Class::ELF64, usestry_into()for u64→u32 conversions with error handling, and constrains addresses to u32. -
Decoder crate properly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg flag cleanly replacestarget_os = "zkvm".
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Compilation failure under aot |
Trivial rename |
| High | #2: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Compilation failure under aot |
Trivial rename |
| Medium | #3: Non-W AOT impls wrong const generic (4 files) | Dead code under aot |
Part of AOT port |
| Medium | #4: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
1-line fix |
| Medium | #5: AOT layer assumes 32-bit registers (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
| Low | #6: Transpiler tests use 32-bit ELFs | Tests broken if run locally | Should #[ignore] or remove |
| Low | #7: Rv32ModularFp2Int256Config not renamed |
Cosmetic | Minor |
| Low | #8: CUDA library name tracegen_gpu_rv32im |
Cosmetic | Minor |
Progress Since Previous Reviews
Excellent progress — all previously reported critical and high-severity bugs in the core execution/circuit/CUDA paths have been fixed. 19 of the originally reported issues are now resolved. The only remaining open items are confined to the AOT layer, which is explicitly marked as not yet ported with CI disabled.
Recommended Before Merge
- Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files, 8 impls) and bigint AOT (6 files, 12 impls) — trivial renames that prevent compilation errors if someone enables theaotfeature - Add missing
MulOpcodeimport inmul_w/execution.rs— 1 line #[ignore]or remove the 6 transpiler tests that load 32-bit ELFs- Track systemic AOT redesign as a separate work item
Towards INT-6339. AUIPC was zero-extending `pc + (imm << 12)` into the upper 32 bits of rd, but the RV64 spec requires sign extension, causing riscv test `rv64ui-p-auipc` to fail. Add `is_sign_extend` column to constrain the MSB of the 32-bit result and fill the upper 4 bytes accordingly. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Ayush Shukla <ayush@axiom.xyz>
|
Claude finished @shuklaayush's task in 4m 57s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously reported bugs now FIXED (confirmed):
New Commit Assessment: AUIPC sign-extend (#2759) — ✅ SoundThe fix correctly sign-extends the AUIPC result from 32 bits to 64 bits. In RV64, AIR constraint approach is elegant and sound:
CUDA consistency verified — Test coverage is thorough — new JalLui already had the same pattern applied (confirmed in JALR correctly uses zero-extension ( Bugs (Still Open — All AOT-Only)All remaining bugs are confined to the AOT layer, which is acknowledged as not yet ported (CI disabled with 1. AOT impls reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
98, 186 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
112, 158 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
2. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot feature)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
3. AOT impls for non-W riscv chips use wrong const generic — dead code (under aot feature)
The actual executor types use RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4) — dead code that never gets dispatched:
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
21, 41 |
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot feature)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but MulOpcode is not imported anywhere in the file. Neither the direct imports (line 14 imports only MulWOpcode) nor crate::common::* (line 19) provide MulOpcode.
Fix: Add use openvm_riscv_transpiler::MulOpcode; to the #[cfg(feature = "aot")] imports block.
5. AOT layer assumes 32-bit register layout throughout (Systemic — known deferral)
The entire AOT x86 code generation layer uses 4-limb register stride (a / 4), 32-bit XMM helpers (gpr_to_rv32_register, xmm_to_gpr in common/mod.rs:25-85), and RV32_REGISTER_AS/RV32_IMM_AS constants. CI is disabled via TODO(aot): re-enable.
Code Quality Issues
6. Transpiler tests use 32-bit ELFs that will be rejected by the RV64 ELF loader
crates/toolchain/tests/tests/transpiler_tests.rs has 5 non-feature-gated tests that load rv32im-* ELF files (32-bit), but the ELF loader at elf.rs:81 rejects any non-ELF64 input with "Not a 64-bit ELF":
| Test | ELF |
|---|---|
test_decode_elf |
rv32im-empty-program-elf |
test_generate_program |
rv32im-fib-from-as, rv32im-intrin-from-as |
test_rv32im_runtime |
rv32im-exp-from-as, rv32im-fib-from-as |
test_intrinsic_runtime |
rv32im-intrin-from-as |
test_terminate_prove |
rv32im-terminate-from-as |
These won't fail in CI (which only runs test_rv64im_riscv_vector_*), but will fail for anyone running cargo nextest run -p openvm-toolchain-tests. They should either be updated with RV64 equivalents, #[ignore]d with a TODO(rv64) comment, or removed.
7. CUDA library name still says rv32im
extensions/riscv/circuit/build.rs:19 — .library_name("tracegen_gpu_rv32im"). Functionally correct (internal linker name) but inconsistent with the rv32im → riscv rename.
Architecture Observations (Positive)
-
AUIPC sign-extend fix is well-designed. The range-check-based approach to constrain
is_sign_extend = msb(rd_data[3])is efficient (no extra columns beyond the boolean flag) and sound. Both Rust and CUDA trace generation are consistent. -
JalLui sign-extend uses the same pattern with an additional
rd[3] * (4 * is_jal + is_lui)range check that bounds the JAL result topc < 2^30— clean and correct. -
Non-AOT code paths across ALL extensions (riscv, algebra, ECC, pairing, keccak256, sha2, bigint, deferral) are solid — consistent
RV64_*constant usage. Zero non-AOTRV32_REGISTER_NUM_LIMBSreferences outside of theadapters/mod.rsassertion. -
CUDA port is comprehensive and correct. All chip types reference
Rv64*AIR types with properRV64_REGISTER_NUM_LIMBS/RV64_WORD_NUM_LIMBSusage. The AUIPC CUDA trace generation correctly mirrors the Rust sign-extend logic. -
*_Wadapter/core pattern remains well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64). -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg cleanly replacestarget_os = "zkvm", ELF64 validation,riscv64im-unknown-none-elftarget.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Compilation failure under aot |
Trivial rename |
| High | #2: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Compilation failure under aot |
Trivial rename |
| Medium | #3: Non-W AOT impls wrong const generic (4 files) | Dead code under aot |
Part of AOT port |
| Medium | #4: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
1-line fix |
| Medium | #5: AOT layer assumes 32-bit registers (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
| Low | #6: Transpiler tests use 32-bit ELFs | Tests broken if run locally | Should #[ignore] or update |
| Low | #7: CUDA library name tracegen_gpu_rv32im |
Cosmetic | Minor |
Progress Since Previous Reviews
All previously reported critical and high-severity bugs in the core execution/circuit/CUDA paths have been fixed (19 issues resolved). The latest AUIPC sign-extend fix is sound and well-tested. The only remaining open items are confined to the AOT layer, which is acknowledged as not yet ported with CI disabled.
Recommended Before Merge
- Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files, 8 impls) and bigint AOT (6 files, 12 impls) — trivial renames that prevent compilation errors if someone enables theaotfeature - Add missing
MulOpcodeimport inmul_w/execution.rs— 1 line #[ignore]or remove the 5 transpiler tests that load 32-bit ELFs — prevents confusion for local developers- Track systemic AOT port as a separate work item
|
Claude finished @shuklaayush's task in 4m 27s —— View job Code Review: RV64 Support PR (Re-review after latest sync)
OverviewRe-reviewing after the latest commit Previously reported bugs now FIXED (confirmed):
New Commit Assessment: Custom Target JSON with Atomic CAS (#2760) — ✅ Well DoneThis commit replaces the stock Benefits are significant:
Target JSON spec is correct:
Build integration is correct:
One minor observation: The Bugs (Still Open — All AOT-Only)All remaining bugs are confined to the AOT layer, which is acknowledged as not yet ported (CI disabled with 1. AOT impls reference non-existent
|
| File | Lines | Has | Should be |
|---|---|---|---|
riscv/circuit/src/jal_lui/execution.rs |
93, 184 | Rv32JalLuiExecutor |
Rv64JalLuiExecutor |
riscv/circuit/src/auipc/execution.rs |
84, 167 | Rv32AuipcExecutor |
Rv64AuipcExecutor |
riscv/circuit/src/jalr/execution.rs |
98, 186 | Rv32JalrExecutor |
Rv64JalrExecutor |
riscv/circuit/src/hintstore/execution.rs |
112, 158 | Rv32HintStoreExecutor |
Rv64HintStoreExecutor |
2. Bigint AOT impls reference non-existent Rv32*256Executor types (Compilation error under aot feature)
12 impl blocks across 6 files:
| File | Lines | Has | Should be |
|---|---|---|---|
bigint/circuit/src/base_alu.rs |
88, 132 | Rv32BaseAlu256Executor |
Rv64BaseAlu256Executor |
bigint/circuit/src/branch_eq.rs |
81, 124 | Rv32BranchEqual256Executor |
Rv64BranchEqual256Executor |
bigint/circuit/src/branch_lt.rs |
86, 129 | Rv32BranchLessThan256Executor |
Rv64BranchLessThan256Executor |
bigint/circuit/src/less_than.rs |
84, 127 | Rv32LessThan256Executor |
Rv64LessThan256Executor |
bigint/circuit/src/mult.rs |
72, 115 | Rv32Multiplication256Executor |
Rv64Multiplication256Executor |
bigint/circuit/src/shift.rs |
85, 128 | Rv32Shift256Executor |
Rv64Shift256Executor |
3. AOT impls for non-W riscv chips use wrong const generic — dead code (under aot feature)
The actual executor types use RV64_REGISTER_NUM_LIMBS (8), but AOT impl blocks use RV32_REGISTER_NUM_LIMBS (4) — dead code that never gets dispatched:
| File | Lines |
|---|---|
base_alu/execution.rs |
177, 245 |
less_than/execution.rs |
128, 247 |
divrem/execution.rs |
108, 273 |
load_sign_extend/aot.rs |
21, 41 |
4. Missing MulOpcode import in mul_w/execution.rs (Compilation error under aot feature)
mul_w/execution.rs:98 uses MulOpcode::MUL.global_opcode() but MulOpcode is not imported anywhere in the file. Will fail to compile under aot feature.
Fix: Add use openvm_riscv_transpiler::MulOpcode; to the #[cfg(feature = "aot")] imports block.
5. AOT layer assumes 32-bit register layout throughout (Systemic — known deferral)
The entire AOT x86 code generation layer uses 4-limb register stride (a / 4), 32-bit XMM helpers (gpr_to_rv32_register, xmm_to_gpr), and RV32_REGISTER_AS/RV32_IMM_AS constants. CI is disabled via TODO(aot): re-enable. Needs fundamental redesign for 64-bit register values.
Code Quality Issues
6. Transpiler tests use 32-bit ELFs that will be rejected by the RV64 ELF loader
crates/toolchain/tests/tests/transpiler_tests.rs has 5 non-feature-gated tests that load rv32im-* ELF files (32-bit), but the ELF loader rejects non-ELF64 with "Not a 64-bit ELF":
| Test | ELF |
|---|---|
test_decode_elf |
rv32im-empty-program-elf |
test_generate_program |
rv32im-fib-from-as, rv32im-intrin-from-as |
test_rv32im_runtime |
rv32im-exp-from-as, rv32im-fib-from-as |
test_intrinsic_runtime |
rv32im-intrin-from-as |
test_terminate_prove |
rv32im-terminate-from-as |
These won't fail in CI (which only runs test_rv64im_riscv_vector_*), but will fail for local developers running cargo nextest run -p openvm-toolchain-tests. They should be #[ignore]d with a TODO(rv64) comment or updated with RV64 equivalents.
7. CUDA library name still says rv32im
extensions/riscv/circuit/build.rs:19 — .library_name("tracegen_gpu_rv32im"). Functionally correct (internal linker name) but inconsistent with the rv32im → riscv rename.
Architecture Observations (Positive)
-
Custom target JSON approach is elegant — declaring
"atomic-cas": trueat the target level combined withlower-atomicat the LLVM pass level eliminates the need forportable-atomicacross the entire dependency graph. This is the cleanest solution for single-threaded no-std targets needing CAS-dependent crates. -
Non-AOT code paths across ALL extensions (riscv, algebra, ECC, pairing, keccak256, sha2, bigint, deferral) are solid — consistent
RV64_*constant usage. Zero non-AOTRV32_REGISTER_NUM_LIMBSreferences outside of the adapters re-export. -
CUDA port is comprehensive and correct. All chip types reference
Rv64*AIR types with properRV64_REGISTER_NUM_LIMBS/RV64_WORD_NUM_LIMBSusage. ZeroRv32*references in CUDA code. -
Critical section impl correctly co-located with
embedded-allocinheap/embedded.rs— only compiled when the heap feature is active, eliminating unnecessary compilation ofcritical-sectionfor non-heap guest programs. -
*_Wadapter/core pattern is well-designed — adapters read full 8-byte registers, expose lower 4 bytes to cores, handle sign extension via(result as i32 as i64 as u64). -
Decoder crate correctly handles RV64 shift amounts (6-bit
shamtviafunct6) and W-suffix validation (shamt < 32). -
Toolchain migration is thorough — serializer/deserializer use
WORD_SIZE=8, platform uses 8-byte alignment,openvm_intrinsicscfg flag cleanly replacestarget_os = "zkvm",riscv64im-openvm-none-elfcustom target with proper atomic lowering. -
AUIPC/JalLui sign-extend is sound — range-check-based constraint on
is_sign_extendboolean, with matching CUDA trace generation.
Summary
| Priority | Issue | Impact | Status |
|---|---|---|---|
| High | #1: Riscv AOT Rv32*Executor → Rv64*Executor (4 files, 8 impls) |
Compilation failure under aot |
Trivial rename |
| High | #2: Bigint AOT Rv32*256Executor → Rv64*256Executor (6 files, 12 impls) |
Compilation failure under aot |
Trivial rename |
| Medium | #3: Non-W AOT impls wrong const generic (4 files) | Dead code under aot |
Part of AOT port |
| Medium | #4: Missing MulOpcode import in mul_w AOT |
Compilation failure under aot |
1-line fix |
| Medium | #5: AOT layer assumes 32-bit registers (systemic) | Entire AOT path broken | Known deferral (CI disabled) |
| Low | #6: Transpiler tests use 32-bit ELFs | Broken if run locally | Should #[ignore] or update |
| Low | #7: CUDA library name tracegen_gpu_rv32im |
Cosmetic | Minor |
Progress Since Previous Reviews
All previously reported critical and high-severity bugs in the core execution/circuit/CUDA paths have been fixed (19+ issues resolved across the review cycle). The latest commit cleanly eliminates portable-atomic complexity via a well-designed custom target JSON. The only remaining open items are confined to the AOT layer, which is explicitly acknowledged as not yet ported with CI disabled.
Recommended Before Merge
- Fix
Rv32*Executor→Rv64*Executorin riscv AOT (4 files, 8 impls) and bigint AOT (6 files, 12 impls) — trivial renames that prevent compilation errors if someone enables theaotfeature - Add missing
MulOpcodeimport inmul_w/execution.rs— 1 line #[ignore]or remove the 5 transpiler tests that load 32-bit ELFs — prevents confusion for local developers- Track systemic AOT port as a separate work item
Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights. Commit: a6bc526 |

No description provided.