feat(rv32): implement i64 SIMD polyfill via register-pair calling convention

## Context

The RISC-V backend currently has an open issue (#295) in which the sign computation for `i64.div_s`/`i64.rem_s` is clobbered by the udiv core's fixed registers. Beyond that specific bug, the RV32 backend generally lacks SIMD-level throughput because there is no polyfill path that maps WASM v128 (SIMD) operations onto the RISC-V Zve32* / Zvl32b vector extensions.

## Proposal

Implement a compile-time `--rv32-simd=zve32x|zve32f|zvl32b` option that enables v128 lowering via subregister pairing:

| WASM op | RV32 lowering | Extension needed |
|---------|---------------|-----------------|
| `v128.load` / `v128.store` | 4× i32 scalar | Zve32x (base) |
| `i8x16.add` | 16× byte SIMD via `vadd.vv` EEW=8 | Zve32x |
| `i16x8.add` | 8× halfword via `vadd.vv` EEW=16 | Zve32x |
| `i32x4.add` | 4× word via `vadd.vv` EEW=32 | Zve32x |
| `i64x2.add` | 2× doubleword via `vadd.vv` EEW=64 | Zve64x (requires Zvl64b) |
| `f32x4.add` | 4× float via `vfadd.vv` | Zve32f |

When the target does not advertise Zve64x (common on RV32 embedded cores like the GD32V), i64x2 ops degrade to 2× scalar i64 register pairs, the same pattern already used for scalar i64 in the rv32 backend.
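
To make the fallback concrete, here is a minimal Rust sketch of `i64x2.add` computed on (lo, hi) pairs of 32-bit values. The names `I64Pair`, `add_pair`, and `i64x2_add_fallback` are hypothetical and do not come from the Synth codebase. The sketch models the arithmetic only, not register allocation or instruction emission.

```rust
/// One i64 lane held as a (lo, hi) pair of 32-bit values, mirroring how a
/// 32-bit target keeps a scalar i64 in two registers. Hypothetical type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct I64Pair {
    pub lo: u32,
    pub hi: u32,
}

impl I64Pair {
    pub fn from_u64(v: u64) -> Self {
        I64Pair { lo: v as u32, hi: (v >> 32) as u32 }
    }

    pub fn to_u64(self) -> u64 {
        ((self.hi as u64) << 32) | self.lo as u64
    }
}

/// 64-bit wrapping add built only from 32-bit operations.
/// On RV32 this corresponds to: add (lo), sltu (carry), add, add (hi).
pub fn add_pair(a: I64Pair, b: I64Pair) -> I64Pair {
    let lo = a.lo.wrapping_add(b.lo);
    // The low half wrapped around exactly when the sum is below an operand.
    let carry = (lo < a.lo) as u32;
    let hi = a.hi.wrapping_add(b.hi).wrapping_add(carry);
    I64Pair { lo, hi }
}

/// Scalar fallback for `i64x2.add`: the two lanes are independent,
/// so the vector op is just two pair additions.
pub fn i64x2_add_fallback(a: [I64Pair; 2], b: [I64Pair; 2]) -> [I64Pair; 2] {
    [add_pair(a[0], b[0]), add_pair(a[1], b[1])]
}
```

Because the lanes never interact, the fallback costs roughly twice the scalar i64 add sequence and needs no vector state at all.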

## Rationale

WASM SIMD is one of the most impactful proposals for embedded/edge workloads (sensor fusion, audio DSP, pixel ops). Synth targets Cortex-M and RV32 equally in its mission statement; adding RV32 SIMD lowering closes the functional gap with the ARM backend, which already has NEON-adjacent ops via VFP/ASIMD.

## Existing precedent in the codebase

- Synth already has `--relocatable` and `--safety-bounds` target options (#231, #251).
- The `arm` backend selects between Thumb, Thumb-2, and ARM ISA based on target triple. RISC-V should similarly select between RV32I, RV32IM, and Zve* based on `--rv32-simd=` or a target-feature string.
- Several open proof tickets (VCR-* epic) and the Rocq/Rust divergence (#110) already discuss adding new lowering rules with proof obligations. SIMD lowering could start with the non-proof path (like the current unproven i64/float path) and be proven later.

## Minimal first step

A single PR that:
1. Adds `RiscvSimdLevel` to the target configuration
2. Lowers `i8x16.splat`, `i8x16.extract_lane_s`/`_u`, and `i8x16.replace_lane` to `vmv.v.x`, `vslidedown.vi` + `vmv.x.s`, and `vmv.s.x` + `vslideup.vi` respectively
3. Wires these through the existing test harness and verifies against 5 SIMD spec tests
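
As an illustration of step 1, the following Rust sketch shows one way a `RiscvSimdLevel` could be parsed from the proposed `--rv32-simd=` values and stored in the target configuration. Only the enum name and the three option strings come from this issue. `Rv32TargetConfig`, its `simd` field, and `can_lower_f32x4` are assumptions made for the example, and the real configuration type in Synth may look quite different.

```rust
use std::str::FromStr;

/// SIMD level selected by the proposed `--rv32-simd=` option.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RiscvSimdLevel {
    Zve32x,
    Zve32f,
    Zvl32b,
}

impl FromStr for RiscvSimdLevel {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "zve32x" => Ok(RiscvSimdLevel::Zve32x),
            "zve32f" => Ok(RiscvSimdLevel::Zve32f),
            "zvl32b" => Ok(RiscvSimdLevel::Zvl32b),
            other => Err(format!("unknown --rv32-simd value: {}", other)),
        }
    }
}

/// Hypothetical target configuration; `None` means scalar-only lowering.
#[derive(Clone, Copy, Debug, Default)]
pub struct Rv32TargetConfig {
    pub simd: Option<RiscvSimdLevel>,
}

impl Rv32TargetConfig {
    /// Per the lowering table, `f32x4.add` via `vfadd.vv` needs Zve32f.
    pub fn can_lower_f32x4(&self) -> bool {
        matches!(self.simd, Some(RiscvSimdLevel::Zve32f))
    }
}
```

Keeping the level as an `Option` leaves the existing scalar path as the default, so targets that pass no flag are unaffected.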

This is analogous to how the arm backend incrementally added f32 support.
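
When checking the first lowerings against spec tests, a scalar reference model of the three lane operations may be useful as an oracle. The sketch below is a plain Rust model of the WASM semantics (the extract operation exists in signed and unsigned variants). The `V128` alias and the function names are hypothetical and not part of the Synth test harness.

```rust
/// A v128 value viewed as sixteen 8-bit lanes. Hypothetical alias.
pub type V128 = [u8; 16];

/// `i8x16.splat`: wrap the i32 operand to 8 bits and copy it to every lane.
pub fn i8x16_splat(x: i32) -> V128 {
    [x as u8; 16]
}

/// `i8x16.extract_lane_s`: read one lane and sign-extend it to i32.
pub fn i8x16_extract_lane_s(v: V128, lane: usize) -> i32 {
    v[lane] as i8 as i32
}

/// `i8x16.extract_lane_u`: read one lane and zero-extend it to i32.
pub fn i8x16_extract_lane_u(v: V128, lane: usize) -> i32 {
    v[lane] as i32
}

/// `i8x16.replace_lane`: overwrite one lane with the low 8 bits of `x`,
/// leaving all other lanes unchanged.
pub fn i8x16_replace_lane(mut v: V128, lane: usize, x: i32) -> V128 {
    v[lane] = x as u8;
    v
}
```

The lane index is an immediate in WASM, so an out-of-range index is rejected at validation time. In this model it would simply panic on the array access.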

