@rose2221 (Collaborator) commented Feb 12, 2026

This PR implements two major witness-count optimizations for the R1CS compiler targeting SHA256 circuits:

1. Dynamic Bit-Width for Combined AND/XOR Lookup Table

  • Introduces a cost model that searches over candidate atomic widths {2, 4, 8} and selects the width minimizing total witness count (table + decomposition + complementary + overhead)
  • Currently only applies to 8-bit operands; wider operands fall back to w=8
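
As an illustration of the width search described above, here is a minimal sketch. The four cost terms follow the names in the list (table, decomposition, complementary, overhead), but the formulas and the per-item constants (`witnesses_per_table_entry`, `witnesses_per_lookup`, `fixed_overhead`) are assumptions made for the example, not the values used by the compiler in this PR.

```python
def combined_table_cost(num_ops, operand_bits, w,
                        witnesses_per_table_entry=3,   # assumed constant
                        witnesses_per_lookup=2,        # assumed constant
                        fixed_overhead=4):             # assumed constant
    """Hypothetical witness cost of a combined AND/XOR table at atomic width w."""
    chunks = operand_bits // w                      # w-bit digits per operand
    table = witnesses_per_table_entry * (1 << (2 * w))  # one row per (a, b) digit pair
    decomposition = num_ops * 2 * chunks            # digits of both operands
    complementary = num_ops * chunks * witnesses_per_lookup  # per-digit lookup helpers
    return table + decomposition + complementary + fixed_overhead


def optimal_width(num_ops, operand_bits=8, candidates=(2, 4, 8)):
    """Pick the candidate width with the smallest modeled witness count."""
    return min(candidates,
               key=lambda w: combined_table_cost(num_ops, operand_bits, w))


# Few operations favor a small table; many operations amortize a large one.
for n in (10, 1_000, 100_000):
    print(n, optimal_width(n))
```

Under these assumed constants the table term grows as 2^(2w) while the per-operation terms shrink as 8/w, which is the trade-off the search resolves.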

2. Spread Trick for SHA256 Bitwise Operations

Replaces the generic combined AND/XOR table approach for SHA256 with the spread-based technique from Eli Ben-Sasson et al.

Key idea: "spreading" a value by interleaving zeros between bits (0b1011 → 0b01000101) converts bitwise XOR/AND into field addition/multiplication, eliminating per-operation lookup costs.
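
A small sketch of the spread encoding on plain integers may help. The helper names (`spread`, `compact_even`, `compact_odd`) are made up for this example and are not taken from the PR. Adding two spread values leaves the XOR of the inputs in the even bit positions and the AND in the odd positions, because each 2-bit slot holds the sum of two single bits.

```python
def spread(x, bits=8):
    """Interleave zeros: bit i of x moves to bit 2*i of the result."""
    out = 0
    for i in range(bits):
        out |= ((x >> i) & 1) << (2 * i)
    return out


def compact_even(z, bits=8):
    """Inverse of spread: collect the bits at even positions of z."""
    out = 0
    for i in range(bits):
        out |= ((z >> (2 * i)) & 1) << i
    return out


def compact_odd(z, bits=8):
    """Collect the bits at odd positions of z."""
    return compact_even(z >> 1, bits)


a, b = 0b1011, 0b0110
s = spread(a) + spread(b)          # plain integer (field) addition
assert spread(0b1011) == 0b01000101
assert compact_even(s) == a ^ b    # even bits carry XOR
assert compact_odd(s) == a & b     # odd bits carry AND
```

In a circuit the even/odd split would be enforced through lookups into the spread table rather than computed directly; this sketch only checks the arithmetic identity.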

Why ROTR/SHR are free: SHA256 decomposes each 32-bit word into chunks aligned to rotation boundaries. For example, Sigma_0 uses chunks [2, 11, 9, 10] bits exactly matching ROTR2, ROTR13, ROTR22. A rotation is just reading the same chunks in a different order with adjusted coefficients: ROTR2(a) reads chunks starting from index 1 instead of index 0. No new witnesses are allocated — the spread values computed during decomposition are reused with permuted coefficients. Similarly, SHR drops the lowest chunks entirely. The cost of rotation/shift is zero witnesses, zero constraints.
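
The chunk-reordering argument can be checked with a short sketch. It assumes the chunks [2, 11, 9, 10] are taken least-significant first, and the function names are hypothetical. With `stride=1` the recombination works on the raw chunks; with `stride=2` it works on their spread forms, where the coefficients become powers of 4.

```python
WIDTHS = (2, 11, 9, 10)  # Sigma_0 chunk widths, assumed LSB-first


def spread(x):
    """Interleave zeros: bit i of x moves to bit 2*i."""
    out, i = 0, 0
    while x:
        out |= (x & 1) << (2 * i)
        x >>= 1
        i += 1
    return out


def decompose(word, widths=WIDTHS):
    """Split a 32-bit word into chunks of the given widths, LSB first."""
    chunks = []
    for w in widths:
        chunks.append(word & ((1 << w) - 1))
        word >>= w
    return chunks


def recombine(chunks, start, widths=WIDTHS, stride=1, wrap=True):
    """Linear combination of chunks starting at index `start`.

    wrap=True gives ROTR by sum(widths[:start]); wrap=False gives SHR.
    """
    n = len(chunks)
    count = n if wrap else n - start
    out, offset = 0, 0
    for k in range(count):
        i = (start + k) % n
        out += chunks[i] << (stride * offset)  # coefficient 2^offset or 4^offset
        offset += widths[i]
    return out


def rotr32(x, r):
    return ((x >> r) | (x << (32 - r))) & 0xFFFFFFFF


a = 0x6A09E667
chunks = decompose(a)
spread_chunks = [spread(c) for c in chunks]
for start, r in ((1, 2), (2, 13), (3, 22)):
    assert recombine(chunks, start) == rotr32(a, r)
    assert recombine(spread_chunks, start, stride=2) == spread(rotr32(a, r))
assert recombine(chunks, 1, wrap=False) == a >> 2
```

Since each rotated value is only a linear combination of values that already exist, it can be folded into the constraint that consumes it.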

SHA256 operations implemented via spread: sigma_0, sigma_1, Sigma_0, Sigma_1, Ch, Maj, message schedule, compression rounds

  • Uses a fixed 8-bit spread table (256 entries, 768 witnesses for LogUp)
  • Multi-operand u32 addition with a single carry witness per addition
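
The multi-operand addition can be sketched as follows. This is a plain-integer illustration with hypothetical function names, and it assumes the field is large enough that the sum of the operands does not wrap around the modulus. For k operands the carry is at most k - 1, so one small witness suffices instead of one carry per pairwise addition.

```python
MOD = 1 << 32


def add_u32_multi(operands):
    """Return (result, carry) with sum(operands) == result + carry * 2^32."""
    total = sum(operands)
    return total % MOD, total // MOD


def check_addition(operands, result, carry):
    """The relation a circuit would enforce, written over plain integers."""
    return (sum(operands) == result + carry * MOD
            and 0 <= result < MOD             # result is a valid u32
            and 0 <= carry < len(operands))   # carry is at most k - 1


ops = [0xFFFFFFFF, 0xFFFFFFFF, 1, 2, 3]
result, carry = add_u32_multi(ops)
assert (result, carry) == (4, 2)
assert check_addition(ops, result, carry)
```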

Stats

SHA256 (35 compression calls)

| Metric | Before | After (Spread) | Reduction |
|---|---|---|---|
| Constraints | 911,769 | 445,739 | -51.1% |
| Witnesses | 1,588,306 | 1,033,576 | -34.9% |

SHA256 (1 compression call)

| Metric | Before | Spread | Dynamic AND/XOR |
|---|---|---|---|
| Constraints | 86,607 | 14,389 | 44,506 |
| Witnesses | 231,044 | 31,680 | 68,251 |

Known Soundness Limitation (for M31)

Future Work

Dynamic spread table width: Currently fixed at w=8 (256 entries). A dynamic search over {4, 8, 16} could further reduce witnesses for circuits with fewer SHA256 calls.

Shared lookup table: The spread table implicitly range-checks values to [0, 2^w - 1]. Merging with the existing range-check system would eliminate redundant table entries and save additional witnesses.

… display

- Added a new module for SHA256 chunk decompositions with spread-based representation in .
- Introduced functions for computing spread, decomposing packed witnesses, and adding spread table constraints.
- Updated the  function in  to reflect the new spread-based SHA256 cost summary.
- Removed redundant calculations for SHA256 batched constraints and witnesses, simplifying the output.
- Enhanced the display of optimal atomic width for binary operations in the circuit stats.