iso-bpe E8 vs scalar KV quant on Mistral-7B (n=60); 24-cell baseline not a valid HQMQ comparison by jagmarques · Pull Request #33 · jagmarques/nexusquant

jagmarques · 2026-06-15T17:52:07Z

Iso-bpe paired-PPL comparison of KV quantizers on Mistral-7B-v0.1, n=60 segments, Kaggle T4x2.

Clean finding (defensible): calibration-free E8 lattice substantially beats per-coordinate uniform scalar at aggressive budgets. At ~1.125 bpe E8 is +23.1% PPL vs scalar +100.7% (E8 wins 60/60 segments); at ~2.125 bpe they near-tie (E8 +1.34% vs scalar +1.67%). This corroborates the known result that E8 earns its place at low bit budgets.
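A minimal sketch of how a paired per-segment comparison like the one above could be tallied. The function names, the use of per-segment mean NLL in nats, and the ratio-of-means aggregation are assumptions for illustration, not the kernel's actual protocol. The arrays are toy placeholders, not measured results.

```python
import numpy as np

def ppl_delta_pct(nll_base: np.ndarray, nll_quant: np.ndarray) -> float:
    """Relative PPL increase (%) of a quantized run over the fp baseline.

    Inputs are per-segment mean NLL (nats/token) on the SAME segments in the
    same order. Aggregation (ratio of mean per-segment PPL) is an assumption.
    """
    return 100.0 * (np.exp(nll_quant).mean() / np.exp(nll_base).mean() - 1.0)

def paired_wins(nll_a: np.ndarray, nll_b: np.ndarray) -> int:
    """Number of segments on which quantizer A has lower NLL than quantizer B."""
    return int(np.sum(nll_a < nll_b))

# Toy placeholder values only; real inputs would be 60 per-segment NLLs.
base = np.array([2.0, 2.1, 1.9])
quant_a = base + 0.1   # hypothetical quantizer A
quant_b = base + 0.5   # hypothetical quantizer B

delta_a = ppl_delta_pct(base, quant_a)
delta_b = ppl_delta_pct(base, quant_b)
wins_a = paired_wins(quant_a, quant_b)
print(f"A: +{delta_a:.2f}%  B: +{delta_b:.2f}%  A wins {wins_a}/{len(base)}")
```

Pairing on identical segments is what makes a "wins k/n" count meaningful: each segment acts as its own control, so segment difficulty cancels out of the comparison.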

NOT a valid HQMQ / Hurwitz comparison. The kernel's 24-cell path is a naive single 24-vertex direction snap, not the published HQMQ, which uses a product direction code (24 Hurwitz vertices times a secondary codebook = 24S codewords) plus median-multiplier outlier extraction, evaluated at 3.79-5 bpe. Our baseline is feature-stripped and below that budget band, so no E8-vs-HQMQ claim is made. A faithful HQMQ comparison at its published bpe is left as an open task.

Built and CPU-smoke-tested; GPU result above is from the secondary Kaggle account. Draft pending the scalar-finding review.

Kernel: experiments/kaggle/nq_iso_e8_24cell/

- Tests the Hurwitz claim that 24-cell outperforms E8 at matched bpe.
- Three calibration-free quantizers (same Hadamard + RoPE pipeline): E8 (8-dim blocks), 24-cell Hurwitz quaternion (4-dim blocks), scalar.
- Explicit bpe accounting per project rule: E8 1-bit = 1.125, E8 2-bit = 2.125, 24-cell 1-round = 1.2712, 24-cell 2-round = 2.4175, Scalar 2-lev = 1.125, Scalar 4-lev = 2.125.
- 24-cell given MORE bits at every comparison point (fairest test).
- AQUA-iso paired-chunk PPL protocol on wikitext-2-raw-v1 (NF4 weights).
- CPU smoke test exits 0 (verified python3.11).
- GPU run queued on next Kaggle T4x2 quota window.
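The bpe figures listed above can be reproduced with a few lines of arithmetic. This is a sketch, not the kernel's `_bpe_info`: it assumes one fp16 scale per 128-dim head as the only side information, that "E8 k-bit" means k index bits per dimension, and that each 24-cell round costs log2(24) bits per 4-dim block. The helper name `bpe` is hypothetical.

```python
import math

HEAD_DIM = 128    # elements sharing one scale (assumed: one 128-dim head)
SCALE_BITS = 16   # one fp16 scale per head

def bpe(index_bits_per_block: float, block_dim: int, rounds: int = 1) -> float:
    """Bits per element: index bits amortised over the block, plus scale overhead."""
    return rounds * index_bits_per_block / block_dim + SCALE_BITS / HEAD_DIM

budgets = {
    "E8 1-bit":        bpe(8 * 1, 8),                      # assumed 1 bit per dimension
    "E8 2-bit":        bpe(8 * 2, 8),                      # assumed 2 bits per dimension
    "24-cell 1-round": bpe(math.log2(24), 4, rounds=1),    # 24 vertices per 4-dim block
    "24-cell 2-round": bpe(math.log2(24), 4, rounds=2),
    "Scalar 2-lev":    bpe(math.log2(2), 1),
    "Scalar 4-lev":    bpe(math.log2(4), 1),
}

for name, value in budgets.items():
    print(f"{name}: {value:.4f} bpe")
```

Under these assumptions the scale contributes 16/128 = 0.125 bpe to every quantizer, which is why the E8 and scalar budgets coincide while the 24-cell budgets land slightly above them.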

`cell24_nearest_point` was scaling each 4-dim block back by its L2 norm (`nearest_unit * norms`), giving the 24-cell 32 free fp32 magnitudes per 128-dim head. Those magnitudes were never charged in `_bpe_info`, silently biasing distortion down vs E8.

Fix: return only the nearest unit vertex; the caller's per-head scale `sc` handles dequant, exactly matching E8 accounting. Both quantizers now carry one fp16 scale per head (0.125 bpe) + lattice indices, and nothing else. Smoke test (exit 0) confirms the fix is non-vacuous.

New MSE shows 24-cell losing to E8 even at its higher bpe budget:

- Low budget (E8 1.125 bpe vs 24cell 1.271 bpe): E8 MSE=0.5947 < 24cell MSE=0.6302
- Med budget (E8 2.125 bpe vs 24cell 2.417 bpe): E8 MSE=0.1450 < 24cell MSE=0.6206
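To make the accounting leak concrete, here is a minimal NumPy sketch of a direction-only 24-cell snap. It is not the kernel's `cell24_nearest_point`; the names `hurwitz_units` and `snap_to_vertex` are hypothetical, and the input is random data standing in for one 128-dim head.

```python
import itertools
import numpy as np

HEAD_DIM = 128
BLOCK_DIM = 4

def hurwitz_units() -> np.ndarray:
    """The 24 unit Hurwitz quaternions (vertices of the 24-cell), shape (24, 4)."""
    axis = np.concatenate([np.eye(4), -np.eye(4)])                    # 8 vertices: +/- e_i
    half = np.array(list(itertools.product([0.5, -0.5], repeat=4)))   # 16 vertices: (+/-1/2, ...)
    return np.concatenate([axis, half])

def snap_to_vertex(blocks: np.ndarray):
    """Nearest unit vertex for each 4-dim block; returns (indices, unit vertices)."""
    verts = hurwitz_units()
    # Every vertex has unit norm, so nearest in L2 is the largest dot product.
    idx = np.argmax(blocks @ verts.T, axis=-1)
    return idx, verts[idx]

rng = np.random.default_rng(0)
head = rng.standard_normal(HEAD_DIM)            # stand-in for one KV head
idx, unit = snap_to_vertex(head.reshape(-1, BLOCK_DIM))

# Side information the old path carried without charging it:
n_blocks = HEAD_DIM // BLOCK_DIM                # per-block norms kept
leaked_bpe = n_blocks * 32 / HEAD_DIM           # fp32 magnitudes, amortised per element
print(n_blocks, leaked_bpe)
```

Returning `unit` alone, with any rescaling left to a single per-head scale, keeps the side information at 16 bits per head. Multiplying by per-block norms instead would add 32 fp32 values per head, or 8 bits per element, which dwarfs the nominal index budget.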

…l-7B-v0.1 NousResearch/Mistral-7B-v0.1 does not exist on HF; mistralai/Mistral-7B-v0.1 is confirmed ungated (gated=False via HF API) and downloads anonymously on jagmardrop.

…ode)

Two bugs fixed per critic review:

(a) Direction-only was wrong: old code returned the nearest unit Hurwitz vertex and discarded the per-block magnitude entirely, making it a direction-only code, not HQMQ. Now separates `r = ||x||` (magnitude, `br` uniform bits per block) and `u = x/r` (direction, nearest of 24 Hurwitz quaternions), reconstructing `x_hat = r_q * u_q` per Swain et al. arXiv:2605.27646 Section 3.

(b) Broken 2-round residual dropped: the residual after round-1 has smaller norm than a unit vertex, so adding another unit-norm codeword overshoots (59/60 segments worse). HQMQ is single-shot by construction; dropped the loop.

bpe now honestly charges magnitude bits:

- HQMQ_br2: (log2(24)+2)/4 + 16/128 = 1.771 bpe
- HQMQ_br3: (log2(24)+3)/4 + 16/128 = 2.021 bpe

Both sit below the E8 budget point directly above them, so E8 still wins if it beats HQMQ on PPL at higher budget. Smoke test updated: checks magnitude preservation, confirms old residual bug, 7/7 checks pass (exit 0).
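The magnitude/direction split and its bpe charge described above can be sketched as follows. This is an illustration under stated assumptions, not the kernel code or the published HQMQ: the uniform magnitude grid on [0, r_max], the use of the largest block norm (stored in fp16) as the per-head scale, and the names `hqmq_like` and `hqmq_bpe` are all hypothetical.

```python
import itertools
import math
import numpy as np

# The 24 unit Hurwitz quaternions: 8 axis vertices plus 16 half-integer vertices.
VERTS = np.concatenate([
    np.eye(4), -np.eye(4),
    np.array(list(itertools.product([0.5, -0.5], repeat=4))),
])

def hqmq_like(head: np.ndarray, br: int) -> np.ndarray:
    """Single-shot magnitude x direction quantizer on 4-dim blocks of one head."""
    blocks = head.reshape(-1, 4)
    r = np.linalg.norm(blocks, axis=-1)                   # magnitude r = ||x||
    u_idx = np.argmax(blocks @ VERTS.T, axis=-1)          # direction: nearest unit vertex
    # Assumed per-head scale: largest block norm, stored in fp16 (nonzero input expected).
    r_max = float(np.float16(r.max()))
    levels = 2 ** br - 1
    r_code = np.clip(np.rint(r / r_max * levels), 0, levels)   # br uniform bits per block
    r_q = r_code / levels * r_max
    return (r_q[:, None] * VERTS[u_idx]).reshape(head.shape)   # x_hat = r_q * u_q

def hqmq_bpe(br: int, head_dim: int = 128) -> float:
    """(log2(24) direction bits + br magnitude bits) per 4 dims, plus one fp16 scale per head."""
    return (math.log2(24) + br) / 4 + 16 / head_dim

rng = np.random.default_rng(0)
head = rng.standard_normal(128)                 # stand-in for one KV head
for br in (2, 3):
    x_hat = hqmq_like(head, br)
    mse = float(np.mean((head - x_hat) ** 2))
    print(f"br={br}: {hqmq_bpe(br):.3f} bpe, toy MSE={mse:.4f}")
```

Because magnitude and direction are coded once each, there is no residual loop to overshoot, and every stored bit appears in `hqmq_bpe`. The MSE printed here is on random Gaussian data and says nothing about the PPL results in this PR.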

jagmarques added 4 commits June 15, 2026 19:51

Fix model id: switch NousResearch/Mistral-7B-v0.1 to mistralai/Mistra…

6f1ff8d


jagmarques changed the title ~~C15(b) kernel: E8 vs 24-cell (Hurwitz HQMQ) vs scalar at iso-bpe~~ iso-bpe E8 vs scalar KV quant on Mistral-7B (n=60); 24-cell baseline not a valid HQMQ comparison Jun 15, 2026

jagmarques wants to merge 4 commits into main from company/iso-bpe-e8-vs-24cell
