
iso-bpe E8 vs scalar KV quant on Mistral-7B (n=60); 24-cell baseline not a valid HQMQ comparison#33

Draft
jagmarques wants to merge 4 commits into main from company/iso-bpe-e8-vs-24cell


@jagmarques jagmarques commented Jun 15, 2026


Iso-bpe paired-PPL comparison of KV quantizers on Mistral-7B-v0.1, n=60 segments, Kaggle T4x2.

Clean finding (defensible): the calibration-free E8 lattice substantially beats per-coordinate uniform scalar quantization at aggressive budgets. At ~1.125 bpe, E8 raises PPL by 23.1% versus 100.7% for scalar (E8 wins on 60/60 segments); at ~2.125 bpe they nearly tie (E8 +1.34% vs scalar +1.67%). This corroborates the known result that E8 earns its place at low bit budgets.
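
For readers unfamiliar with paired comparisons, here is a minimal sketch of how per-segment results like the ones above could be summarized. It is not the kernel's code: the function name `paired_ppl_report`, the equal-length-segment assumption, and the choice to aggregate as exp of the mean NLL are all assumptions, and the numbers are toy values, not measurements from this PR.

```python
import numpy as np

def paired_ppl_report(nll_ref, nll_a, nll_b):
    """Compare two quantizers on the same segments.

    Each argument is a sequence of mean per-token NLL values, one per
    segment, in the same segment order (the shared order is what makes
    the comparison paired).
    """
    ref, a, b = (np.asarray(x, dtype=np.float64) for x in (nll_ref, nll_a, nll_b))
    ppl_ref = np.exp(ref.mean())  # assumes equal-length segments
    return {
        "a_ppl_increase_pct": 100.0 * (np.exp(a.mean()) / ppl_ref - 1.0),
        "b_ppl_increase_pct": 100.0 * (np.exp(b.mean()) / ppl_ref - 1.0),
        "a_wins": int(np.sum(a < b)),  # segments where A has lower NLL than B
        "n_segments": int(ref.size),
    }

# Toy numbers for illustration only; not measurements from this PR.
report = paired_ppl_report(
    nll_ref=[2.0, 2.1, 1.9],
    nll_a=[2.1, 2.2, 2.0],
    nll_b=[2.5, 2.6, 2.4],
)
print(report)
```

Counting wins per segment alongside the aggregate increase guards against a result driven by a handful of outlier segments.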

NOT a valid HQMQ / Hurwitz comparison. The kernel's 24-cell path is a naive single snap to the nearest of 24 vertex directions, not the published HQMQ, which uses a product direction code (24 Hurwitz vertices times a secondary codebook of size S, giving 24S codewords) plus median-multiplier outlier extraction, and is evaluated at 3.79-5 bpe. Our baseline is feature-stripped and sits below that budget band, so no E8-vs-HQMQ claim is made. A faithful HQMQ comparison at its published bpe remains an open task.

Built and CPU-smoke-tested; GPU result above is from the secondary Kaggle account. Draft pending the scalar-finding review.

Kernel: experiments/kaggle/nq_iso_e8_24cell/
- Tests the Hurwitz claim that 24-cell outperforms E8 at matched bpe.
- Three calibration-free quantizers (same Hadamard + RoPE pipeline):
    E8 (8-dim blocks), 24-cell Hurwitz quaternion (4-dim blocks), scalar.
- Explicit bpe accounting per project rule:
    E8 1-bit=1.125, E8 2-bit=2.125, 24-cell 1-round=1.2712,
    24-cell 2-round=2.4175, Scalar 2-lev=1.125, Scalar 4-lev=2.125.
- 24-cell given MORE bits at every comparison point (fairest test).
- AQUA-iso paired-chunk PPL protocol on wikitext-2-raw-v1 (NF4 weights).
- CPU smoke test exits 0 (verified python3.11).
- GPU run queued on next Kaggle T4x2 quota window.
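
The bpe figures listed above can be reproduced with a few lines of arithmetic. This is a sketch, not the kernel's `_bpe_info`: the helper name `bpe` and its parameters are hypothetical, and it assumes one fp16 scale per 128-dim head as the only side information.

```python
import math

def bpe(index_bits_per_block: float, block_dim: int,
        scale_bits: int = 16, head_dim: int = 128) -> float:
    """Bits per element: index bits amortised over the block,
    plus one per-head scale amortised over the head."""
    return index_bits_per_block / block_dim + scale_bits / head_dim

# E8 at 1 and 2 bits per coordinate (8-dim blocks).
e8_1bit = bpe(8 * 1, 8)
e8_2bit = bpe(8 * 2, 8)
# 24-cell: log2(24) index bits per 4-dim block, per round.
cell24_1round = bpe(1 * math.log2(24), 4)
cell24_2round = bpe(2 * math.log2(24), 4)
# Uniform scalar: log2(levels) bits per coordinate.
scalar_2lev = bpe(math.log2(2), 1)
scalar_4lev = bpe(math.log2(4), 1)

for name, value in [("E8 1-bit", e8_1bit), ("E8 2-bit", e8_2bit),
                    ("24-cell 1-round", cell24_1round),
                    ("24-cell 2-round", cell24_2round),
                    ("Scalar 2-lev", scalar_2lev), ("Scalar 4-lev", scalar_4lev)]:
    print(f"{name}: {value:.4f} bpe")
```

Any extra stored quantity (for example a per-block magnitude) has to be added to `index_bits_per_block`, otherwise the comparison is no longer iso-bpe.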

cell24_nearest_point was scaling each 4-dim block back by its L2 norm
(nearest_unit * norms), giving the 24-cell path 32 free fp32 magnitudes per
128-dim head. Those magnitudes were never charged in _bpe_info, silently
biasing its distortion down relative to E8.

Fix: return only the nearest unit vertex; the caller's per-head scale sc
handles dequant, exactly matching E8 accounting. Both quantizers now carry
one fp16 scale per head (0.125 bpe) + lattice indices, and nothing else.
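
A minimal sketch of a direction-only snap of the kind the fix describes, assuming NumPy. The function names `hurwitz_units` and `nearest_unit_vertex` and the way `sc` is chosen are hypothetical and are not taken from the kernel; only the idea of returning the unit vertex and leaving scaling to a single per-head scale comes from the text above.

```python
import itertools
import numpy as np

def hurwitz_units() -> np.ndarray:
    """The 24 unit Hurwitz quaternions (vertices of the 24-cell), shape (24, 4)."""
    axis = [s * np.eye(4)[i] for i in range(4) for s in (1.0, -1.0)]
    half = [0.5 * np.array(sg) for sg in itertools.product((1.0, -1.0), repeat=4)]
    return np.stack(axis + half)

def nearest_unit_vertex(blocks: np.ndarray, verts: np.ndarray):
    """For each 4-dim block, pick the vertex with the largest dot product.

    All vertices have unit norm, so this is also the nearest vertex in L2.
    Only the index and the unit vertex are returned; no per-block norm.
    """
    idx = np.argmax(blocks @ verts.T, axis=-1)
    return idx, verts[idx]

verts = hurwitz_units()
rng = np.random.default_rng(0)
blocks = rng.standard_normal(128).reshape(32, 4)   # one 128-dim head
idx, units = nearest_unit_vertex(blocks, verts)

# Hypothetical per-head scale choice; the kernel's actual rule may differ.
sc = np.float16(np.linalg.norm(blocks, axis=-1).mean())
dequant = float(sc) * units
```

Stored state per head is then 32 vertex indices plus one fp16 scale, which is what the bpe accounting charges.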

Smoke test (exit 0) confirms the fix is non-vacuous. New MSE shows 24-cell
losing to E8 even at its higher bpe budget:
  Low budget  (E8 1.125 bpe vs 24cell 1.271 bpe): E8 MSE=0.5947 < 24cell MSE=0.6302
  Med budget  (E8 2.125 bpe vs 24cell 2.417 bpe): E8 MSE=0.1450 < 24cell MSE=0.6206
…l-7B-v0.1

NousResearch/Mistral-7B-v0.1 does not exist on HF; mistralai/Mistral-7B-v0.1 is
confirmed ungated (gated=False via HF API) and downloads anonymously on jagmardrop.
…ode)

Two bugs fixed per critic review:

(a) Direction-only was wrong: old code returned the nearest unit Hurwitz vertex
    and discarded the per-block magnitude entirely, making it a direction-only
    code, not HQMQ. Now separates r=||x|| (magnitude, br uniform bits per block)
    and u=x/r (direction, nearest of 24 Hurwitz quaternions), reconstructing
    x_hat = r_q * u_q per Swain et al. arXiv:2605.27646 Section 3.

(b) Broken 2-round residual dropped: the residual after round-1 has smaller norm
    than a unit vertex, so adding another unit-norm codeword overshoots (59/60
    segments worse). HQMQ is single-shot by construction; dropped the loop.
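
The magnitude/direction split in (a) can be sketched as follows, assuming NumPy. This is an illustration, not the kernel's code: `quantize_mag_dir`, `hqmq_bpe`, the bin-midpoint uniform quantizer, and the use of the largest block norm (rounded to fp16) as the magnitude range `r_max` are all assumptions.

```python
import itertools
import math
import numpy as np

def hurwitz_units() -> np.ndarray:
    """The 24 unit Hurwitz quaternions, shape (24, 4)."""
    axis = [s * np.eye(4)[i] for i in range(4) for s in (1.0, -1.0)]
    half = [0.5 * np.array(sg) for sg in itertools.product((1.0, -1.0), repeat=4)]
    return np.stack(axis + half)

def quantize_mag_dir(blocks, verts, br, r_max):
    """Quantize 4-dim blocks as x_hat = r_q * u_q.

    blocks: (n, 4) array; br: magnitude bits per block;
    r_max: top of the uniform magnitude range (hypothetical per-head scale).
    """
    r = np.linalg.norm(blocks, axis=-1)
    levels = 2 ** br
    step = r_max / levels
    code = np.clip(np.floor(r / step), 0, levels - 1).astype(np.int64)
    r_q = (code + 0.5) * step                    # bin midpoints
    idx = np.argmax(blocks @ verts.T, axis=-1)   # nearest unit vertex
    return code, idx, r_q, r_q[:, None] * verts[idx]

def hqmq_bpe(br, block_dim=4, scale_bits=16, head_dim=128):
    """Direction index + magnitude bits per block, plus one fp16 per head."""
    return (math.log2(24) + br) / block_dim + scale_bits / head_dim

rng = np.random.default_rng(0)
blocks = rng.standard_normal(128).reshape(32, 4)   # one 128-dim head
r_max = float(np.float16(np.linalg.norm(blocks, axis=-1).max()))
code, idx, r_q, x_hat = quantize_mag_dir(blocks, hurwitz_units(), br=2, r_max=r_max)
print(f"bpe at br=2: {hqmq_bpe(2):.3f}, at br=3: {hqmq_bpe(3):.3f}")
```

Because every reconstructed block has norm exactly `r_q`, a magnitude-preservation check of the kind the smoke test mentions reduces to comparing `np.linalg.norm(x_hat, axis=-1)` against `r_q`.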

bpe now honestly charges magnitude bits:
  HQMQ_br2: (log2(24)+2)/4 + 16/128 = 1.771 bpe
  HQMQ_br3: (log2(24)+3)/4 + 16/128 = 2.021 bpe
Both sit below the E8 budget point directly above them, so E8 still wins if it
beats HQMQ on PPL at higher budget.

Smoke test updated: checks magnitude preservation, confirms old residual bug,
7/7 checks pass (exit 0).
@jagmarques changed the title from "C15(b) kernel: E8 vs 24-cell (Hurwitz HQMQ) vs scalar at iso-bpe" to "iso-bpe E8 vs scalar KV quant on Mistral-7B (n=60); 24-cell baseline not a valid HQMQ comparison" on Jun 15, 2026