Med3x sub-2-bit NIAH rescue test on Yi-6B-Chat#38

Draft
jagmarques wants to merge 1 commit into main from company/e8-med3x-yi

@jagmarques
Owner

What

New Kaggle kernel testing whether Med3x outlier extraction rescues the
sub-2-bit NIAH collapse on Yi-6B-Chat. Prior run: K2V1=6/40, K1V2=0/40,
K1V1=1/40. Hypothesis: storing magnitude-outlier token rows
(norm > 3x median) as fp16 restores recall.
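
A minimal sketch of the outlier rule stated above, assuming a KV tensor flattened to shape (tokens, dim) and assuming "median" means the median of the per-row L2 norms. The name `med3x_split`, the synthetic data, and the planted row index are illustrative and are not taken from the kernel.

```python
import numpy as np

def med3x_split(x, c=3.0):
    """Flag token rows whose L2 norm exceeds c times the median row norm.

    x: array of shape (tokens, dim). Returns (mask, p), where mask marks
    the outlier rows and p is the fraction of rows flagged.
    """
    norms = np.linalg.norm(x, axis=1)
    threshold = c * np.median(norms)
    mask = norms > threshold
    return mask, float(mask.mean())

# Synthetic example: 64 token rows, one planted large-magnitude row.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 128)).astype(np.float32)
x[22] *= 10.0  # hypothetical outlier row

mask, p = med3x_split(x)

# Outlier rows are kept as fp16; the remaining rows would go to the quantizer.
outliers_fp16 = x[mask].astype(np.float16)
rel_err = (np.linalg.norm(outliers_fp16.astype(np.float32) - x[mask])
           / np.linalg.norm(x[mask]))
print(np.flatnonzero(mask), p, rel_err)
```

The fraction `p` returned here is the quantity the bpe accounting needs at runtime.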

Configs (nominal bpe before runtime overhead)

  • FP16 baseline (must be non-degenerate at 4K)
  • E8-K2V1 (1.8125 bpe) -- reproduces sub-2-bit collapse
  • E8-K2V1 + Med3x -- rescue test
  • E8-K1V2 (1.8125 bpe) -- worst collapse in prior run
  • E8-K1V2 + Med3x -- rescue test
  • E8-K3V2 (1.9375 bpe) -- near-lossless reference

Med3x bpe charged honestly: b+ = (1 - p) * b_E8 + p * 16 + 0.25 (p measured at runtime).
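
The accounting above can be sketched as a small helper. The function name `med3x_bpe` and the example value p = 0.01 are assumptions for illustration; the real p is measured at runtime and is not known yet.

```python
def med3x_bpe(b_e8, p, outlier_bits=16.0, overhead=0.25):
    """Effective bits per element when a fraction p of rows is kept at fp16.

    Implements b+ = (1 - p) * b_E8 + p * 16 + 0.25 from the description.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction in [0, 1]")
    return (1.0 - p) * b_e8 + p * outlier_bits + overhead

# K2V1 and K1V2 are both listed at 1.8125 bpe nominal.
no_outliers = med3x_bpe(1.8125, 0.0)   # fixed overhead only
one_percent = med3x_bpe(1.8125, 0.01)  # hypothetical p, not a measured value
print(no_outliers, one_percent)
```

Under this formula the 0.25 term alone puts the Med3x variants of the 1.8125 bpe configs at 2.0625 bpe before any outlier rows are counted, so the rescue configs should be compared at their charged bpe rather than the nominal one.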

Protocol

  • NIAH: 4K context, chat-template, n=20 trials per depth, verbatim value match
  • PPL: paired-chunk wikitext-2 n>=40 segments, NF4 weights
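
A rough sketch of how verbatim value matching could be scored per depth, assuming each trial records the needle depth, the generated text, and the needle value. The names `niah_hit` and `recall_by_depth` and the sample trials are hypothetical; the kernel's actual scoring code may differ.

```python
from collections import defaultdict

def niah_hit(generated, needle_value):
    """A hit requires the needle value itself to appear verbatim.

    Repeating the needle's key phrase without the value does not count.
    """
    return needle_value in generated

def recall_by_depth(trials):
    """trials: iterable of (depth, generated, needle_value) tuples.

    Returns {depth: (hits, total)}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for depth, generated, value in trials:
        totals[depth] += 1
        hits[depth] += int(niah_hit(generated, value))
    return {d: (hits[d], totals[d]) for d in totals}

# Made-up trials for illustration only.
trials = [
    (0.25, "The secret code is 7481.", "7481"),
    (0.25, "The secret code is", "7481"),  # key repeated, value missing
    (0.75, "7481", "7481"),
]
scores = recall_by_depth(trials)
print(scores)
```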

Status

GPU result pending. No result claimed.

Proof of work

smoke test: exit 0
(a) Med3x outlier detection ... OK (row 22 flagged; fp16 rel_err=2.75e-04)
(b) Sign-flip invertible ... OK (max_diff=0.00e+00)
(c) NIAH needle recall (verbatim substring, not key trap) ... OK
(d) bpe formulas for K2V1, K1V2, K3V2 ... OK (K2V1=1.8125, K1V2=1.8125, K3V2=1.9375 bpe)
(e) E8 round-trip ... SKIP (nexusquant not installed locally)
All smoke tests PASSED.

kaggle kernels push: Kernel version 1 successfully pushed.
URL: https://www.kaggle.com/code/jooandrgomesmarques/nq-e8-med3x-yi

git diff --stat: 3 files changed, 852 insertions(+)
branch: company/e8-med3x-yi sha: 23dc020

Tests whether Med3x outlier extraction (C=3) can rescue the sub-2-bit
NIAH collapse on Yi-6B-Chat (K2V1=6/40, K1V2=0/40 in prior run).
Configs: FP16, E8-K2V1, E8-K2V1+Med3x, E8-K1V2, E8-K1V2+Med3x, E8-K3V2.
NIAH at 4K context with chat-template, n=20 trials; paired PPL n>=40.
Smoke test passes (exit 0). GPU result pending.