
Discriminative diverse-text rescue test (Mistral-7B 4K, FP16-failing operating point, n=40)#31

Draft
jagmarques wants to merge 2 commits into main from company/gap-mistral-diverse-rescue

@jagmarques

Owner

Summary

  • Adds experiments/kaggle/nq_mistral_diverse_rescue/ (2 files): kernel .py + kernel-metadata.json
  • Discriminative NIAH harness using a completion prompt on a wikitext haystack (no chat template) -- the operating point where FP16 partially fails at 4K context
  • Tests FP16, K2V2_pb0, K3V2_pb0 on Mistral-7B-Instruct-v0.3, n=40 trials, ctx=4096
  • Primary model: NousResearch/Mistral-7B-Instruct-v0.3 (ungated mirror, no token needed)
  • The quantization code (quantize_kv / get_kv / set_kv / to_dyn / device handling) is byte-identical to that in nq_yi_rescue.py (the proven reference); random.Random(SEED) is reseeded per config so that needle pairing is label-independent
  • Emits a non_discriminative error key if FP16 scores 0/40 or 40/40 (degeneration or saturation detected)
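
The per-config reseeding and the saturation guard described above can be sketched as follows. This is a minimal illustration, not the kernel's actual code: the SEED value, the needle format (a five-digit number plus an insertion depth), and the helper names make_needles and summarize are all assumptions made for the example.

```python
import random

SEED = 1234      # hypothetical value; the PR does not state the seed
N_TRIALS = 40    # n=40 trials, as in the PR
CONFIGS = ["FP16", "K2V2_pb0", "K3V2_pb0"]


def make_needles(seed: int, n_trials: int) -> list[tuple[int, float]]:
    """Draw (secret number, insertion depth) pairs from a fresh RNG.

    The needle format is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # fresh generator, so every config gets the same draws
    return [(rng.randint(10000, 99999), rng.random()) for _ in range(n_trials)]


def summarize(fp16_hits: int, n_trials: int) -> dict:
    """Flag runs where the FP16 baseline is fully degenerate or saturated."""
    result = {"fp16_hits": fp16_hits, "n": n_trials}
    if fp16_hits in (0, n_trials):
        # 0 hits: baseline degenerate; n hits: baseline saturated.
        result["error"] = "non_discriminative"
    return result


# Reseeding inside the loop means the needles do not depend on config order or label.
needles_by_config = {name: make_needles(SEED, N_TRIALS) for name in CONFIGS}
```

Because each config starts from a fresh random.Random(SEED), trial i uses the same needle under FP16, K2V2_pb0, and K3V2_pb0, so per-trial outcomes can be compared as pairs.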

Proof of work

VERIFY-WITH: kaggle kernels push -p experiments/kaggle/nq_mistral_diverse_rescue

OUTPUT:
Kernel push error: Maximum weekly GPU quota of 30.00 hours reached.

BLOCKED-KAGGLE: GPU quota exhausted (30h/week ceiling hit by earlier kernels today).
Files are built, ast-parsed (AST OK), secret-scanned (SCANNER: gitleaks - clean, exit 0).
Kernel slug: jagmarques/nq-mistral-diverse-rescue (ready to push when quota resets Monday).

AST check: python3.11 -c "import ast; ast.parse(open('nq_mistral_diverse_rescue.py').read())" -> AST OK
Secret scan: SCANNER: gitleaks - clean, exit 0

Diff stat:
 experiments/kaggle/nq_mistral_diverse_rescue/kernel-metadata.json    |  20 +
 experiments/kaggle/nq_mistral_diverse_rescue/nq_mistral_diverse_rescue.py | 466 +++++++++++++++++++++
 2 files changed, 486 insertions(+)

… template)

Discriminative operating-point harness: a completion prompt on a wikitext haystack
creates a partial FP16 failure band. Power-tests K2V2/K3V2 pb=0 rescue at n=40.
…3 primary

NousResearch/Mistral-7B-Instruct-v0.3 does not exist; primary is now
unsloth/mistral-7b-instruct-v0.3 (ungated, files confirmed via HF API).
Fallback is mistralai/Mistral-7B-Instruct-v0.3 (also ungated). HF_TOKEN
is now optional (used if present; neither model requires it).
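
One way to express the primary/fallback order with an optional token is sketched below. The helper names load_first_available and optional_token are hypothetical, and the loader is passed in as a callable so the sketch stays independent of any particular library; the kernel's real loading code may look different.

```python
import os
from typing import Any, Callable, Optional

# Primary first, then fallback, as listed in the commit message.
CANDIDATES = [
    "unsloth/mistral-7b-instruct-v0.3",
    "mistralai/Mistral-7B-Instruct-v0.3",
]


def optional_token() -> Optional[str]:
    """Return HF_TOKEN if it is set and non-empty, else None."""
    return os.environ.get("HF_TOKEN") or None


def load_first_available(
    candidates: list[str],
    loader: Callable[[str, Optional[str]], Any],
    token: Optional[str] = None,
) -> tuple[str, Any]:
    """Try each repo id in order and return the first one that loads."""
    errors = {}
    for repo_id in candidates:
        try:
            return repo_id, loader(repo_id, token)
        except Exception as exc:  # record the failure and try the next candidate
            errors[repo_id] = repr(exc)
    raise RuntimeError(f"no candidate could be loaded: {errors}")
```

In a real run the loader would wrap whatever model-loading call the kernel uses, invoked as load_first_available(CANDIDATES, loader, optional_token()).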