Qwen3-8B RULER/multi-needle NIAH for Hurwitz head-to-head #34

Draft
jagmarques wants to merge 1 commit into main from company/qwen3-ruler-h2h

@jagmarques

Owner

Qwen3-8B RULER/multi-needle head-to-head for Hurwitz comparability. The kernel is built and pushed to the secondary account; the GPU result is pending.

Kernel: jagmardrop/nq-qwen3-ruler-h2h at https://www.kaggle.com/code/jagmardrop/nq-qwen3-ruler-h2h

Each trial inserts 8 needles at spread depths and queries one of them, with n=20 trials per config per context length (4K; 8K if GPU budget allows). Configs: FP16, K3V2_pb0, K4V2_pb0. Recall is scored with a value-specific regex to avoid false positives from key substrings. A run flags itself as non-discriminative when FP16 saturates or hits zero.
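
As a rough illustration of the setup described above (not the kernel's actual code), a minimal Python sketch of needle insertion and value-specific recall scoring might look like this. The names `build_haystack` and `recalled`, the needle sentence template, and the evenly spaced depth rule are assumptions made for the example.

```python
import re


def build_haystack(filler_words, needles, n_words):
    """Return filler text with each (key, value) needle inserted at a spread depth.

    Hypothetical helper: depths are evenly spaced at 1/(n+1), 2/(n+1), ...
    of the context, which is only one possible reading of "spread depths".
    """
    words = [filler_words[i % len(filler_words)] for i in range(n_words)]
    count = len(needles)
    positions = [int(n_words * (i + 1) / (count + 1)) for i in range(count)]
    # Insert from the deepest position backwards so earlier indices do not shift.
    for pos, (key, val) in reversed(list(zip(positions, needles))):
        words.insert(pos, f"The special magic number for {key} is {val}.")
    return " ".join(words)


def recalled(answer, value):
    """True only if `value` appears as a standalone number in `answer`.

    The digit lookarounds stop a value such as 704 from matching inside
    a longer number, for example a key that happens to contain "704".
    """
    pattern = rf"(?<!\d){re.escape(str(value))}(?!\d)"
    return re.search(pattern, answer) is not None
```

A plain substring check would count an answer that merely echoes a key containing the value's digits, which is the failure the lookarounds guard against.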

Proof of work:

  • Smoke test: python3.11 smoke_test.py -> exit 0 (OK: 8 needles inserted, target key=2867825 val=704, recall regex correct)
  • Secret scan: node secret-scan.js --worktree <worktree> -> SCANNER: gitleaks - clean (exit 0)
  • Kaggle push: kaggle kernels push -> Kernel version 1 successfully pushed (https://www.kaggle.com/code/jagmardrop/nq-qwen3-ruler-h2h)
  • Diff stat: 3 files changed, 565 insertions(+)

8-needle RULER-style sweep at 4K (8K if GPU budget allows), n=20 trials,
FP16/K3V2_pb0/K4V2_pb0. Includes smoke test and discriminativeness flags.
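
The discriminativeness flag could be sketched as below. This is a minimal sketch under assumptions: the function name `summarize`, the dictionary layout, and the reading of "saturates" as 100% FP16 recall are illustrative and are not taken from the kernel.

```python
def summarize(hits_by_config, n_trials, baseline="FP16"):
    """Compute per-config recall and flag runs where the baseline cannot discriminate.

    Hypothetical helper: if the FP16 baseline recalls every needle or none,
    differences between quantized configs say little about quantization itself.
    """
    recall = {cfg: hits / n_trials for cfg, hits in hits_by_config.items()}
    base_hits = hits_by_config[baseline]
    return {
        "recall": recall,
        # Compare integer hit counts to avoid floating-point equality checks.
        "non_discriminative": base_hits in (0, n_trials),
    }
```

Such a result would be read alongside the per-config recall so that a flagged run is not mistaken for evidence that the quantized configs match FP16.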