Record: Sliding Window + FP16 Embed + 10L + Muon WD + Overtone Init (val_bpb=1.1748)#60

Merged: 0hq merged 6 commits into openai:main from notapplica:submission/ntk-eval-overtone-init on Mar 19, 2026


notapplica (Contributor) commented Mar 19, 2026

Summary

Mean val_bpb: 1.1748 (3 seeds, p<0.001)

Stacks 6 orthogonal improvements over the baseline:

  1. Sliding window evaluation (stride=64, seq_len=1024): every token is scored with 960+ tokens of context
  2. FP16 tied embedding export: skip int8 quantization for tok_emb, since its quantization errors compound through both the input and output paths
  3. 10 transformer layers (up from 9): with Muon weight decay, the weights compress well enough to fit the extra layer
  4. Decoupled weight decay for the Muon optimizer (0.02): Muon has no built-in regularization; adding p.mul_(1 - wd * lr) improves both generalization and quantization
  5. Overtone spectral embedding init: SVD power-law spectrum shaping
  6. Phase-transition residual mixing: sigmoid-scheduled resid_mix initialization
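
To make item 1 concrete, here is a minimal sketch of how stride-64 windows over a 1024-token context could be laid out so that each token is scored exactly once. The function name and the rule that the first window scores all of its tokens are assumptions for illustration; the submission's actual evaluation loop may differ.

```python
def sliding_windows(n_tokens, seq_len=1024, stride=64):
    """Yield (start, end, score_from) for each evaluation window.

    Tokens in [score_from, end) are scored in that window; tokens in
    [start, score_from) are fed to the model only as context.
    """
    start = 0
    scored_up_to = 0
    while scored_up_to < n_tokens:
        end = min(start + seq_len, n_tokens)
        yield start, end, scored_up_to
        scored_up_to = end   # everything up to `end` has now been scored
        start += stride      # slide the window forward by one stride


if __name__ == "__main__":
    for start, end, score_from in list(sliding_windows(2000))[:3]:
        print(f"window [{start}, {end}) scores [{score_from}, {end}), "
              f"context before first scored token: {score_from - start}")
```

With these defaults, every window after the first scores its last 64 tokens, so the first scored token in each window sees seq_len - stride = 960 tokens of context. The cost is roughly seq_len / stride forward passes per token compared with non-overlapping evaluation.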
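
| Seed | val_loss | val_bpb | Steps | ms/step |
|------|----------|---------|-------|---------|
| 1337 | 1.9849 | 1.1756 | 10424 | 57.55 |
| 42 | 1.9827 | 1.1742 | 10710 | 56.06 |
| 7 | 1.9830 | 1.1744 | 10498 | 57.18 |
| Mean | 1.9835 | 1.1748 | | |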
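
As a quick check on the per-seed numbers above, the following sketch recomputes the means and the sample standard deviation from the rounded values in the table. Averaging the rounded val_bpb values gives about 1.1747, so the reported 1.1748 was presumably computed from unrounded per-seed results; that is an inference, not something the PR states.

```python
from statistics import mean, stdev

# Per-seed results copied from the table (already rounded to 4 decimals).
val_loss = {1337: 1.9849, 42: 1.9827, 7: 1.9830}
val_bpb = {1337: 1.1756, 42: 1.1742, 7: 1.1744}

loss_mean = mean(val_loss.values())
bpb_mean = mean(val_bpb.values())
bpb_sd = stdev(val_bpb.values())  # sample standard deviation (n - 1)

if __name__ == "__main__":
    print(f"mean val_loss = {loss_mean:.4f}")
    print(f"mean val_bpb  = {bpb_mean:.4f} (sample sd {bpb_sd:.4f})")
```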

Artifact: ~14.7 MB (under 16 MB limit)
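
The decoupled weight decay in item 4 can be sketched without any framework. In PyTorch it is the in-place `p.mul_(1 - wd * lr)` quoted in the summary; the plain-Python version below applies the same multiplicative shrink to a list of floats. The learning rate and step count used here are placeholders, not the submission's settings.

```python
def decoupled_decay(weights, lr, wd=0.02):
    """Shrink every weight by (1 - wd * lr), independent of the gradient step."""
    scale = 1.0 - wd * lr
    return [w * scale for w in weights]


def cumulative_shrink(lr, steps, wd=0.02):
    """Total shrink factor from decay alone after `steps` updates."""
    return (1.0 - wd * lr) ** steps


if __name__ == "__main__":
    lr = 0.05  # placeholder learning rate, not taken from the PR
    print(decoupled_decay([1.0, -2.0], lr))
    print(f"decay-only shrink after 10000 steps: {cumulative_shrink(lr, 10000):.6f}")
```

Because the decay multiplies the weights directly instead of being added to the gradient, its strength does not pass through the optimizer's own update transform.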

notapplica and others added 6 commits March 18, 2026 23:51
Train@1024 with overtone embedding init and phase-transition residual
mixing, eval@2048 with NTK-aware dynamic RoPE scaling. Mean val_bpb
1.2160 across 3 seeds (p=0.0012 for 0.0194-nat improvement over baseline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
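
For readers unfamiliar with the NTK-aware RoPE scaling mentioned in this commit message, here is a minimal sketch of the commonly used formulation, in which the rotary base is enlarged by scale^(d / (d - 2)) when the evaluation length exceeds the training length. The base of 10000 and head dimension of 64 are illustrative assumptions; the commit does not state the values or the exact variant used.

```python
def ntk_scaled_base(base, head_dim, train_len, eval_len):
    """Return the enlarged RoPE base for sequences longer than train_len."""
    scale = max(1.0, eval_len / train_len)  # "dynamic": no change at or below train_len
    return base * scale ** (head_dim / (head_dim - 2))


def rope_inv_freq(base, head_dim):
    """Inverse frequencies for each rotary pair: base^(-2i / head_dim)."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


if __name__ == "__main__":
    base, head_dim = 10000.0, 64  # assumed values for illustration
    new_base = ntk_scaled_base(base, head_dim, train_len=1024, eval_len=2048)
    old = rope_inv_freq(base, head_dim)
    new = rope_inv_freq(new_base, head_dim)
    print(f"scaled base: {new_base:.1f}")
    print(f"lowest-frequency ratio new/old: {new[-1] / old[-1]:.4f}")
```

Under this formulation the highest rotary frequency is unchanged, while the lowest is divided by the length ratio (2 for train@1024, eval@2048), which stretches the longest wavelengths to cover the longer context.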
notapplica changed the title from "NTK Eval + Overtone Init (val_bpb=1.2160)" to "Record: Sliding Window + FP16 Embed + 10L + Muon WD + Overtone Init (val_bpb=1.1748)" on Mar 19, 2026

0hq (Collaborator) left a comment


Looks good to me!

@0hq 0hq merged commit 9fbdf8c into openai:main Mar 19, 2026

FI-Mihej commented

@0hq, looks moltbot-ty to me. Just look at the issues opened by it:

notapplica (Contributor, Author) commented Mar 20, 2026

#138 is me lolol
Not moltbot but somewhat automated (i steer) (:
I have one claude working on the challenge and one claude analyzing everything in public
