Add Deep14x416 KV2 non-record MLX submission (val_bpb=1.8440) #56
cschubiner wants to merge 4 commits into openai:main from
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c7ab65cd40
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
records/track_non_record_16mb/2026-03-19_Deep14x416_KV2_SP1024_MLX_750it/README.md
Outdated
@codex review
💡 Codex Review
Reviewed commit: f9ce10b899
records/track_non_record_16mb/2026-03-19_Deep14x416_KV2_SP1024_MLX_750it/README.md
Outdated
@codex review
💡 Codex Review
Reviewed commit: 91592141cf
records/track_non_record_16mb/2026-03-19_Deep14x416_KV2_SP1024_MLX_750it/README.md
Outdated
Addressed the remaining reproducibility gap in e13e8db. The record now includes
This PR adds a non-record unlimited-compute submission under records/track_non_record_16mb/. The user-facing effect is a new reproducible Apple Silicon MLX result in the repository: a deeper, narrower SP-1024 model with 14 layers at width 416 and 2 KV heads, trained locally on an Apple M5 Max for 750 steps against a 10-shard FineWeb subset. The final post-quantization roundtrip metric recorded in the included log is val_bpb=1.84404368, with an int8+zlib model payload of 12,339,367 bytes and a total submission size of 12,388,989 bytes.

The underlying motivation was to explore a simple parameter-budget trade: reduce width slightly, add depth, and use more aggressive KV sharing while staying well under the 16 MB artifact limit. This submission keeps the trainer straightforward by reusing the repository's train_gpt_mlx.py snapshot exactly and changing the runtime configuration only through environment variables. To make full validation tractable on local Apple Silicon hardware, the run also uses a larger validation batch and logit chunking; these settings affect execution efficiency, not the metric definition itself.

This PR does not fix a bug in the repo; it fills a gap in the records folder: there was no local Apple Silicon submission documenting this deeper, narrower 14x416 KV2 configuration and its measured result. The change is therefore additive only: a new record folder containing the copied training script, the exact train log, a README with command/config details, and submission.json metadata.

Validation for this PR was done by running the training job to completion locally, then checking that the copied script compiles with python -m py_compile. The included train.log contains the full training trace, the pre-quantization validation result, the compressed model size, and the final final_int8_zlib_roundtrip_exact metric.
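The environment-variable override pattern described above can be sketched roughly as follows. The variable names (`N_LAYERS`, `WIDTH`, ...) and the config fields are hypothetical illustrations, since the PR description does not list the actual variables train_gpt_mlx.py reads; only the default values (14 layers, width 416, 2 KV heads, 750 steps) come from the PR.

```python
import os
from dataclasses import dataclass


@dataclass
class RunConfig:
    # Defaults mirror the submission in the PR text; field and
    # env-var names below are hypothetical illustrations.
    n_layers: int = 14
    width: int = 416
    n_kv_heads: int = 2
    steps: int = 750


def config_from_env(defaults: RunConfig = RunConfig()) -> RunConfig:
    """Override defaults from environment variables, leaving the
    training script itself unchanged."""
    def geti(name: str, fallback: int) -> int:
        return int(os.environ.get(name, fallback))

    return RunConfig(
        n_layers=geti("N_LAYERS", defaults.n_layers),
        width=geti("WIDTH", defaults.width),
        n_kv_heads=geti("N_KV_HEADS", defaults.n_kv_heads),
        steps=geti("STEPS", defaults.steps),
    )


cfg = config_from_env()
print(cfg.n_layers, cfg.width, cfg.n_kv_heads, cfg.steps)  # → 14 416 2 750
```

Keeping every run-specific knob in the environment means the copied training script stays byte-identical to the repository snapshot, which is what makes the "reuses train_gpt_mlx.py exactly" claim checkable.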
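For reference, bits-per-byte is commonly derived from the mean next-token cross-entropy; whether train_gpt_mlx.py computes val_bpb exactly this way is an assumption, and the token and byte counts below are purely illustrative (roughly 4.3 bytes per token, with a loss chosen so the result lands near the reported 1.844).

```python
import math


def bits_per_byte(mean_loss_nats: float, n_tokens: int, n_bytes: int) -> float:
    """Convert mean cross-entropy (nats per token) over n_tokens of
    validation text into bits per byte of the raw text."""
    total_bits = mean_loss_nats * n_tokens / math.log(2)  # nats → bits
    return total_bits / n_bytes


# Illustrative numbers only; not taken from the submission's log.
print(round(bits_per_byte(5.5, 1_000_000, 4_300_000), 3))  # ≈ 1.845
```

Because the denominator is raw bytes rather than tokens, the metric is comparable across tokenizers, which is why records report bpb instead of per-token loss.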
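The 16 MB artifact check can be sketched as per-tensor int8 quantization followed by a single zlib pass over the concatenated payload. The symmetric per-tensor scaling used here is an assumption about the scheme, not necessarily what the submission pipeline does, and the random tensors stand in for real model weights.

```python
import zlib

import numpy as np

LIMIT_BYTES = 16 * 1024 * 1024  # 16 MB artifact limit


def int8_zlib_size(tensors) -> int:
    """Quantize each tensor to int8 with a symmetric per-tensor scale,
    concatenate the raw bytes, and return the zlib-compressed size."""
    payload = bytearray()
    for w in tensors:
        scale = max(float(np.abs(w).max()), 1e-8) / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        payload += q.tobytes()
    return len(zlib.compress(bytes(payload), level=9))


# Stand-in weights: 14 square blocks at width 416, echoing the model shape.
rng = np.random.default_rng(0)
tensors = [rng.normal(size=(416, 416)).astype(np.float32) for _ in range(14)]
size = int8_zlib_size(tensors)
print(size, size < LIMIT_BYTES)
```

Measuring the compressed payload (rather than the in-memory parameter count) is what the 16 MB limit actually constrains, which is why the record reports the int8+zlib size alongside val_bpb.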