
RQZ-Golf v1: Depth recurrence for parameter efficiency#54

Open
TheCause wants to merge 1 commit into openai:main from TheCause:rqz-golf-v1

Conversation

@TheCause

Non-record experimental submission

Approach: Replace some unique layers with a single shared recurrent layer applied K times, saving parameters while increasing effective depth.

Architecture

  • 7 unique layers (encoder/decoder with U-Net skip connections)
  • 1 recurrent layer applied K=3 times with learned iteration embeddings
  • Effective depth: 10 layers (7 unique + 3 recurrent passes) vs 9 in the baseline
  • Residual scaling by 1/sqrt(K) for stability
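The forward pass described above can be sketched as follows. This is a minimal numpy illustration, not the actual implementation: the hidden size `D`, the `tanh` nonlinearity, and the single weight matrix `W_shared` standing in for the recurrent layer's parameters are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 16, 3  # hypothetical hidden size and number of recurrent passes

# One shared weight matrix stands in for the recurrent layer's parameters.
W_shared = rng.normal(0, 0.02, (D, D))
# Learned per-pass iteration embeddings psi_k, one vector per pass.
psi = rng.normal(0, 0.02, (K, D))

def recurrent_block(x, k_passes=K):
    """Apply the shared layer k_passes times with 1/sqrt(K) residual scaling."""
    scale = 1.0 / np.sqrt(k_passes)
    for k in range(k_passes):
        # Pass-awareness: add the iteration embedding before the shared layer.
        h = np.tanh((x + psi[k % K]) @ W_shared)
        x = x + scale * h  # scaled residual update for stability
    return x

x = rng.normal(size=(4, D))  # a batch of 4 hidden states
y = recurrent_block(x)
```

Because the same `W_shared` is reused on every pass, the three recurrent passes add depth without adding parameters beyond the single shared layer and the K iteration embeddings.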

Key ideas

  1. Depth recurrence: shared weights across K passes save ~20% of parameters
  2. Iteration embeddings: per-pass learned vector (psi_k) for pass-awareness
  3. Test-time compute: increase K at inference (K'>K) for better BPB without changing model size
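Idea 3 follows from the weight sharing: unrolling the shared layer K' > K times at inference reuses the same weights, so the parameter count is independent of K'. A minimal sketch under the same assumptions as above (hypothetical hidden size, `tanh` stand-in layer); embeddings trained for K=3 are cycled for the extra passes:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16
W_shared = rng.normal(0, 0.02, (D, D))  # shared recurrent-layer weights
psi = rng.normal(0, 0.02, (3, D))       # iteration embeddings trained for K=3

def run_recurrent(x, k_prime):
    """Unroll the shared layer k_prime times; weights are reused, so the
    parameter count does not depend on k_prime."""
    scale = 1.0 / np.sqrt(k_prime)
    for k in range(k_prime):
        # Cycle the K trained iteration embeddings when k_prime > K.
        x = x + scale * np.tanh((x + psi[k % len(psi)]) @ W_shared)
    return x

x = rng.normal(size=(2, D))
y_train = run_recurrent(x, 3)  # training-time depth, K = 3
y_test = run_recurrent(x, 5)   # test-time compute: K' = 5, same weights

# Parameter count is fixed regardless of how many passes are run.
n_params = W_shared.size + psi.size
```

Whether cycled embeddings generalize to K' > K is an empirical question the full evaluation would need to answer.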

Status

  • Preliminary baseline: 1.5283 BPB (1 shard, 1xA100)
  • RQZ-Golf architecture implemented, not yet benchmarked on full dataset
  • Requesting compute credits for full evaluation

Theoretical basis

Inspired by Universal Transformers (Dehghani et al., 2019) and Deep Equilibrium Models (Bai et al., 2019).

