Dreamer-pixels lab: notebook execution + 3 asset figures #20
Merged
Trimming the world model's training context so the lab finishes on CPU well under the 8-minute ceiling without losing the latent imagination story. https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7
The Dreamer-pixels reproduction lab finished its full pipeline:
- 12-cell notebook runs ~5 min 10 s on CPU.
- World-model pretraining ~70 s; full Dreamer 8-cycle loop ~232 s.
- assets/reconstruction_grid.png shows the WM recovers cart/pole visual structure from the first cycle onward (slight ghosting only on frame 0 where h_0 = 0).
- assets/latent_vs_real_rollout.png shows the imagination tracks the real env for ~5 steps then drifts; the pixel-MSE subplot exhibits the ~2x jump that puts a hard cap on imagination depth.
- assets/return_vs_steps.png hovers near the random baseline (~25 vs ~20).

README documents this honestly: at the chosen CPU budget the imagination horizon is too short to credit-assign a balance policy, but the architecture is faithful to DreamerV1 (encoder + RSSM with det h + stochastic Gaussian z + decoder + reward + continue heads, KL with balancing alpha=0.8, lambda-returns in latent imagination).

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7
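One way to read the "~2x jump" off the pixel-MSE curve is sketched below. This is a hypothetical helper, not code from the lab: the name `trusted_horizon`, the `factor` parameter, and the choice of step 0 as the baseline are all assumptions, and the numbers in the usage lines are made up for illustration.

```python
def trusted_horizon(mse_per_step, factor=2.0):
    """Count the leading imagined steps whose pixel MSE stays below
    factor * (MSE at step 0).

    mse_per_step: per-step pixel MSE between imagined and real frames.
    factor: assumed jump threshold (2.0 mirrors the ~2x jump noted above).
    """
    if not mse_per_step:
        return 0
    threshold = factor * mse_per_step[0]
    # Step 0 is the baseline itself, so start checking at step 1.
    for step in range(1, len(mse_per_step)):
        if mse_per_step[step] >= threshold:
            return step
    # No jump found: the whole rollout stays within the threshold.
    return len(mse_per_step)


# Illustrative (made-up) curve: error stays flat, then doubles at step 5.
example_curve = [0.010, 0.011, 0.012, 0.012, 0.013, 0.026, 0.030]
horizon = trusted_horizon(example_curve)
```

Using step 0 as the baseline is the simplest choice; averaging the first few steps would be less sensitive to a noisy first frame.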
Dreamer-pixels reproduction lab final run.
End-to-end notebook ~5 min 10 s on CPU. World-model pretraining ~70 s; full 8-cycle Dreamer loop ~232 s. Three assets generated:
- reconstruction_grid.png: WM recovers cart/pole visual structure from cycle 1 (slight ghosting on frame 0 where h_0 = 0).
- latent_vs_real_rollout.png: imagination tracks the real env for ~5 steps then drifts; the pixel-MSE subplot exhibits the ~2× jump that bounds the trustworthy imagination horizon.
- return_vs_steps.png: return hovers near random baseline (~25 vs ~20).

README documents this honestly: at the chosen CPU budget the imagination horizon is too short to credit-assign a balance policy, but the architecture is faithful to DreamerV1 (encoder + RSSM with deterministic h + stochastic Gaussian z + decoder + reward + continue heads, KL with balancing α = 0.8, λ-returns in latent imagination).

https://claude.ai/code/session_017Ez7KNKDCGRRLjEnJi9TW7
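For readers unfamiliar with the λ-returns mentioned in the architecture summary, a minimal pure-Python sketch of the standard recursion follows. It is not taken from the lab's notebook: the function name, argument layout, and the default `gamma` and `lam` values are assumptions. It computes R_t = r_t + γ c_t [(1 − λ) v_{t+1} + λ R_{t+1}], bootstrapping from the last value estimate.

```python
def lambda_returns(rewards, next_values, continues, gamma=0.99, lam=0.95):
    """Compute lambda-returns over one imagined trajectory.

    rewards[t]:     predicted reward at imagined step t
    next_values[t]: critic value of the state reached after step t
    continues[t]:   predicted continuation probability after step t
    gamma, lam:     discount and lambda (assumed defaults, not the lab's)
    """
    horizon = len(rewards)
    returns = [0.0] * horizon
    # Bootstrap: beyond the horizon, fall back to the final value estimate.
    next_return = next_values[-1]
    for t in reversed(range(horizon)):
        # Blend the one-step value estimate with the longer lambda-return.
        blended = (1.0 - lam) * next_values[t] + lam * next_return
        next_return = rewards[t] + gamma * continues[t] * blended
        returns[t] = next_return
    return returns


# lam=0 reduces to one-step TD targets; lam=1 to a discounted sum plus bootstrap.
td_targets = lambda_returns([1.0, 2.0, 3.0], [10.0, 20.0, 30.0],
                            [1.0, 1.0, 0.0], gamma=0.5, lam=0.0)
mc_returns = lambda_returns([1.0, 1.0, 1.0], [0.0, 0.0, 0.0],
                            [1.0, 1.0, 1.0], gamma=0.5, lam=1.0)
```

With a short imagination horizon, most of each target comes from the bootstrapped critic value rather than imagined rewards, which is consistent with the credit-assignment limit the README describes.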
Generated by Claude Code