megaknight114/atari-world-model-inputs


World-Model Inputs for Atari Policies

This repository is a curated code and results snapshot for the CV project "World-Model Inputs for Atari Policies."

The project tested whether an Atari policy performs better when, in addition to raw game observations, it receives detached summary predictions from a world model trained alongside the policy. The main experiment covers 26 Atari100K games with 2 seeds each, for 52 tasks in total.
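As a rough illustration of what "detached" means here, the sketch below concatenates observation features with a world-model summary whose gradient path is cut, so the policy loss cannot update the world model. It assumes a PyTorch setup; the class name SummaryAugmentedPolicy, the layer sizes, and the linear stand-in for the world model are all hypothetical and are not taken from the EAWM code.

```python
import torch
import torch.nn as nn


class SummaryAugmentedPolicy(nn.Module):
    """Hypothetical policy head: observation features plus a detached summary."""

    def __init__(self, obs_dim: int, summary_dim: int, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + summary_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, obs_feat: torch.Tensor, wm_summary: torch.Tensor) -> torch.Tensor:
        # detach() cuts the gradient path, so the policy loss cannot
        # reach the world-model parameters through the summary.
        policy_input = torch.cat([obs_feat, wm_summary.detach()], dim=-1)
        return self.net(policy_input)


torch.manual_seed(0)
world_model = nn.Linear(16, 8)  # stand-in for a real world model (assumption)
policy = SummaryAugmentedPolicy(obs_dim=16, summary_dim=8, num_actions=6)

obs_feat = torch.randn(4, 16)        # batch of 4 hypothetical feature vectors
wm_summary = world_model(obs_feat)   # summary prediction from the stand-in model
logits = policy(obs_feat, wm_summary)

# Backpropagate a dummy policy loss and inspect where gradients landed.
logits.sum().backward()
print(world_model.weight.grad is None)
```

After the backward pass the policy parameters hold gradients while the stand-in world model holds none, which is the property the detached input is meant to guarantee. In a joint training setup the world model would still receive gradients from its own loss.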

Current Result Snapshot

Latest synchronized local report: 2026-05-06 14:05:46 +08:00.

  • Main clean100k run: 52 / 52 tasks completed.
  • Mean HNS over 26 games: 1.875, compared with 1.818 for the EAWM Table 1 reference.
  • Median HNS over 26 games: 0.959, compared with 0.773 for the EAWM Table 1 reference.
  • Mean per-game raw-score change versus the reference: -0.01%.
  • Interpretation: the average result is close to the baseline, but the effect is highly game-dependent. Some games improve substantially, while others degrade substantially, so this snapshot should not be read as a stable positive result.
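For readers unfamiliar with the metrics above, here is a minimal sketch of how mean and median human-normalized score (HNS) are conventionally computed, using HNS = (agent - random) / (human - random). The per-game percent change shown is one plausible definition and is an assumption, since the exact formula is not stated here. All game names and numbers are placeholders, not results from this project.

```python
from statistics import mean, median


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Standard HNS: 0 matches a random policy, 1 matches the human baseline."""
    return (agent - random) / (human - random)


def percent_change(score: float, reference: float) -> float:
    """Assumed definition of raw-score change versus a reference, in percent."""
    return 100.0 * (score - reference) / abs(reference)


# Placeholder values for illustration only (not real Atari scores).
games = {
    "game_a": {"agent": 30.0, "reference": 25.0, "random": 10.0, "human": 50.0},
    "game_b": {"agent": 400.0, "reference": 500.0, "random": 0.0, "human": 200.0},
    "game_c": {"agent": -10.0, "reference": -10.0, "random": -20.0, "human": 10.0},
}

hns = {
    name: human_normalized_score(g["agent"], g["random"], g["human"])
    for name, g in games.items()
}
changes = {
    name: percent_change(g["agent"], g["reference"]) for name, g in games.items()
}

mean_hns = mean(hns.values())
median_hns = median(hns.values())
mean_change = mean(changes.values())

print(f"mean HNS={mean_hns:.3f} median HNS={median_hns:.3f} "
      f"mean change={mean_change:+.2f}%")
```

With these placeholders one game gains 20% and another loses 20%, so the mean change is zero even though individual games move substantially. The mean HNS also sits well above the median because a single high-scoring game dominates the average, which is why both statistics are worth reporting.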

Primary report:

Repository Layout

  • third_party/EAWM/: curated copy of the EAWM code paths used for the experiments.
  • runs/: local and Spartan launch/config utilities.
  • manifests/: experiment manifests for clean100k, pooling, gated, and concat-gated variants.
  • spartan_tools/: scripts used to scan Spartan outputs and build the Focus82 report.
  • results/focus82/: synchronized report files, task table, and per-game comparison table.
  • docs/CV_PROJECT_SUMMARY.md: short project summary aligned with the CV description.
  • docs/SPARTAN_SYNC_STATUS.md: provenance and sync status for the included result snapshot.

Provenance

The implementation builds on the EAWM codebase and local Spartan patches used for the Atari100K world-model-summary experiments. The included results are lightweight reports derived from Spartan output scans, not full checkpoints or raw training logs.

Large artifacts such as checkpoints, W&B media, raw output directories, and visualization GIFs are intentionally excluded from this curated public snapshot.

About

Curated code and result summary for world-model inputs in Atari policy experiments.
