ZeroerWiser/boundless-world-model

🌍 Boundless-World-Model

BWM (Boundless-World-Model) is a physically consistent, action-conditioned video world model built on Wan2.2-TI2V-5B. It serves as a low-cost, high-fidelity simulator for robotic manipulation.

🗞️ News

  • [2026-05] 🚀 Inference code released! Generate action-conditioned robot manipulation videos with BWM. See 🛠️ Usage.
  • [2026-05] 🎉 Model definition released! The BWM architecture and core model components are now available.

✅ TODO

  • Release inference code
  • Release model definition
  • Release model weights
  • Release training code
  • Release technical report

🏗️ Framework

Coming soon!


🎬 Qualitative Results

CVPR 2026 WorldArena Challenge

The following simulation scenes are generated autoregressively by BWM from initial frames and action sequences in the WorldArena test set, achieving high-fidelity visual realism while maintaining long-horizon physical consistency.
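
The autoregressive rollout described above can be illustrated with a toy sketch. This is not BWM's inference code: `rollout`, `predict_chunk`, `toy_predict_chunk`, and `chunk_size` are hypothetical names, and the "model" is a scalar stand-in. It only shows the control flow assumed here, where each chunk of future frames is conditioned on the last frame generated so far plus the next slice of actions.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Action = TypeVar("Action")


def rollout(
    first_frame: Frame,
    actions: Sequence[Action],
    predict_chunk: Callable[[Frame, Sequence[Action]], List[Frame]],
    chunk_size: int,
) -> List[Frame]:
    """Generate one frame per action, chunk by chunk.

    Each chunk is conditioned on the last frame generated so far, which is
    why prediction errors can accumulate over long horizons.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    frames: List[Frame] = [first_frame]
    for start in range(0, len(actions), chunk_size):
        chunk_actions = actions[start:start + chunk_size]
        new_frames = predict_chunk(frames[-1], chunk_actions)
        if len(new_frames) != len(chunk_actions):
            raise ValueError("predict_chunk must return one frame per action")
        frames.extend(new_frames)
    return frames


def toy_predict_chunk(last_frame: float, chunk_actions: Sequence[float]) -> List[float]:
    """Stand-in dynamics (hypothetical): the 'frame' is a scalar and each action is added to it."""
    out, state = [], last_frame
    for action in chunk_actions:
        state = state + action
        out.append(state)
    return out


if __name__ == "__main__":
    # Five actions rolled out in chunks of two, starting from state 0.0.
    print(rollout(0.0, [1.0, 2.0, 3.0, 4.0, 5.0], toy_predict_chunk, chunk_size=2))
    # -> [0.0, 1.0, 3.0, 6.0, 10.0, 15.0]
```

In a real setting `predict_chunk` would wrap the video model, and frames and actions would be tensors rather than scalars.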

🧩 Scene 1: Compositional Spatial Rearrangement

[Clips: blocks ranking size · stack bowls three]
  • Task: arrange blocks by size, stack bowls
  • Challenge: Multi-object spatial ordering, stacking stability, and contact-rich placement
  • Ours:
    • ✅ Preserves object identity and target layout
    • ✅ Maintains stable stacking contacts
    • ✅ Predicts adaptive gripper control

🚪 Scene 2: Articulated Hinge Interaction

[Clips: open microwave · open laptop]
  • Task: open microwave, open laptop
  • Challenge: Articulated hinge motion, constrained rotation, and persistent object state
  • Ours:
    • ✅ Captures hinge-constrained opening dynamics
    • ✅ Maintains coherent object geometry during rotation
    • ✅ Preserves opened states over long-horizon rollouts

🕹️ Scene 3: Fine-Grained Affordance Interaction

[Clips: turn switch · hanging mug · click bell · stamp seal]
  • Task: turn switch, hang mug, click bell, stamp seal
  • Challenge: Small contact regions, constrained placement, and precise state-changing interactions
  • Ours:
    • ✅ Captures fine-grained affordance dynamics
    • ✅ Aligns contact with object affordances
    • ✅ Preserves state-changing interactions

🀝 Scene 4: Bimanual Coordination and Handover

[Clips: handover block · handover mic]
  • Task: hand over block, hand over mic
  • Challenge: Dual-arm synchronization, inter-arm occlusion, and coordinated grasp timing
  • Ours:
    • ✅ Models synchronized dual-arm motion
    • ✅ Preserves object continuity
    • ✅ Avoids close-contact collisions

📦 Scene 5: Long-Horizon Constrained Placement

[Clips: put object cabinet · put bottles dustbin]
  • Task: put object in cabinet, put bottles in dustbin
  • Challenge: Long-horizon transport, partial occlusion, and constrained final placement
  • Ours:
    • ✅ Maintains long-horizon scene coherence
    • ✅ Handles occlusion without object drift
    • ✅ Produces stable constrained placement

Out-of-Distribution Generalization

To test generalization beyond the benchmark's initial states, we pair initial scenes created with GPT-Image-2 with the original robot action sequences, then let BWM autoregressively roll out future frames under the resulting shifts in object appearance.

[Clips: ood episode100 · ood episode100 variant 1 · ood episode100 variant 3]
[Clips: ood episode33 · ood episode33 variant 1 · ood episode33 variant 5]
  • Task: shake bottle, put object in cabinet
  • Challenge: Novel initial scenes and object appearance shifts
  • Ours:
    • ✅ Generalizes to GPT-Image-2-created initial scenes
    • ✅ Preserves action-conditioned dynamics
    • ✅ Maintains coherent robot-object interaction

🛠️ Usage

Quick Start: Video Generation Inference

Environment Setup

# Create conda environment
conda create -n BWM python=3.10.20
conda activate BWM

# Install PyTorch with CUDA support
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128

# Install DiffSynth-Studio
pip install diffsynth==2.0.11

# Install dependencies
pip install -r requirements.txt
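
After installation, a quick check that the pinned versions actually landed in the active environment can save a failed inference run. The following is a minimal sketch using only the standard library; `PINS` and `check_pins` are hypothetical names, and the pins are copied from the install commands above. It assumes that wheels from the PyTorch index report a local suffix such as `+cu128`, which is stripped before comparison.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands above.
PINS = {
    "torch": "2.8.0",
    "torchvision": "0.23.0",
    "torchaudio": "2.8.0",
    "diffsynth": "2.0.11",
}


def check_pins(pins):
    """Return {package: (expected, installed_or_None, ok)} for each pinned package."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Drop a local version suffix such as "+cu128" before comparing.
        ok = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, ok)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins(PINS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:12s} expected {expected:8s} installed {installed} [{status}]")
```

Run it inside the activated `BWM` environment; any `MISMATCH` line points at a package to reinstall.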

Model Weights

Download the Wan2.2-TI2V-5B base model from ModelScope:

modelscope download --model Wan-AI/Wan2.2-TI2V-5B --local_dir models/Wan2.2-TI2V-5B

Download the BWM checkpoint from Hugging Face:

hf download BLM-Lab/Boundless-World-Model step-12000.safetensors --local-dir ckpt/BLM
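
Once both downloads finish, it may help to confirm that the files sit where the two commands above place them before editing any paths. The sketch below is an assumption-laden helper, not part of the repository: `missing_weights` is a hypothetical name, and it checks only for presence and non-zero size, not integrity.

```python
from pathlib import Path


def missing_weights(root="."):
    """Return the expected weight locations that are absent or empty under `root`."""
    root = Path(root)
    expected = [
        root / "models" / "Wan2.2-TI2V-5B",                # --local_dir of the ModelScope download
        root / "ckpt" / "BLM" / "step-12000.safetensors",  # file fetched by `hf download`
    ]
    missing = []
    for path in expected:
        if path.is_dir():
            # A directory counts as present only if it contains something.
            present = any(path.iterdir())
        else:
            present = path.is_file() and path.stat().st_size > 0
        if not present:
            missing.append(path)
    return missing


if __name__ == "__main__":
    gaps = missing_weights()
    if gaps:
        for path in gaps:
            print(f"missing: {path}")
    else:
        print("all expected weight paths found")
```

Run it from the repository root; an empty result means both locations exist and are non-empty.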

Run Inference

The demo metadata, videos, actions, and normalization statistics are already included under demo/.

Set local paths before running inference:

cp scripts/local.example.sh scripts/local.sh

Update MODEL_PATHS and CKPT_PATH in scripts/local.sh, then run:

bash scripts/infer_example.sh

🏋️ Training

Coming soon!


🙏 Acknowledgements

This project builds upon the following open-source projects and benchmarks. We thank these teams for their contributions:

We also acknowledge the following engineering contributions:

  • Wentao Tan: basic architecture design · Email · GitHub
  • Zengrong Lin: core code implementation · Email · GitHub
  • Yang Sun: code refactoring and software maintainability · Email · GitHub

📜 Citing

If you find BWM useful in your research or applications, please consider giving us a star 🌟.

