Reinforcement-learning locomotion stack for the Deep Robotics Lite3 quadruped. End-to-end: IsaacLab PPO training (flat, rough, or bumpy heightfield envs), sim-to-sim validation in MuJoCo / PyBullet / Newton (with bumpy heightfield ground or optional granular sand terrain), and sim-to-real deployment over WiFi.
.
├── src/ # C++ controller source
│ ├── main.cpp
│ ├── state_machine/ # FSM: Idle -> StandUp -> RL -> JointDamping
│ ├── interface/ # Robot (sim/hardware) and input (keyboard/gamepad)
│ └── *policy_runner* # ONNX inference wrapper
├── simulation/ # Python sim-to-sim
│ ├── mujoco_simulation.py
│ ├── pybullet_simulation.py
│ └── newton_simulation.py
├── training/ # IsaacLab training package
│ ├── lite3_training/ # Gym envs, MDP terms, PPO config
│ └── scripts/ # train.py, play.py, list_envs.py, view_robot.py
├── models/
│ ├── pretrained/ # PPO policy (ONNX) + pt-to-onnx converter
│ └── description/ # MJCF, USD, meshes
├── tests/ # pytest suite (policy, MJCF, Newton smoke)
├── deploy_scripts/ # SCP/SFTP scripts for robot deployment
├── third_party/ # Eigen, ONNX Runtime, MuJoCo, Gamepad SDK, Motion SDK
├── artefacts.yaml # Test orchestration spec for the artefacts CLI
├── docs/ # Per-topic guides
└── CMakeLists.txt
The fastest path is to spin up Newton sim-to-sim against the bundled pretrained policy:
# Bring in deps (see docs/installation.md for the full uv-based path)
pip install pybullet "numpy<2.0" "mujoco==3.6.0" colorama onnxruntime
pip install "newton==1.0.0" "mujoco-warp==3.6.0" "warp-lang==1.12.1" "pyglet>=2.0"
# Run the pretrained policy in Newton (flat ground, viewer)
python simulation/newton_simulation.py
# Press 'c' in the terminal to engage the policy, then w/s/a/d/q/e to drive
# Or stress-test it on the bumpy heightfield ground (small-scale features +
# randomized per-patch friction; see docs/bumpy-terrain.md)
python simulation/newton_simulation.py --scene bumpy --bumpy-seed 42For the full sim-to-real workflow (build the C++ controller and connect to MuJoCo or the robot):
sudo apt install libdw-dev
mkdir build && cd build
cmake .. -DBUILD_PLATFORM=x86 -DBUILD_SIM=ON -DSEND_REMOTE=OFF
make -j
# Terminal 1: simulator
python simulation/mujoco_simulation.py
# Terminal 2: controller
./rl_deploy| What you want | Read |
|---|---|
| Get the toolchain installed end-to-end | docs/installation.md |
| Train a new policy in IsaacLab | docs/training.md |
| Run sim-to-sim with the Newton engine | docs/simulation-newton.md |
| Train or simulate on bumpy natural-ground terrain (recommended) | docs/bumpy-terrain.md |
| Walk over a sand bed (implicit MPM, optional) | docs/granular-terrain.md |
| Run sim-to-sim with MuJoCo or PyBullet | docs/simulation-mujoco.md |
| Push a policy to a real Lite3 over WiFi | docs/deployment.md |
| Run the test suite (pytest or artefacts CLI) | docs/tests.md |
| Diagnose an error | docs/troubleshooting.md |
The pretrained policy ships at models/pretrained/policy.onnx (PPO, 744 KB).
- Observation (45-dim): angular velocity (3), projected gravity (3), velocity command (3), joint positions (12), joint velocities (12), last action (12)
- Action (12-dim): target joint positions for all 4 legs (HipX, HipY, Knee per leg)
- Proprioceptive only, no vision or contact sensors
- Trained on randomized terrain so it handles stairs, slopes, and rough ground through reactive gait adaptation
Convert a freshly-trained .pt to ONNX with models/pretrained/pt2onnx.py. See docs/training.md.
Built on top of Deep Robotics open-source resources (BSD-3 license). Training pipeline derived from DeepRoboticsLab/rl_training. Granular-terrain integration follows the pattern in newton/examples/mpm/example_mpm_anymal.py.