To clone this repository along with all submodules, run:

```shell
git clone --recursive https://github.com/rguntz/ibrl-docker.git
```
## Download data and BC models

Download the dataset and models from Google Drive and put the folders under the `release` folder. The `release` folder should then contain `release/cfgs` (already shipped with the repo), plus `release/data` and `release/model` (the latter two come from the downloaded zip file).
Unzip the file:

```shell
unzip data_and_model.zip
```

## Build and run the Docker image
Build the image:

```shell
docker build -t ibrl-gpu .
```

Run the container:

```shell
docker run --gpus all -it --rm \
    -v $(pwd):/app \
    ibrl-gpu
```

To run with GUI support, first allow local Docker containers to access the X server, then start the container with the display forwarded:

```shell
xhost +local:docker
docker run --gpus all -it --rm \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -v $(pwd):/app \
    ibrl-gpu
```

## Compile the C++ files
```shell
cd common_utils
mkdir build
cd build
cmake ..
make -j
```

## Set up Weights & Biases

Visit https://docs.wandb.ai/models/quickstart to create an account and generate an API key so you can monitor the training pipeline online. Then run the following command in the terminal:
```shell
export WANDB_API_KEY="<your_api_key>"
```

Replace `<your_api_key>` with the key from your own W&B account; avoid committing a real key to the repository.
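As a quick sanity check that the key is visible to Python before launching training, you can run a snippet like the following (the helper is our own illustration, not part of the repo):

```python
import os

def wandb_key_is_set():
    """Return True if WANDB_API_KEY is present and non-empty in the environment."""
    return bool(os.environ.get("WANDB_API_KEY"))

if __name__ == "__main__":
    print("WANDB_API_KEY set:", wandb_key_is_set())
```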
## Train RL policy using the BC policy provided in the release folder

```shell
# can
python train_rl.py --config_path release/cfgs/robomimic_rl/can_ibrl.yaml
# square
python train_rl.py --config_path release/cfgs/robomimic_rl/square_ibrl.yaml
```

Use `--save_dir PATH` to specify where to store the logs and models. Use `--use_wb 0` to disable logging to Weights & Biases.
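If you prefer launching runs from Python (e.g. for sweeps), the flags can be assembled like this; the helper `make_train_cmd` and the example paths are our own illustration, not part of the repo:

```python
def make_train_cmd(config_path, save_dir=None, use_wb=True):
    """Assemble a train_rl.py argument list using the optional flags above."""
    cmd = ["python", "train_rl.py", "--config_path", config_path]
    if save_dir is not None:
        cmd += ["--save_dir", save_dir]
    if not use_wb:
        cmd += ["--use_wb", "0"]
    return cmd

if __name__ == "__main__":
    # Example: train can IBRL with a custom log dir and W&B disabled.
    print(" ".join(make_train_cmd(
        "release/cfgs/robomimic_rl/can_ibrl.yaml",
        save_dir="exps/can_ibrl",
        use_wb=False,
    )))
```

Pass the resulting list to `subprocess.run` to actually launch a run.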
## Train a BC policy from scratch

Use the following commands to train a BC policy from scratch. We find that IBRL is not sensitive to the exact performance of the BC policy.

```shell
# can
python train_bc.py --config_path release/cfgs/robomimic_bc/can.yaml
# square
python train_bc.py --config_path release/cfgs/robomimic_bc/square.yaml
```

## RLPD

```shell
# can
python train_rl.py --config_path release/cfgs/robomimic_rl/can_rlpd.yaml
# square
python train_rl.py --config_path release/cfgs/robomimic_rl/square_rlpd.yaml
```

## RFT

These commands run RFT from the pretrained models in the `release` folder:
```shell
# can rft
python train_rl.py --config_path release/cfgs/robomimic_rl/can_rft.yaml
# square rft
python train_rl.py --config_path release/cfgs/robomimic_rl/square_rft.yaml
```

To only perform pretraining:

```shell
# can, pretraining for 5 x 10,000 steps
python train_rl.py --config_path release/cfgs/robomimic_rl/can_rft.yaml --pretrain_only 1 --pretrain_num_epoch 5 --load_pretrained_agent None
# square, pretraining for 10 x 10,000 steps
python train_rl.py --config_path release/cfgs/robomimic_rl/square_rft.yaml --pretrain_only 1 --pretrain_num_epoch 10 --load_pretrained_agent None
```

## State-based experiments

Train IBRL using the provided state BC policies:
```shell
# can state
python train_rl.py --config_path release/cfgs/robomimic_rl/can_state_ibrl.yaml
# square state
python train_rl.py --config_path release/cfgs/robomimic_rl/square_state_ibrl.yaml
```

To train a state BC policy from scratch:

```shell
# can
python train_bc.py --config_path release/cfgs/robomimic_bc/can_state.yaml
# square
python train_bc.py --config_path release/cfgs/robomimic_bc/square_state.yaml
```

State RLPD:

```shell
# can state
python train_rl.py --config_path release/cfgs/robomimic_rl/can_state_rlpd.yaml
# square state
python train_rl.py --config_path release/cfgs/robomimic_rl/square_state_rlpd.yaml
```

Since state policies are fast to train, pretraining and RL fine-tuning can be run in one step:

```shell
# can
python train_rl.py --config_path release/cfgs/robomimic_rl/can_state_rft.yaml
# square
python train_rl.py --config_path release/cfgs/robomimic_rl/square_state_rft.yaml
```

## Meta-World

Train the RL policy using the BC policy provided in the `release` folder:
```shell
# assembly
python mw_main/train_rl_mw.py --config_path release/cfgs/metaworld/ibrl_basic.yaml --bc_policy assembly
# boxclose
python mw_main/train_rl_mw.py --config_path release/cfgs/metaworld/ibrl_basic.yaml --bc_policy boxclose
# coffeepush
python mw_main/train_rl_mw.py --config_path release/cfgs/metaworld/ibrl_basic.yaml --bc_policy coffeepush
# stickpull
python mw_main/train_rl_mw.py --config_path release/cfgs/metaworld/ibrl_basic.yaml --bc_policy stickpull
```
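To queue all four Meta-World tasks one after another, a small launcher sketch can help; the loop below is our own convenience wrapper around the commands above, not part of the repo:

```python
# Task names taken from the four commands above.
TASKS = ["assembly", "boxclose", "coffeepush", "stickpull"]

def mw_ibrl_cmd(task):
    """Build the Meta-World IBRL training command for one task."""
    return [
        "python", "mw_main/train_rl_mw.py",
        "--config_path", "release/cfgs/metaworld/ibrl_basic.yaml",
        "--bc_policy", task,
    ]

if __name__ == "__main__":
    for task in TASKS:
        print(" ".join(mw_ibrl_cmd(task)))
```

Replace the `print` with `subprocess.run(mw_ibrl_cmd(task), check=True)` to actually launch the runs sequentially.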
To train a BC policy from scratch:

```shell
python mw_main/train_bc_mw.py --dataset.path Assembly --save_dir SAVE_DIR
```

Note that we still pass `--bc_policy` to specify the task name, but the policy itself is not used in the baselines. This is specific to `train_rl_mw.py`.

```shell
python mw_main/train_rl_mw.py --config_path release/cfgs/metaworld/rlpd.yaml --bc_policy assembly --use_wb 0
```

For simplicity, this single command performs both pretraining and RL training:
```shell
python mw_main/train_rl_mw.py --config_path release/cfgs/metaworld/rft.yaml --bc_policy assembly --use_wb 0
```

## Citation

```bibtex
@misc{hu2023imitation,
    title={Imitation Bootstrapped Reinforcement Learning},
    author={Hengyuan Hu and Suvir Mirchandani and Dorsa Sadigh},
    year={2023},
    eprint={2311.02198},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```