ParaGon: Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement
Project Page: 1989ryan.github.io/projects/paragon.html
This repository contains the PyTorch implementation of the paper: Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement.
We highly recommend using Docker to run the code.
Install nvidia-docker
Build docker container
python3 scripts/docker_build.py
Run docker container
python3 scripts/docker_run.py
You will need 269G of free space to download all the data.
python3 scripts/get_dataset.py
If you do not have enough space, you can modify scripts/get_dataset.py to download only the testing data (44G).
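Before starting the download, it may help to check how much disk space is actually free. The snippet below is a minimal sketch and not part of this repository: the helper names `free_space_gb` and `pick_download` are hypothetical, and it assumes the "G" in the sizes above means gigabytes (10^9 bytes). It only compares free space against the two sizes quoted above and does not call scripts/get_dataset.py.

```python
import shutil

# Sizes quoted in this README; "G" is assumed to mean GB (10**9 bytes).
FULL_DATASET_GB = 269
TEST_ONLY_GB = 44


def free_space_gb(path="."):
    """Return the free space, in GB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1e9


def pick_download(free_gb):
    """Suggest which download fits into `free_gb` gigabytes of free space."""
    if free_gb >= FULL_DATASET_GB:
        return "full"
    if free_gb >= TEST_ONLY_GB:
        return "test-only"
    return "none"


if __name__ == "__main__":
    free = free_space_gb(".")
    print(f"{free:.0f} GB free, suggested download: {pick_download(free)}")
```

Run it from the directory where the data will be stored. A result of "test-only" suggests editing scripts/get_dataset.py as described above, and "none" means neither download would fit.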
python3 scripts/pretrain_model.py
bash scripts/run_pretrain.sh
bash scripts/train.sh
bash scripts/eval.sh
If you find this work useful in your research, please cite:
@InProceedings{zhao2023paragon,
author = {Zhao, Zirui and Lee, Wee Sun and Hsu, David},
title = {Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement},
booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation},
year = {2023}
}