This repository contains the training and evaluation code for VTool-R1.
It is implemented on top of verl (pinned to commit 498c988ab7af49aa36c157c9214ebbc780013d61) with vLLM 0.17.
The implementation uses verl's asynchronous agent loop, enabling maximal GPU utilization during multi-turn tool use.
All datasets are available on our Hugging Face page.
VTool-R1 is implemented as a verl recipe, available in recipe/vtool.
Training scripts for the 3B and 7B models are available:
bash run_qwen2_5_vl_3b_chart.sh
bash run_qwen2_5_vl_7b_chart.sh
The scripts are configured for a single node with 4x H100 GPUs.
As stated in our original paper, training requires an additional LLM to serve as the reward judge. We provide a much more efficient strategy for hosting it.
By default, the training script attempts to connect to a vllm serve endpoint at localhost:8000/v1; you may configure it to use another GPU node instead.
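Before launching training, it is worth verifying that the judge endpoint is actually up. A small helper along these lines can poll it (the /v1/models route is the standard listing endpoint of any OpenAI-compatible server such as vllm serve; the retry count and sleep interval are arbitrary choices, not values from the repo):

```shell
# Poll an OpenAI-compatible endpoint until its /v1/models route responds,
# or give up after N tries. Usage: wait_for_endpoint http://localhost:8000 60
wait_for_endpoint() {
    local url="$1" tries="${2:-60}"
    local i
    for i in $(seq 1 "$tries"); do
        # -s silences progress output; -f makes curl fail on HTTP errors
        if curl -sf "$url/v1/models" > /dev/null; then
            return 0
        fi
        sleep 2
    done
    return 1
}
```

For example, `wait_for_endpoint http://localhost:8000 60` blocks until the judge is serving or two minutes have passed.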
You may either use your CPUs to host the LLM judge, or partition a small slice of GPU memory for it.
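As a sketch of both options (the judge model names, GPU index, memory fraction, and the llama.cpp choice for CPU serving are all illustrative assumptions, not part of the recipe):

```shell
# Option A: carve a small slice of one GPU out for the judge.
# --gpu-memory-utilization is vLLM's cap on the fraction of GPU memory it
# may claim; the 0.2 value and model choice here are assumptions.
CUDA_VISIBLE_DEVICES=0 vllm serve Qwen/Qwen2.5-7B-Instruct \
    --gpu-memory-utilization 0.2 \
    --port 8000

# Option B: host the judge entirely on CPU, e.g. with llama.cpp's
# OpenAI-compatible server (the GGUF model path is a placeholder).
llama-server -m ./judge-model.gguf --port 8000
```

Either way, the training script only sees an OpenAI-compatible endpoint, so the two options are interchangeable from its point of view.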
In addition, we provide a hybrid solution in the form of load_balance.sh: run both the CPU judge and the small GPU judge, and route requests between them through the load balancer. Our routing ratio was tuned for optimal performance on a GH200 superchip node (4x H100).
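One way such a hybrid balancer could be wired up is a weighted round-robin reverse proxy; the sketch below uses nginx, with ports and a 1:4 weight ratio chosen purely for illustration (load_balance.sh in the repo holds the tested configuration):

```shell
# Write a minimal nginx config that splits judge traffic between a
# CPU-hosted judge and a small GPU-hosted judge. Ports (8001/8002) and
# the 1:4 weight ratio are illustrative assumptions, not tuned values.
cat > judge_lb.conf <<'EOF'
events {}
http {
    upstream judges {
        server 127.0.0.1:8001 weight=1;  # CPU judge
        server 127.0.0.1:8002 weight=4;  # GPU judge
    }
    server {
        listen 8000;  # the endpoint the training script connects to
        location / {
            proxy_pass http://judges;
        }
    }
}
EOF
# nginx -c "$PWD/judge_lb.conf"   # start nginx with this config
```

Because nginx's `weight` parameter controls how many requests each upstream receives per round, adjusting the two weights reproduces whatever CPU:GPU ratio works best on your hardware.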
Evaluation scripts are in the eval folder.
Please launch vllm_infer.sh to start the target model with vllm serve. Then, use run_eval.sh to complete the evaluation rollout.
Finally, score the results with score_results.sh. You may use your own OpenAI API key, or GPT-OSS-120B served through vLLM.
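The two scoring backends might be selected like this; the environment-variable names follow the standard OpenAI client conventions (OPENAI_API_KEY, OPENAI_BASE_URL), and it is an assumption that score_results.sh reads them rather than its own flags:

```shell
# Score with the hosted OpenAI API (the key value is a placeholder):
export OPENAI_API_KEY=sk-...
bash score_results.sh

# Or score with a locally served GPT-OSS-120B by pointing the
# OpenAI-compatible client at your vllm serve endpoint instead.
# The port and the dummy key are assumptions.
export OPENAI_BASE_URL=http://localhost:8000/v1
export OPENAI_API_KEY=EMPTY
bash score_results.sh
```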