GitHub - XiangshengGu/ActionVLM: This project explores the use of large foundational vision-language models in reinforcement learning, where the models function as agents, reward functions, or reward function code generators in unseen environments given a state and a goal.

Foundation models across many modalities have generalized far better over the past few years. These gains stem from several factors, including web-scale data, larger parameter counts, and training techniques such as instruction tuning. Reinforcement learning methods, by contrast, have not yet achieved comparable generalization across multiple environments without fine-tuning. In this project, we explore whether the ability of large foundational vision-language models to generalize beyond their training data carries over to reinforcement learning. Specifically, given a state and a goal, we have the models act as agents, reward functions, or reward function code generators in unseen environments.
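The three roles described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code shipped in `code_implementations.zip`: the `VLM` callable type, the prompt wording, the function names, and `stub_vlm` are all hypothetical stand-ins, and a plain list of floats plays the part of the observed state where a real setup would pass an image.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a vision-language model: (state, prompt) -> text reply.
VLM = Callable[[Sequence[float], str], str]


def vlm_as_agent(vlm: VLM, state: Sequence[float], goal: str, actions: List[str]) -> str:
    """Role 1: the model picks the next action directly."""
    prompt = f"Goal: {goal}. Choose one action from {actions}."
    reply = vlm(state, prompt).strip()
    # Fall back to the first action if the reply is not a valid choice.
    return reply if reply in actions else actions[0]


def vlm_as_reward(vlm: VLM, state: Sequence[float], goal: str) -> float:
    """Role 2: the model scores how well the state satisfies the goal."""
    prompt = f"Goal: {goal}. Rate progress from 0 to 1."
    try:
        score = float(vlm(state, prompt))
    except ValueError:
        return 0.0  # Unparseable reply is treated as no reward.
    return min(1.0, max(0.0, score))


def vlm_as_reward_codegen(vlm: VLM, goal: str) -> Callable[[Sequence[float]], float]:
    """Role 3: the model writes reward code once; the code then scores states."""
    prompt = f"Goal: {goal}. Write a Python function reward(state) -> float."
    source = vlm((), prompt)
    namespace: Dict[str, object] = {}
    # Illustration only: executing model output requires sandboxing in practice.
    exec(source, namespace)
    return namespace["reward"]  # type: ignore[return-value]


def stub_vlm(state: Sequence[float], prompt: str) -> str:
    """Hypothetical deterministic stub so the sketch runs without a real model."""
    if "Choose one action" in prompt:
        return "right" if state[0] < 0 else "left"
    if "Rate progress" in prompt:
        return str(1.0 - min(1.0, abs(state[0])))
    return "def reward(state):\n    return -abs(state[0])\n"


if __name__ == "__main__":
    goal = "move the cart to x = 0"
    print(vlm_as_agent(stub_vlm, [-0.5], goal, ["left", "right"]))
    print(vlm_as_reward(stub_vlm, [0.25], goal))
    print(vlm_as_reward_codegen(stub_vlm, goal)([2.0]))
```

One practical difference between the roles: the agent and reward-function roles query the model at every step, whereas the code-generator role queries it once and then evaluates ordinary Python, which is far cheaper per step but depends on the generated code being correct.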

Folders and files (5 commits)

Name
ActionVLM_Paper.pdf
LICENSE
README.md
code_implementations.zip
