XiangshengGu/ActionVLM

Foundation models across modalities have seen unparalleled gains in generalization over the past few years, driven by factors including web-scale data, larger parameter counts, and training techniques such as instruction tuning. Reinforcement learning, by contrast, has not yet achieved comparable generalization across multiple environments without fine-tuning. In this project, we explore whether the generalization inherent to foundation vision-language models can be transferred to reinforcement learning. Specifically, we ask whether large vision-language models can generalize beyond their training data by acting as agents, reward functions, or reward-function code generators in unseen environments, given a state and a goal.
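The reward-function role described above can be sketched as follows. This is a minimal, hypothetical example, not the project's actual implementation: the model call is stubbed out, and the prompt format, JSON reply schema, and state description are all assumptions, since a real run would send a rendered environment frame to a vision-language model.

```python
import json

def build_reward_prompt(state_description: str, goal: str) -> str:
    # Ask the model to score progress toward the goal on [0, 1].
    return (
        "You are a reward function for a reinforcement learning agent.\n"
        f"Current state: {state_description}\n"
        f"Goal: {goal}\n"
        'Respond with JSON like {"reward": <float between 0 and 1>}.'
    )

def parse_reward(model_output: str) -> float:
    # Extract the scalar reward from the model's JSON reply and
    # clamp it to [0, 1] to guard against out-of-range outputs.
    reward = float(json.loads(model_output)["reward"])
    return max(0.0, min(1.0, reward))

def fake_vlm(prompt: str) -> str:
    # Stand-in for a vision-language model query; a real system would
    # pass the environment observation (e.g. an image) alongside the prompt.
    return '{"reward": 0.75}'

prompt = build_reward_prompt(
    "cart near center, pole tilted 5 degrees", "balance the pole"
)
reward = parse_reward(fake_vlm(prompt))  # dense scalar reward for the RL loop
```

The agent and code-generator roles differ only in what the model is asked to emit: an action label instead of a reward, or the source code of a reward function to be executed inside the environment.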

About

This project explores the use of large foundational vision-language models in reinforcement learning, where the models function as agents, reward functions, or reward function code generators in unseen environments given a state and a goal.
