Thanks for your interesting work and for sharing the code.
The README only shows how to generate a caption for one image at a time (batch size = 1). Could you (@Yushi-Hu) explain how to generate captions in batches, i.e. pass multiple questions and their corresponding images in a single call, rather than calling the model once per image in a loop? This would improve time efficiency.