Hi! Thanks for presenting such solid work!
I am curious how your evaluation datasets are organized, e.g., how many videos were selected from the LOVEU-TGVE and UCF Sports Action datasets, and how the original and target prompts are defined. Could you share your organized evaluation dataset?