Thank you for your great work! Videos in the MSVD dataset have different numbers of captions, such as 29 or 42. However, I found that the shape of msvd_train_evalscores.pkl is [1200, 17]. Why are there scores for only 17 captions per video in the training set?