Thanks for your great work on Dual Memory Networks. I am trying to reproduce the results in Table 3 of your paper using the code from your GitHub repository.
However, I noticed some differences between the numbers in the provided log files and those in the paper. For example, on DTD, the log file camera_ready_dmn_tf_searched_vit_0.txt reports 54.61% accuracy, while the paper reports 55.85%, a gap of 1.24 percentage points. I see a similar gap of roughly 1-2 percentage points on EuroSAT.
I would like to confirm if there are any differences in the experimental setup that I might have missed. Specifically, I am wondering about:
- Whether the reported results in the paper are averaged over multiple runs with different random seeds?
- Any specific data preprocessing steps (e.g., multi-scale testing, multi-crop, test-time augmentation) that might not be in the default code?
- Whether you used any ensemble of different models or hyperparameters for the final numbers?
- Any other hyperparameters, such as beta, selection_p, or memory_size, that were tuned per dataset?
Thank you for your time and help!