Missing eprint support in BibTeX style breaks Google Scholar indexing for arXiv citations #5
Description
Hi!
I am writing to report a critical bug in the current BibTeX style file (iclr2026_conference.bst; the earlier versions appear to be affected as well) regarding the handling of arXiv preprints.
The Issue:
The current .bst file does not support the eprint, archivePrefix, or primaryClass fields. However, the standard BibTeX export provided by arXiv.org relies heavily on these fields and often does not include a url field by default.
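To see how many references in a given .bib file are affected, a small script can list the entries that carry an eprint field but no url or doi. This is only a rough sketch: the function names and the sample keys are hypothetical, and the regex-based parsing assumes flat `field={value}` pairs with at most one level of nested braces and a closing brace on its own line, as in arXiv's export. It is not a full BibTeX parser.

```python
import re

# Assumption: each entry ends with "}" at the start of its own line and
# fields look like name={value} with at most one nested brace level.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
FIELD_RE = re.compile(r"(\w+)\s*=\s*\{((?:[^{}]|\{[^{}]*\})*)\}")


def parse_entries(bib_text):
    """Map each citation key to a dict of its lowercased field names."""
    entries = {}
    for _entry_type, key, body in ENTRY_RE.findall(bib_text):
        entries[key] = {name.lower(): value for name, value in FIELD_RE.findall(body)}
    return entries


def eprint_only_keys(bib_text):
    """Return keys of entries that have eprint but neither url nor doi."""
    return [
        key
        for key, fields in parse_entries(bib_text).items()
        if "eprint" in fields and "url" not in fields and "doi" not in fields
    ]


# Hypothetical sample: the first entry relies on eprint alone.
SAMPLE = """@misc{example2024a,
  title={A {Nested} Title},
  eprint={2403.03234},
  archivePrefix={arXiv},
}
@misc{example2024b,
  title={Another Title},
  eprint={2403.03234},
  url={https://arxiv.org/abs/2403.03234},
}
"""

if __name__ == "__main__":
    print(eprint_only_keys(SAMPLE))
```

Entries reported by such a check are the ones most at risk of losing their arXiv ID with a style file that ignores eprint.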
Consequences:
When authors cite arXiv papers using the standard export, the generated PDF omits the arXiv ID entirely. It renders only the Author, Title, and Year. Because the arXiv ID is missing from the PDF text, automated indexing systems (like Google Scholar and Semantic Scholar) fail to recognize the citation correctly. This leads to broken citation graphs and "lost" citations for the community.
Example:
Last year we had a paper at ICLR 2025 and used the 2025 version of the template. We cited the paper "Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling" using the standard citation from the arXiv "Export BibTeX Citation" interface:
@misc{schiff2024caduceusbidirectionalequivariantlongrange,
      title={Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling},
      author={Yair Schiff and Chia-Hsiang Kao and Aaron Gokaslan and Tri Dao and Albert Gu and Volodymyr Kuleshov},
      year={2024},
      eprint={2403.03234},
      archivePrefix={arXiv},
      primaryClass={q-bio.GN},
      url={https://arxiv.org/abs/2403.03234},
}
In the paper itself, the citation is rendered as follows:
However, a citation in this form is not visible to academic search engines and databases such as Google Scholar and Semantic Scholar. They cannot parse it, so authors whose arXiv papers are cited in papers built with the ICLR conference or workshop templates cannot see that they were cited.
Proposed Solution:
I have patched the .bst file using the standard urlbst utility (the same tool used to generate ACL/EMNLP style files).
The fix adds support for:
- eprint and archivePrefix fields (rendering as arXiv:ID).
- Modern doi field handling (making DOIs clickable).
- Backwards compatibility (entries with manual url fields still work).
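As a rough illustration of the intended rendering (not the actual urlbst-generated BibTeX code), the eprint handling can be sketched in Python. The function name `render_eprint`, the lowercased field keys, and the fallback to an "arXiv" prefix when archivePrefix is absent are all assumptions made for this sketch.

```python
def render_eprint(fields):
    """Return (label, url) for an entry's eprint field, or None if it has none.

    `fields` is a dict of lowercased BibTeX field names to values
    (an assumption of this sketch, not something the .bst file uses).
    """
    eprint = fields.get("eprint", "").strip()
    if not eprint:
        return None
    # Assumption: treat a missing archivePrefix as arXiv.
    prefix = fields.get("archiveprefix", "arXiv").strip()
    label = f"{prefix}:{eprint}"
    # Only arXiv identifiers are turned into a link in this sketch.
    url = f"https://arxiv.org/abs/{eprint}" if prefix.lower() == "arxiv" else None
    return label, url


if __name__ == "__main__":
    entry = {"eprint": "2403.03234", "archiveprefix": "arXiv", "primaryclass": "q-bio.GN"}
    print(render_eprint(entry))
```

For the Caduceus entry above, this yields the label arXiv:2403.03234 together with its abstract URL, which is the text that needs to appear in the PDF for the reference to be recognizable.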
After this fix, these arXiv references should look like this:
Or, for papers whose BibTeX entry has no doi field:
I am also submitting a pull request with the patched iclr2026_conference.bst, which should help resolve the issue with arXiv references and citations. Feel free to contact me to correct my solution or to propose a more advanced one.
Best regards,
Iaroslav Chelombitko