Goal
Refactor the homegrown paper fetch layer in src/knowledge/papers_sync.py to use neuromechanist/opencite (published on PyPI), and declare the dependency so GitHub's dependency graph attributes OSA as a downstream of opencite.
Why
papers_sync.py hand-rolls fetching from 3 sources (OpenAlex via pyalex, Semantic Scholar via httpx, PubMed via E-utilities XML). opencite is a maintained superset: 10+ deduplicated sources, a rich Paper model, citation-graph traversal, BibTeX, and PDF retrieval. opencite was inspired by this code and will be the maintained home for paper tooling.
Scope (this issue: sync/fetch layer only)
Replace sync_openalex_papers / sync_semanticscholar_papers / sync_pubmed_papers fetching with opencite's SearchOrchestrator.search(...).
Replace sync_citing_papers with opencite's CitationExplorer.citing_papers(...).
Keep the local SQLite + FTS store and upsert_paper(...) write path unchanged.
Keep the search_<community>_papers tool unchanged (its retrieval bug is fixed separately in #305/#306).
Map OSA's configured API keys (OpenAlex/S2/PubMed) into opencite's Config.
Bridge opencite's async API to the existing synchronous call sites in the sync pipeline.
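One possible shape for the async-to-sync bridge in the Scope list, as a minimal sketch only. FetchedPaper and search_papers are hypothetical stand-ins and not opencite's API; the real SearchOrchestrator.search(...) arguments and the fields of opencite's Paper model should be taken from opencite's documentation. The sketch also assumes the sync pipeline is not already running inside an event loop.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchedPaper:
    """Hypothetical stand-in for opencite's Paper model (fields are assumed)."""

    title: str
    doi: Optional[str] = None


async def search_papers(query: str, max_results: int) -> list[FetchedPaper]:
    """Hypothetical stand-in for an async opencite search call."""
    await asyncio.sleep(0)  # placeholder for awaiting network I/O
    return [FetchedPaper(title=f"{query} result {i}") for i in range(max_results)]


def sync_papers(query: str, max_results: int = 3) -> list[FetchedPaper]:
    """Synchronous entry point that existing call sites could keep using."""
    # asyncio.run creates an event loop, runs the coroutine to completion,
    # and closes the loop before returning the result.
    return asyncio.run(search_papers(query, max_results))


papers = sync_papers("HED", max_results=2)
print([p.title for p in papers])
```

asyncio.run raises RuntimeError when called from a thread that already has a running event loop, so a wrapper like this only suits call sites that are plainly synchronous.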
Attribution
Add opencite>=<latest> to the server optional-dependencies in pyproject.toml. GitHub's dependency graph reads pyproject manifests, so OSA shows up under opencite's "Dependents"/"Used by".
Out of scope (tracked separately)
Live on-demand "search most recent papers" feature.
Exposing citation-graph / canonical / BibTeX as new agent tools.
Acceptance
Paper sync produces equivalent-or-better coverage for existing community queries (HED, EEGLAB) with no schema change.
opencite declared as a dependency and resolvable via uv sync --extra server.
Real tests (no mocks) covering the opencite -> upsert_paper mapping.
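A mock-free test of the mapping could follow the pattern sketched below, using a real in-memory SQLite database. Everything here is assumed for illustration: FetchedPaper stands in for opencite's Paper model, this upsert_paper and the papers table stand in for OSA's actual write path and schema, and the DOI is a made-up example value.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchedPaper:
    """Hypothetical stand-in for opencite's Paper model (fields are assumed)."""

    title: str
    doi: Optional[str]
    year: Optional[int]
    source: str


def upsert_paper(conn, *, source, external_id, title, year):
    """Hypothetical stand-in for OSA's upsert_paper write path."""
    conn.execute(
        "INSERT OR REPLACE INTO papers (source, external_id, title, year) "
        "VALUES (?, ?, ?, ?)",
        (source, external_id, title, year),
    )


def store_fetched_paper(conn, paper: FetchedPaper) -> bool:
    """Map a fetched paper onto the upsert arguments; skip records without a DOI."""
    if not paper.doi:
        return False
    upsert_paper(
        conn,
        source=paper.source,
        external_id=paper.doi.lower(),  # normalise case so repeats collide
        title=paper.title.strip(),
        year=paper.year,
    )
    return True


# Real in-memory database, no mocks. The schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE papers (source TEXT, external_id TEXT, title TEXT, "
    "year INTEGER, PRIMARY KEY (source, external_id))"
)

paper = FetchedPaper(" Example title ", "10.0000/EXAMPLE", 2024, "openalex")
stored_first = store_fetched_paper(conn, paper)
stored_again = store_fetched_paper(conn, paper)  # repeat must not add a row
rows = conn.execute("SELECT external_id, title, year FROM papers").fetchall()
print(rows)
```

Writing the same paper twice and checking that one row remains exercises the upsert behaviour that a repeated sync run would rely on.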