661K-vector semantic indexing across 6 datasets. e5-large-v2 embeddings, FAISS IndexFlatIP. Foundational batch (superseded by batch-02).
Updated Jan 28, 2026 · HTML
Semantic search framework for research archives. FAISS indices, e5-large-v2 embeddings, 4,600+ docs, 10 institutions, methodology docs.
Large-scale semantic indexing pipelines producing 8.35M+ vectors across Wikipedia, ArXiv, and StackExchange using e5-large-v2 and FAISS.
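The descriptions above pair e5-large-v2 embeddings with FAISS IndexFlatIP, an index that performs exact (exhaustive) inner-product search. As a rough illustration of what such an index computes, here is a minimal pure-Python sketch. It uses neither FAISS nor the e5 model: the toy 3-dimensional vectors and the `normalize` and `search_flat_ip` helpers are hypothetical stand-ins, not code from any of the listed repositories. It assumes vectors are L2-normalized before indexing, in which case the inner product equals cosine similarity.

```python
import math

def normalize(vec):
    """Scale a vector to unit L2 norm so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def search_flat_ip(index_vectors, query, k):
    """Exhaustive inner-product search: score every stored vector, return top-k (score, id)."""
    scores = [
        (sum(a * b for a, b in zip(vec, query)), idx)
        for idx, vec in enumerate(index_vectors)
    ]
    # Highest inner product first; ties are broken by the lower id.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return scores[:k]

# Toy stand-ins for document embeddings (real e5-large-v2 vectors have 1024 dimensions).
docs = [normalize(v) for v in ([1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0])]
query = normalize([1.0, 0.1, 0.0])

top = search_flat_ip(docs, query, k=2)
top_ids = [idx for _, idx in top]
print(top_ids)
```

With FAISS itself, the corresponding steps would presumably be `faiss.IndexFlatIP(d)` followed by `index.add(...)` and `index.search(...)` on float32 arrays; whether the listed repositories normalize their vectors first is not stated in their descriptions.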