Rigorous evaluation of contextual retrieval techniques on FinMTEB and FinanceBench: compares 5 embedders × 4 chunking strategies, with bootstrapped confidence intervals.
python benchmarking natural-language-processing information-retrieval pytorch embeddings semantic-search rag vector-search huggingface sentence-transformers retrieval-augmented-generation llm-evaluation contextual-retrieval late-chunking finance-nlp financebench finmteb
Updated May 12, 2026 - Jupyter Notebook
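
The description mentions bootstrapped confidence intervals; for readers unfamiliar with the idea, here is a minimal sketch of a percentile bootstrap over per-query retrieval scores. It is not taken from the repository: the function name `bootstrap_ci`, its parameters, and the example scores are hypothetical, and the repository may use a different bootstrap variant or metric.

```python
import random
from statistics import fmean


def bootstrap_ci(per_query_scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-query scores.

    Hypothetical helper for illustration; returns (mean, lower, upper).
    """
    scores = list(per_query_scores)
    if not scores:
        raise ValueError("need at least one per-query score")
    rng = random.Random(seed)
    n = len(scores)
    # Each replicate resamples the queries with replacement and records the mean.
    replicate_means = sorted(
        fmean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    # Floor-index percentiles of the sorted replicate means (no interpolation).
    lower = replicate_means[int((alpha / 2) * (n_resamples - 1))]
    upper = replicate_means[int((1 - alpha / 2) * (n_resamples - 1))]
    return fmean(scores), lower, upper


# Made-up per-query hit indicators (1 = relevant chunk retrieved) for one
# embedder/chunking combination; real scores would come from the benchmark.
hits = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
mean, lower, upper = bootstrap_ci(hits)
print(f"hit rate = {mean:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

In a 5 × 4 grid, the same function would be applied once per embedder/chunking pair, so that overlapping intervals flag differences that may not be meaningful.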