chthonn/AnchorDocs

AnchorDocs RAG

This project is a local retrieval-augmented generation (RAG) system built with TypeScript and Node.js. It ingests PDF and Markdown files, stores document chunks in Chroma, combines vector search with BM25 keyword retrieval, reranks the results, and generates grounded answers with citations.

The goal is simple: answer questions from your own documents with evidence, instead of relying only on model memory.

What It Does

  • loads .pdf, .md, and .markdown files from data/raw
  • splits content into overlapping chunks
  • stores embeddings in Chroma
  • builds BM25 search from processed chunks
  • combines keyword and vector retrieval
  • reranks results before generation
  • returns answers with citations
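
As an illustration of the "overlapping chunks" step, here is a minimal sketch of a character-based sliding window. The function name `chunkText` and the character-based sizing are assumptions made for this example; the project itself may split by tokens or by document structure.

```typescript
// Hypothetical helper: split text into fixed-size character windows that overlap.
// Each window starts (chunkSize - overlap) characters after the previous one.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }
  const step = chunkSize - overlap; // how far the window advances each time
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

// Adjacent chunks share their last/first two characters.
const exampleChunks = chunkText("abcdefghij", 4, 2);
// exampleChunks is ["abcd", "cdef", "efgh", "ghij"]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.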

Why This Exists

A plain LLM can answer fluently, but not always reliably from your source material. This project adds retrieval first, then asks the model to answer only from the retrieved context. That makes the output easier to inspect and more useful for document-based question answering.

How It Works

  1. Documents are loaded from data/raw.
  2. Text is normalized and chunked.
  3. Chunks are saved to data/processed/chunks.json.
  4. The same chunks are embedded and stored in Chroma.
  5. At query time, the system retrieves candidates from both Chroma (vector search) and BM25 (keyword search).
  6. Results are fused, reranked, and passed to Ollama for the final answer.
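
The steps above do not say how the two result lists are fused. One common option is reciprocal rank fusion (RRF), sketched below. This is an assumption for illustration, not necessarily what the project uses; the function name, the chunk ids, and the constant `k = 60` are all hypothetical.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) to an id's score,
// where rank starts at 1. Ids ranked well in several lists rise to the top.
function reciprocalRankFusion(rankedLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}

const vectorHits = ["c3", "c1", "c7"]; // hypothetical chunk ids from vector search
const bm25Hits = ["c1", "c9", "c3"]; // hypothetical chunk ids from BM25
const fused = reciprocalRankFusion([vectorHits, bm25Hits]);
// fused is ["c1", "c3", "c9", "c7"]
```

RRF only needs ranks, so it avoids having to put cosine similarities and BM25 scores on a common scale. The fused list would then go to the reranker before generation.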

Stack

  • Node.js
  • TypeScript
  • Express
  • Ollama
  • ChromaDB
  • LangChain
  • wink-bm25-text-search
  • @xenova/transformers

Setup

Install dependencies and copy the environment file:

npm install
cp .env.example .env

Default services expected locally:

  • Ollama at http://127.0.0.1:11434
  • Chroma at http://127.0.0.1:8000

Pull the default Ollama models:

ollama pull llama3.1:8b
ollama pull nomic-embed-text

Add your source documents to data/raw.

Commands

Ingest documents:

npm run ingest

Query from the CLI:

npm run query -- "What does the document say about cancellations?"

Run the API server:

npm run dev

Build and run:

npm run build
npm start

Run evaluation:

npm run evaluate

API

GET /health

Returns the active LLM model, the embedding model, and the Chroma collection name.

POST /ingest

Runs ingestion and returns document and chunk counts.

POST /query

Example request:

curl -X POST http://localhost:3000/query \
  -H "Content-Type: application/json" \
  -d '{"question":"Summarize the cancellation terms with citations.","topK":6}'
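
The same request can be sent from TypeScript. The sketch below assumes Node.js 18 or later (for the global `fetch`) and the `http://localhost:3000` address from the curl example. The helper names are hypothetical, and the response is typed as `unknown` because its shape is not documented here.

```typescript
interface QueryRequest {
  question: string;
  topK?: number;
}

interface HttpRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build the POST /query request without sending it.
function buildQueryRequest(baseUrl: string, body: QueryRequest): HttpRequest {
  return {
    url: new URL("/query", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: send the request and return the parsed JSON response.
async function postQuery(baseUrl: string, body: QueryRequest): Promise<unknown> {
  const { url, init } = buildQueryRequest(baseUrl, body);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Query failed with HTTP ${response.status}`);
  }
  return response.json();
}

const exampleRequest = buildQueryRequest("http://localhost:3000", {
  question: "Summarize the cancellation terms with citations.",
  topK: 6,
});
```

With the server running, `postQuery("http://localhost:3000", { question: "...", topK: 6 })` would resolve to whatever JSON the endpoint returns.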

Configuration

Main settings come from .env:

  • OLLAMA_BASE_URL
  • OLLAMA_LLM_MODEL
  • OLLAMA_EMBEDDING_MODEL
  • CHROMA_URL
  • CHROMA_COLLECTION_NAME
  • RAW_DOCS_DIR
  • PROCESSED_DOCS_DIR
  • DEFAULT_TOP_K
  • DEFAULT_VECTOR_K
  • DEFAULT_BM25_K
  • CHUNK_SIZE
  • CHUNK_OVERLAP
  • RERANKER_MODEL
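
A minimal sketch of how some of these settings could be read and validated. The URL, model, and directory fallbacks mirror the defaults mentioned in this README; the numeric fallbacks (`6`, `1000`, `200`) are placeholders chosen for the example, not the project's actual defaults, and `loadConfig` is a hypothetical name. The remaining variables would be read the same way.

```typescript
type Env = Record<string, string | undefined>;

interface RagConfig {
  ollamaBaseUrl: string;
  ollamaLlmModel: string;
  ollamaEmbeddingModel: string;
  chromaUrl: string;
  rawDocsDir: string;
  processedDocsDir: string;
  defaultTopK: number;
  chunkSize: number;
  chunkOverlap: number;
}

// Return the trimmed value, or the fallback when unset or empty.
function readString(env: Env, name: string, fallback: string): string {
  const raw = env[name]?.trim();
  return raw ? raw : fallback;
}

// Parse an integer setting and reject values below `min`.
function readInt(env: Env, name: string, fallback: number, min: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min) {
    throw new Error(`${name} must be an integer >= ${min}, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): RagConfig {
  const config: RagConfig = {
    ollamaBaseUrl: readString(env, "OLLAMA_BASE_URL", "http://127.0.0.1:11434"),
    ollamaLlmModel: readString(env, "OLLAMA_LLM_MODEL", "llama3.1:8b"),
    ollamaEmbeddingModel: readString(env, "OLLAMA_EMBEDDING_MODEL", "nomic-embed-text"),
    chromaUrl: readString(env, "CHROMA_URL", "http://127.0.0.1:8000"),
    rawDocsDir: readString(env, "RAW_DOCS_DIR", "data/raw"),
    processedDocsDir: readString(env, "PROCESSED_DOCS_DIR", "data/processed"),
    defaultTopK: readInt(env, "DEFAULT_TOP_K", 6, 1), // placeholder default
    chunkSize: readInt(env, "CHUNK_SIZE", 1000, 1), // placeholder default
    chunkOverlap: readInt(env, "CHUNK_OVERLAP", 200, 0), // placeholder default
  };
  if (config.chunkOverlap >= config.chunkSize) {
    throw new Error("CHUNK_OVERLAP must be smaller than CHUNK_SIZE");
  }
  return config;
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Failing at startup on a bad value is usually easier to debug than a silent fallback at query time.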

Notes

  • data/processed/chunks.json is required because the BM25 index is built from it at runtime.
  • Retrieval depends on both Chroma and the processed chunk file, so run ingestion before querying.
  • The first query can be slower because the reranker model loads lazily.
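
To illustrate the lazy-loading note, here is a minimal sketch of a memoized async loader. The `lazy` helper and the stand-in reranker are hypothetical; the stand-in does no real model inference and only marks where an expensive load would happen.

```typescript
// Wrap an expensive async loader so it runs at most once, on first use.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load().catch((error) => {
        cached = undefined; // allow a retry after a failed load
        throw error;
      });
    }
    return cached;
  };
}

type Reranker = (query: string, passage: string) => number;

let loadCount = 0;
const getReranker = lazy<Reranker>(async () => {
  loadCount += 1; // stands in for loading model weights
  return (query, passage) => (passage.includes(query) ? 1 : 0);
});

// Both calls share the same pending load, so the loader runs only once.
const firstCall = getReranker();
const secondCall = getReranker();
```

Caching the promise rather than the resolved value means concurrent first requests wait on a single load instead of each starting their own.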
