AnchorDocs RAG

AnchorDocs RAG is a local retrieval-augmented generation (RAG) system built with TypeScript and Node.js. It ingests PDF and Markdown files, stores document chunks in Chroma, combines vector search with BM25 retrieval, reranks the results, and generates grounded answers with citations.

The goal is simple: answer questions from your own documents with evidence, instead of relying only on model memory.

What It Does

loads .pdf, .md, and .markdown files from data/raw
splits content into overlapping chunks
stores embeddings in Chroma
builds BM25 search from processed chunks
combines keyword and vector retrieval
reranks results before generation
returns answers with citations
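
To illustrate what "overlapping chunks" means, here is a minimal character-based sketch. It is not the project's actual splitter (the stack lists LangChain, which may handle this step); `chunkText` and its parameters are hypothetical names that mirror the `CHUNK_SIZE` and `CHUNK_OVERLAP` settings.

```typescript
// Hypothetical helper: split text into fixed-size character windows.
// Each window starts (chunkSize - chunkOverlap) characters after the
// previous one, so neighbouring chunks share chunkOverlap characters.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in the range [0, chunkSize)");
  }

  const step = chunkSize - chunkOverlap;
  const result: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    result.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return result;
}

// Windows of 4 characters sharing 2 characters with their neighbour.
const exampleChunks = chunkText("abcdefghij", 4, 2);
// exampleChunks is ["abcd", "cdef", "efgh", "ghij"]
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.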

Why This Exists

A plain LLM can answer fluently, but its answers are not reliably grounded in your source material. This project retrieves relevant passages first, then asks the model to answer only from that retrieved context. That makes the output easier to inspect and more useful for document-based question answering.
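
As a rough sketch of the "answer only from the retrieved context" idea, the snippet below assembles a prompt from numbered passages so the model can cite them. The `RetrievedChunk` shape, the `buildGroundedPrompt` name, and the prompt wording are all assumptions for illustration, not the project's actual prompt.

```typescript
// Hypothetical shape of a retrieved passage.
interface RetrievedChunk {
  source: string; // assumed: the file the passage came from
  text: string;
}

// Build a prompt that restricts the model to the numbered passages
// and asks it to cite them by number.
function buildGroundedPrompt(question: string, passages: RetrievedChunk[]): string {
  const context = passages
    .map((passage, index) => `[${index + 1}] (${passage.source})\n${passage.text}`)
    .join("\n\n");

  return [
    "Answer the question using only the context below.",
    "Cite supporting passages by their bracketed numbers, for example [1].",
    "If the context does not contain the answer, say that you do not know.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Numbering the passages is what makes the citations checkable: each `[n]` in the answer maps back to one retrieved chunk and its source file.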

How It Works

Documents are loaded from data/raw.
Text is normalized and chunked.
Chunks are saved to data/processed/chunks.json.
The same chunks are embedded and stored in Chroma.
At query time, the system retrieves candidates from both Chroma (vector search) and BM25 (keyword search).
Results are fused, reranked, and passed to Ollama for the final answer.
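
This README does not say how the two result lists are fused. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the project's method. The chunk IDs and the constant `k = 60` are illustrative.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) to a
// document's score, where rank starts at 1 for the top hit.
// Documents found by both retrievers accumulate two contributions.
function reciprocalRankFusion(rankedLists: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map((entry) => entry[0]);
}

// Hypothetical chunk IDs returned by each retriever, best first.
const vectorHits = ["c2", "c7", "c1"];
const bm25Hits = ["c7", "c9", "c2"];
const fusedIds = reciprocalRankFusion([vectorHits, bm25Hits]);
// fusedIds is ["c7", "c2", "c9", "c1"]
```

RRF works on ranks only, so it avoids having to put cosine similarities and BM25 scores on a common scale. The fused list would then go to the reranker.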

Stack

Node.js
TypeScript
Express
Ollama
ChromaDB
LangChain
wink-bm25-text-search
@xenova/transformers

Setup

Install dependencies and copy the environment file:

npm install
cp .env.example .env

Services expected to be running locally by default:

Ollama at http://127.0.0.1:11434
Chroma at http://127.0.0.1:8000

Pull the default Ollama models:

ollama pull llama3.1:8b
ollama pull nomic-embed-text

Add your source documents to data/raw.

Commands

Ingest documents:

npm run ingest

Query from the CLI:

npm run query -- "What does the document say about cancellations?"

Run the API server:

npm run dev

Build and run:

npm run build
npm start

Run evaluation:

npm run evaluate

API

`GET /health`

Returns the active LLM model, embedding model, and Chroma collection name.

`POST /ingest`

Runs ingestion and returns document and chunk counts.

`POST /query`

Example request:

curl -X POST http://localhost:3000/query \
  -H "Content-Type: application/json" \
  -d '{"question":"Summarize the cancellation terms with citations.","topK":6}'
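
The same request can be sent from TypeScript. This is a minimal sketch that assumes Node.js 18 or later (for the global `fetch`) and the port from the curl example. `buildQueryBody` and `askQuestion` are hypothetical helper names, and the response is returned as `unknown` because its shape is not documented here.

```typescript
// Request body fields taken from the curl example.
interface QueryRequestBody {
  question: string;
  topK?: number;
}

// Serialize the request body. The empty-question check is a
// client-side convenience, not documented server behaviour.
function buildQueryBody(question: string, topK?: number): string {
  const trimmed = question.trim();
  if (trimmed.length === 0) {
    throw new Error("question must not be empty");
  }
  const body: QueryRequestBody = { question: trimmed };
  if (topK !== undefined) {
    body.topK = topK;
  }
  return JSON.stringify(body);
}

// POST the question to /query and return the parsed JSON response.
async function askQuestion(
  question: string,
  topK?: number,
  baseUrl: string = "http://localhost:3000", // assumed from the curl example
): Promise<unknown> {
  const response = await fetch(`${baseUrl}/query`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildQueryBody(question, topK),
  });
  if (!response.ok) {
    throw new Error(`Query failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

With the server running, `askQuestion("Summarize the cancellation terms with citations.", 6)` sends the same payload as the curl command.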

Configuration

Main settings come from .env:

OLLAMA_BASE_URL
OLLAMA_LLM_MODEL
OLLAMA_EMBEDDING_MODEL
CHROMA_URL
CHROMA_COLLECTION_NAME
RAW_DOCS_DIR
PROCESSED_DOCS_DIR
DEFAULT_TOP_K
DEFAULT_VECTOR_K
DEFAULT_BM25_K
CHUNK_SIZE
CHUNK_OVERLAP
RERANKER_MODEL
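
A sketch of how a few of these settings could be read and validated is shown below. The URL and model fallbacks come from the Setup section; the numeric fallbacks (1000, 200, 6) are placeholders chosen for illustration, not the project's real defaults. `loadConfig` and `AppConfig` are hypothetical names.

```typescript
// Environment variables as a plain map, for example process.env.
type Env = Record<string, string | undefined>;

interface AppConfig {
  ollamaBaseUrl: string;
  ollamaLlmModel: string;
  ollamaEmbeddingModel: string;
  chromaUrl: string;
  chunkSize: number;
  chunkOverlap: number;
  defaultTopK: number;
}

// Parse an integer setting, falling back when it is unset or blank.
function readInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number.parseInt(raw, 10);
  if (Number.isNaN(value)) {
    throw new Error(`${key} must be an integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): AppConfig {
  const config: AppConfig = {
    // URL and model fallbacks match the Setup section.
    ollamaBaseUrl: env["OLLAMA_BASE_URL"] ?? "http://127.0.0.1:11434",
    ollamaLlmModel: env["OLLAMA_LLM_MODEL"] ?? "llama3.1:8b",
    ollamaEmbeddingModel: env["OLLAMA_EMBEDDING_MODEL"] ?? "nomic-embed-text",
    chromaUrl: env["CHROMA_URL"] ?? "http://127.0.0.1:8000",
    // Numeric fallbacks below are assumed placeholder values.
    chunkSize: readInt(env, "CHUNK_SIZE", 1000),
    chunkOverlap: readInt(env, "CHUNK_OVERLAP", 200),
    defaultTopK: readInt(env, "DEFAULT_TOP_K", 6),
  };
  if (config.chunkOverlap >= config.chunkSize) {
    throw new Error("CHUNK_OVERLAP must be smaller than CHUNK_SIZE");
  }
  return config;
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Failing at startup on a malformed value is usually easier to debug than a bad chunk size surfacing during ingestion.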

Notes

data/processed/chunks.json is required because BM25 is built from it at runtime.
Chroma and the processed chunk file are both part of retrieval.
The first query can be slower because the reranker loads lazily.
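
Lazy loading of this kind can be sketched with a small memoizing wrapper. This illustrates the general pattern only; `lazy`, `getReranker`, and the stand-in object are hypothetical, and the real project would load the model named by `RERANKER_MODEL` instead.

```typescript
// Wrap an expensive async factory so it runs at most once.
// Later calls reuse the promise created by the first call.
// Note: a rejected promise is cached too, so a failed load is not retried.
function lazy<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = factory();
    }
    return cached;
  };
}

// Counts how often the stand-in "model load" actually runs.
let loadCount = 0;

const getReranker = lazy(async () => {
  loadCount += 1; // stands in for an expensive model load
  return { label: "stand-in reranker" };
});
```

The first `await getReranker()` pays the load cost, which is why the first query is slower; every later call resolves from the cached promise.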
