luffy06/RAG-Tutorials


RAG Tutorials

This repository contains a small tutorial-style codebase for several RAG fusion strategies. The current focus is not completeness, but getting each method into a minimal runnable form that can be extended later.

What is implemented

  • none: no retrieval augmentation, plain base model evaluation.
  • query: BM25 retrieval + prompt template insertion via {context}.
  • logits: BM25 retrieval + neighbor next-token aggregation, blended with the base model distribution as lambda * p + (1 - lambda) * q.
  • latent: FAISS retrieval + embedding-weighted latent vector + trainable projection injected into QKV modules.
  • parametric: document-level LoRA adapters trained from retrieved neighbor documents and fused at inference time.

Repository layout

  • src/main.py: unified inference entry for none/query/logits/latent/parametric.
  • src/train_latent.py: training script for latent fusion adapters.
  • src/train_parametric.py: training script for document-level parametric LoRA adapters.
  • src/fusion/: fusion implementations.
  • src/retriever/: BM25 and FAISS retrievers.
  • scripts/: ready-to-run shell scripts for training and inference.

Environment

The code is expected to run inside an existing conda environment named ragdemo:

conda activate ragdemo

Inference scripts

Plain baseline:

bash scripts/run.sh

Query fusion:

bash scripts/run_query.sh

Logits fusion:

bash scripts/run_logits.sh

Latent fusion inference:

bash scripts/run_latent.sh

Parametric fusion inference:

bash scripts/run_parametric.sh

Training scripts

Train the latent fusion adapter:

bash scripts/train_latent.sh

Train the parametric document-level LoRA bank:

bash scripts/train_parametric.sh

Main CLI

The unified entrypoint is:

python -m src.main \
  --dataset hotpotqa/hotpot_qa \
  --config distractor \
  --split validation \
  --model-name Qwen/Qwen2.5-1.5B \
  --fusion none

Important arguments:

  • --fusion: one of none, query, logits, latent, parametric
  • --retriever: bm25 or faiss
  • --encoder-model-name: required for faiss
  • --user-prompt: prompt template; for query fusion, include a {context} placeholder where the retrieved text should be inserted
  • --top-k: number of retrieved neighbors
  • --max-samples: evaluate only a small subset for smoke tests
  • --logits-lambda: blend weight used by logits fusion
  • --latent-checkpoint: trained latent adapter checkpoint
  • --parametric-checkpoint: trained parametric adapter bank
  • --lora-rank, --lora-alpha: LoRA configuration for parametric fusion
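As a rough illustration of how a few of these flags interact, the sketch below declares them with argparse and enforces the rule that faiss needs an encoder. It is not the code in src/main.py: the defaults, the helper names build_parser and parse_args, and the encoder name some/encoder are all assumptions made for this example.

```python
import argparse


def build_parser():
    # Hypothetical parser: flag names follow the list above, defaults are assumed.
    parser = argparse.ArgumentParser(description="Sketch of the unified entrypoint flags")
    parser.add_argument("--fusion", default="none",
                        choices=["none", "query", "logits", "latent", "parametric"])
    parser.add_argument("--retriever", default="bm25", choices=["bm25", "faiss"])
    parser.add_argument("--encoder-model-name", default=None)
    parser.add_argument("--top-k", type=int, default=5)
    parser.add_argument("--max-samples", type=int, default=None)
    parser.add_argument("--logits-lambda", type=float, default=0.5)
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # FAISS retrieval needs an embedding model, so reject the combination early.
    if args.retriever == "faiss" and args.encoder_model_name is None:
        parser.error("--encoder-model-name is required when --retriever is faiss")
    return args


# "some/encoder" is a placeholder, not a real model name.
args = parse_args([
    "--fusion", "latent",
    "--retriever", "faiss",
    "--encoder-model-name", "some/encoder",
    "--top-k", "3",
])
print(args.fusion, args.retriever, args.top_k)
```

Validating the retriever/encoder combination at parse time means a misconfigured run fails before any model is loaded.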

Method notes

Query fusion

This path currently implements the simplest form: retrieve with BM25, insert retrieved text into the reserved {context} slot in the user prompt, and then run generation.

Example prompt template:

Use the retrieved context to answer the question.

Question: {question}

{context}

Answer:
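A minimal sketch of the slot filling, assuming the template is filled with Python's str.format. The helper name build_prompt, the passage numbering, and the sample question and passages are illustrative and not taken from the repository.

```python
def build_prompt(template: str, question: str, passages: list[str]) -> str:
    """Fill the {question} and {context} slots of a prompt template."""
    # Number the passages so the model can tell them apart (an assumed convention).
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return template.format(question=question, context=context)


TEMPLATE = (
    "Use the retrieved context to answer the question.\n\n"
    "Question: {question}\n\n"
    "{context}\n\n"
    "Answer:"
)

prompt = build_prompt(
    TEMPLATE,
    question="Who wrote Hamlet?",
    passages=[
        "Hamlet is a tragedy by William Shakespeare.",
        "It was written around 1600.",
    ],
)
print(prompt)
```

With str.format, any other literal braces in the template would need to be doubled ({{ and }}); braces inside the retrieved passages are safe because only the template is parsed.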

Logits fusion

This is a minimal version. For each query, the pipeline:

  1. retrieves neighbors with BM25
  2. builds one augmented prompt per neighbor
  3. reads the next-token distribution from each neighbor prompt
  4. weights each neighbor's next-token distribution by its retrieval score
  5. blends the neighbor distribution with the base model distribution
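Steps 4 and 5 can be sketched on toy distributions as follows. This assumes that p is the score-weighted neighbor distribution and q is the base model distribution, and that retrieval scores are normalized with a softmax; the repository may assign p and q or normalize scores differently. The function names are illustrative.

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_next_token(base_probs, neighbor_probs, retrieval_scores, lam):
    """Return lam * p + (1 - lam) * q over the vocabulary.

    p: neighbor distributions averaged with softmax(retrieval_scores) (assumed).
    q: base model next-token distribution (assumed).
    """
    weights = softmax(retrieval_scores)
    vocab_size = len(base_probs)
    p = [
        sum(w * probs[t] for w, probs in zip(weights, neighbor_probs))
        for t in range(vocab_size)
    ]
    return [lam * p[t] + (1 - lam) * base_probs[t] for t in range(vocab_size)]


# Toy example with a 3-token vocabulary and two retrieved neighbors.
base = [0.5, 0.3, 0.2]
neighbors = [[0.1, 0.8, 0.1], [0.2, 0.6, 0.2]]
scores = [2.0, 1.0]  # e.g. BM25 scores

fused = fuse_next_token(base, neighbors, scores, lam=0.5)
print(fused)
```

Because both inputs are probability distributions and the weights sum to one, the fused vector is also a valid distribution. Under this assignment of p and q, lam = 0 recovers the base model.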

Latent fusion

This path currently assumes:

  • retrieval uses FAISS with sentence-transformer embeddings
  • the base model is frozen
  • only the latent projection layers are trained
  • the weighted retrieval embedding is injected into QKV projection outputs
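The injection described in the last assumption can be sketched with plain Python lists. The sketch assumes softmax weighting of retrieval scores, a single linear projection per module, and additive injection broadcast over all token positions; none of these details are confirmed by the description above, and the function names are illustrative.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_embedding(embeddings, scores):
    """Score-weighted average of the retrieved document embeddings."""
    weights = softmax(scores)
    dim = len(embeddings[0])
    return [sum(w * e[i] for w, e in zip(weights, embeddings)) for i in range(dim)]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inject(proj_output, latent, projection):
    """Add the projected latent vector to every token of one Q, K or V output.

    proj_output: [seq_len][hidden] output of a frozen projection module.
    projection:  [hidden][embed_dim] trainable matrix (the only trained part).
    """
    delta = matvec(projection, latent)
    return [[h + d for h, d in zip(token, delta)] for token in proj_output]


# Toy shapes: 2 neighbors, embedding dim 2, hidden size 3, sequence length 2.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
scores = [0.0, 0.0]  # equal scores give equal weights
latent = weighted_embedding(embeddings, scores)

projection = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
q_out = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
q_fused = inject(q_out, latent, projection)
print(latent, q_fused)
```

In this sketch a projection of all zeros leaves the frozen model's outputs unchanged.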

Parametric fusion

This path currently assumes:

  • each document owns one LoRA adapter
  • training uses retrieved neighbor documents to help reconstruct the target document
  • inference retrieves relevant documents, loads their adapters, and computes a weighted average adapter before generation
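The adapter fusion in the last assumption can be sketched as a weighted average over parameter tensors. Here each adapter is a dict from a parameter name to a flat list of floats, and the weights are a softmax over retrieval scores. The storage format, the names lora_A and lora_B, and the weighting scheme are assumptions for illustration.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def average_adapters(adapters, scores):
    """Weighted average of adapters that share parameter names and shapes."""
    weights = softmax(scores)
    fused = {}
    for name in adapters[0]:
        size = len(adapters[0][name])
        fused[name] = [
            sum(w * adapter[name][i] for w, adapter in zip(weights, adapters))
            for i in range(size)
        ]
    return fused


# Two toy document adapters with flattened LoRA matrices.
doc_adapters = [
    {"lora_A": [1.0, 2.0], "lora_B": [0.0, 4.0]},
    {"lora_A": [3.0, 4.0], "lora_B": [2.0, 0.0]},
]
retrieval_scores = [1.0, 1.0]  # equal scores reduce to a plain mean

fused_adapter = average_adapters(doc_adapters, retrieval_scores)
print(fused_adapter)
```

Averaging the A and B matrices separately is not the same as averaging the low-rank updates B·A; the description above does not say which variant is used, so this sketch takes the simpler element-wise route.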

Status

This repository is still tutorial code. The implementations are intentionally simple and are meant to be iterated on method by method.

About

[Artificial Intelligence Review 2026] Retrieval-augmented generation for natural language processing: A survey
