sandeepkumar9760/Docuquery-rag

📚 RAG Document Q&A System

Author: Sandeep Kumar


What It Does

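Upload a PDF. The system splits it into chunks, embeds each chunk with OpenAI's text-embedding-3-small model, and stores the vectors in a local FAISS vector store. When you ask a question, it retrieves the most relevant chunks, and GPT-4o-mini generates an answer grounded in those chunks, with source citations.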
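As a rough illustration of what "grounded answers with source citations" can mean in practice, the sketch below assembles a prompt from retrieved chunks. The function name `build_prompt`, the `(page, text)` tuple shape, and the prompt wording are all assumptions made for this example; the actual prompt used in rag_engine.py is not shown in this README and may differ.

```python
def build_prompt(question, chunks):
    """Assemble a grounded prompt from retrieved chunks.

    chunks is a list of (page_number, text) tuples (hypothetical shape).
    Each chunk gets a bracketed number so the model can cite it.
    """
    context = "\n\n".join(
        f"[{i}] (page {page}) {text}"
        for i, (page, text) in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    demo = build_prompt(
        "What is FAISS used for?",
        [(3, "FAISS stores vectors and supports similarity search.")],
    )
    print(demo)
```

Numbering the chunks in the prompt is one simple way to let the answer point back at the pages it relied on.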


Key Results

Component      Choice                            Why
Embeddings     text-embedding-3-small            Best cost/quality ratio
Vector Store   FAISS                             Fast local similarity search
LLM            GPT-4o-mini                       Fast, cheap, accurate
Chunking       RecursiveCharacterTextSplitter    Preserves sentence boundaries
Framework      LangChain                         Industry standard RAG tooling
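To show why recursive splitting tends to preserve sentence boundaries, here is a simplified pure-Python sketch of the idea. It is not LangChain's implementation: `split_text` and its default separators are assumptions for this example, and it omits features such as chunk overlap. It tries coarse separators first (paragraphs, then lines, then sentences, then words) and only falls back to a hard cut when nothing else fits.

```python
def split_text(text, chunk_size=500, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most chunk_size characters.

    Coarser separators are tried first, so chunks break at paragraph or
    sentence boundaries whenever possible. Simplified sketch: a separator
    that falls on a chunk boundary is dropped, and there is no overlap.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []

    for idx, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks, current = [], ""
        for part in text.split(sep):
            candidate = current + sep + part if current else part
            if len(candidate) <= chunk_size:
                current = candidate  # keep packing pieces into this chunk
                continue
            if current:
                chunks.append(current)
            if len(part) > chunk_size:
                # Piece is still too long: recurse with finer separators.
                chunks.extend(split_text(part, chunk_size, separators[idx + 1:]))
                current = ""
            else:
                current = part
        if current:
            chunks.append(current)
        return chunks

    # No separator present at all: fall back to a hard character cut.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


if __name__ == "__main__":
    sample = ("Sentence number one. " * 60).strip()
    for chunk in split_text(sample, chunk_size=500):
        print(len(chunk), repr(chunk[:40]))
```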

Project Structure

project4-rag/
├── app.py           ← Streamlit UI (upload, chat, sources)
├── rag_engine.py    ← Core RAG pipeline (chunk, embed, retrieve, generate)
├── .env             ← API keys (never commit this)
├── .gitignore
├── requirements.txt
└── README.md

Quickstart

1. Install dependencies

pip install -r requirements.txt

2. Add your OpenAI API key

Open .env and replace the placeholder:

OPENAI_API_KEY=sk-your-actual-key-here
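For readers curious how a key in .env can reach the Python process, the following is a minimal standard-library sketch. The helper names `load_env_file` and `require_api_key` are hypothetical; the app itself may well use a package such as python-dotenv instead, which this README does not confirm.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a .env file into os.environ.

    Variables already set in the environment are left untouched.
    Blank lines and lines starting with '#' are ignored.
    """
    env_path = Path(path)
    if not env_path.exists():
        return
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def require_api_key():
    """Return the OpenAI key, failing early if it is missing or a placeholder."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key or key == "sk-your-actual-key-here":
        raise RuntimeError("Set OPENAI_API_KEY in .env before running the app.")
    return key
```

Failing early with a clear message is usually friendlier than letting the first embedding request fail with an authentication error.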

3. Run the app

streamlit run app.py

4. Use it

  • Upload any PDF in the sidebar
  • Click Build Vector Store
  • Ask questions in the chat box

How RAG Works

PDF
 └─► PyPDFLoader (extract text per page)
      └─► RecursiveCharacterTextSplitter (chunk into ~500 char pieces)
           └─► OpenAI text-embedding-3-small (embed each chunk → vector)
                └─► FAISS (store all vectors locally)
                     └─► User asks question
                          └─► Embed question → similarity search → top-4 chunks
                               └─► GPT-4o-mini (answer grounded in chunks)
                                    └─► Display answer + source citations
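The "similarity search → top-4 chunks" step in the diagram can be sketched without any external libraries. The example below uses tiny hand-made vectors and cosine similarity purely for illustration; the real pipeline uses OpenAI embeddings and a FAISS index, whose exact distance metric is not stated in this README. The names `cosine` and `top_k` are assumptions for this sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=4):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embeddings have many more dimensions.
    chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0], [0.5, 0.5]]
    query_vec = [1.0, 0.0]
    print(top_k(query_vec, chunk_vecs, k=4))
```

A brute-force scan like this is fine for a handful of chunks; a vector store such as FAISS exists to make the same lookup fast when there are many.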
