A full-stack AI chatbot that ingests PDF documents, stores embeddings in a vector database, and answers questions using RAG (Retrieval-Augmented Generation). Built with LangChain, LangGraph, Next.js, and Supabase.
- PDF Ingestion — Upload and parse PDFs, store vector embeddings in Supabase
- Smart Query Routing — Automatically decides whether to retrieve documents or answer directly
- Streaming Responses — Real-time SSE-based chat with response chunks
- Source Citations — View which documents were used to generate answers
- Multi-turn Conversation — Message history preserved across turns
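To illustrate what consuming an SSE-based stream involves, here is a minimal sketch of a parser for the standard `event:` / `data:` text format. The helper name `parseSse` and the `SseEvent` shape are hypothetical and not taken from this project's source; the event names the LangGraph server actually emits are not specified here.

```typescript
// Hypothetical helper for illustration; not part of the project's code.
interface SseEvent {
  event: string; // event name, "message" when the stream omits it
  data: string; // payload; multiple data: lines are joined with "\n"
}

// Splits raw SSE text into complete events plus the trailing partial event,
// which the caller should prepend to the next network chunk.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const rest = blocks.pop() ?? ""; // last piece is incomplete (or empty)
  const events: SseEvent[] = [];

  for (const block of blocks) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // drop one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without any data line carry nothing to dispatch.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

A caller would keep `rest` between reads and pass `rest + nextChunk` back into `parseSse`, so events split across network chunks are reassembled.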
Frontend (Next.js) ──> Backend (LangGraph Server) ──> Supabase (Vector Store)
│ │
│ Upload PDFs ──> Ingestion Graph (embed + store)
│ Ask questions ──> Retrieval Graph (route → retrieve → generate)
│ │
└── SSE Stream <──────────┘
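The route → retrieve → generate flow in the diagram can be sketched in plain TypeScript. This is an illustration under stated assumptions, not the project's LangGraph code: every name here (`GraphState`, `routeQuery`, `retrieve`, `generate`, `runRetrievalGraph`) is hypothetical, a keyword check stands in for the routing decision, substring matching stands in for vector search, and the functions are synchronous where real model and database calls would be asynchronous.

```typescript
// Hypothetical names throughout; the real graphs are built with LangGraph.
type Route = "retrieve" | "direct";

interface GraphState {
  question: string;
  route?: Route;
  documents: string[];
  answer?: string;
}

// Stand-in router: a keyword heuristic replaces the project's routing step.
function routeQuery(state: GraphState): GraphState {
  const needsDocs = /\b(pdf|document|file|paper)\b/i.test(state.question);
  return { ...state, route: needsDocs ? "retrieve" : "direct" };
}

// Stand-in retriever: substring matching replaces the vector similarity search.
function retrieve(state: GraphState, corpus: string[]): GraphState {
  const terms = state.question
    .toLowerCase()
    .split(/\W+/)
    .filter((term) => term.length > 3);
  const documents = corpus.filter((doc) =>
    terms.some((term) => doc.toLowerCase().includes(term)),
  );
  return { ...state, documents };
}

// Stand-in generator: a template string replaces the LLM call.
function generate(state: GraphState): GraphState {
  const answer =
    state.documents.length > 0
      ? `Answer based on ${state.documents.length} retrieved chunk(s).`
      : "Answer given directly, without retrieval.";
  return { ...state, answer };
}

// Runs the nodes in order, skipping retrieval when the router says "direct".
function runRetrievalGraph(question: string, corpus: string[]): GraphState {
  let state: GraphState = { question, documents: [] };
  state = routeQuery(state);
  if (state.route === "retrieve") {
    state = retrieve(state, corpus);
  }
  return generate(state);
}
```

Each node takes the current state and returns a new one, which mirrors the general shape of a state graph: the router only sets a field, and the conditional edge decides which node runs next.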
- Backend: LangGraph agent graphs for ingestion and retrieval
- Frontend: Next.js/React chat UI with file upload
- Vector Store: Supabase with OpenAI embeddings (`text-embedding-3-small`)
- LLM: OpenAI GPT-4o-mini (configurable)
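For intuition about what the vector store contributes, the following in-memory sketch ranks stored chunks by cosine similarity to a query embedding and keeps the top `k`. It assumes cosine similarity as the metric; in this project the search is expected to run inside Supabase, so the `StoredChunk` type and the `cosineSimilarity` and `topK` functions are hypothetical, and the tiny vectors in the usage note below stand in for real embeddings.

```typescript
// Hypothetical types and helpers for illustration only.
interface StoredChunk {
  content: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every chunk against the query and returns the k most similar ones.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

For example, with chunks embedded as `[1, 0]`, `[0, 1]`, and `[1, 1]`, a query of `[1, 0.1]` with `k = 2` returns the first and third chunks, in that order.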
- Node.js v20+
- Yarn
- Supabase project with `documents` table and `match_documents` function (setup guide)
- OpenAI API key
- Clone the repo:
git clone https://github.com/stevez/pdf-chatbot.git
cd pdf-chatbot
- Install dependencies:
yarn install
- Configure environment variables:
Backend (backend/.env):
OPENAI_API_KEY=your-openai-api-key
SUPABASE_URL=your-supabase-url
SUPABASE_SERVICE_ROLE_KEY=your-supabase-service-role-key
Frontend (frontend/.env):
NEXT_PUBLIC_LANGGRAPH_API_URL=http://localhost:2024
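A missing variable usually surfaces later as an opaque API error, so it can help to check the backend settings at startup. The sketch below uses the three backend variable names listed above; the helper `missingBackendVars` is hypothetical and not part of the project.

```typescript
// Variable names come from backend/.env above; the helper itself is hypothetical.
const REQUIRED_BACKEND_VARS = [
  "OPENAI_API_KEY",
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function missingBackendVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_BACKEND_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `missingBackendVars(process.env)`, failing fast with a clear message if the returned list is not empty.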
Start the backend (LangGraph server on port 2024):
cd backend
yarn langgraph:dev
Start the frontend (Next.js on port 3000):
cd frontend
yarn dev
Open http://localhost:3000, upload a PDF, and start asking questions.
- LLM model: `frontend/constants/graphConfigs.ts` (change `queryModel`)
- Retrieval k value: `frontend/constants/graphConfigs.ts` (change `k`)
- Prompts: `backend/src/retrieval_graph/prompts.ts`
- Vector store: `backend/src/shared/retrieval.ts`
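As a rough picture of the first two settings, the sketch below models a config object with `queryModel` and `k` fields and a helper that applies overrides with basic validation. Only the two field names come from the list above. The `RetrievalConfig` interface, the default values, the model string format, and the `withOverrides` helper are all assumptions; check `graphConfigs.ts` for the real shape.

```typescript
// Assumed shape: only the field names queryModel and k come from the README.
interface RetrievalConfig {
  queryModel: string; // model identifier; the exact string format is an assumption
  k: number; // number of chunks to retrieve per question
}

// Assumed defaults, for illustration only.
const defaultRetrievalConfig: RetrievalConfig = {
  queryModel: "gpt-4o-mini",
  k: 5,
};

// Merges overrides into a base config and rejects obviously invalid values.
function withOverrides(
  base: RetrievalConfig,
  overrides: Partial<RetrievalConfig>,
): RetrievalConfig {
  const merged: RetrievalConfig = { ...base, ...overrides };
  if (!Number.isInteger(merged.k) || merged.k < 1) {
    throw new Error("k must be a positive integer");
  }
  if (merged.queryModel.trim() === "") {
    throw new Error("queryModel must not be empty");
  }
  return merged;
}
```

A larger `k` gives the model more context per question at the cost of longer prompts, so it is worth adjusting together with the prompt text.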