AI PDF Chatbot

A full-stack AI chatbot that ingests PDF documents, stores embeddings in a vector database, and answers questions using RAG (Retrieval-Augmented Generation). Built with LangChain, LangGraph, Next.js, and Supabase.

Features

PDF Ingestion — Upload and parse PDFs, store vector embeddings in Supabase
Smart Query Routing — Automatically decides whether to retrieve documents or answer directly
Streaming Responses — Real-time SSE-based chat with response chunks
Source Citations — View which documents were used to generate answers
Multi-turn Conversation — Message history preserved across turns
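
As an illustration of what consuming the SSE stream involves on the client, here is a minimal sketch of a frame parser. It is not taken from this repository. The `SseEvent` type, the `parseSseBuffer` name, and the idea of carrying a partial frame between network chunks are assumptions made for the example; the actual frontend may rely on a client library instead.

```typescript
// One parsed server-sent event: its event name plus the joined data payload.
interface SseEvent {
  event: string;
  data: string;
}

// Split a text buffer into complete SSE events.
// Returns the parsed events and any trailing partial frame to carry over
// into the next network chunk.
function parseSseBuffer(buffer: string): { events: SseEvent[]; rest: string } {
  // Normalize CRLF so frames are always separated by a blank line ("\n\n").
  const normalized = buffer.replace(/\r\n/g, "\n");
  const frames = normalized.split("\n\n");
  // The last piece is incomplete until its terminating blank line arrives.
  const rest = frames.pop() ?? "";
  const events: SseEvent[] = [];

  for (const frame of frames) {
    let eventName = "message"; // default event type in the SSE format
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      if (line.startsWith("event:")) {
        eventName = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // Drop the single optional space that may follow the colon.
        const value = line.slice("data:".length);
        dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
      }
    }
    // Frames without any data line are ignored.
    if (dataLines.length > 0) {
      events.push({ event: eventName, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

A caller would append each decoded network chunk to `rest`, call `parseSseBuffer` again, and render the returned events as they arrive.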

Architecture

Frontend (Next.js)  ──>  Backend (LangGraph Server)  ──>  Supabase (Vector Store)
     │                         │
     │  Upload PDFs ──>  Ingestion Graph (embed + store)
     │  Ask questions ──>  Retrieval Graph (route → retrieve → generate)
     │                         │
     └── SSE Stream <──────────┘

Backend: LangGraph agent graphs for ingestion and retrieval
Frontend: Next.js/React chat UI with file upload
Vector Store: Supabase with OpenAI embeddings (text-embedding-3-small)
LLM: OpenAI GPT-4o-mini (configurable)
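
The route → retrieve → generate flow in the diagram can be sketched in plain TypeScript. This is a simplified illustration, not the LangGraph graph definition used in the backend: `answerQuestion`, `PipelineDeps`, and the `Route` values are hypothetical names, and the three steps are injected as functions so the control flow can be read (and run) without any external service.

```typescript
type Route = "retrieve" | "direct";

interface RetrievedDoc {
  content: string;
  source: string; // e.g. the file name of the uploaded PDF
}

interface Answer {
  route: Route;
  text: string;
  sources: string[];
}

// Hypothetical dependencies. In a real system these would call the LLM
// and the vector store; here they are placeholders supplied by the caller.
interface PipelineDeps {
  route: (question: string) => Promise<Route>;
  retrieve: (question: string) => Promise<RetrievedDoc[]>;
  generate: (question: string, context: RetrievedDoc[]) => Promise<string>;
}

async function answerQuestion(
  question: string,
  deps: PipelineDeps
): Promise<Answer> {
  // Step 1: decide whether the question needs document context.
  const route = await deps.route(question);

  // Step 2: only query the vector store when the router asked for it.
  const docs = route === "retrieve" ? await deps.retrieve(question) : [];

  // Step 3: generate the answer, with or without retrieved context.
  const text = await deps.generate(question, docs);

  // Collect each source once so the UI can show citations.
  const sources: string[] = [];
  for (const doc of docs) {
    if (sources.indexOf(doc.source) === -1) sources.push(doc.source);
  }
  return { route, text, sources };
}
```

Skipping retrieval on the direct route avoids an unnecessary vector store query for questions that do not depend on the uploaded documents.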

Prerequisites

Node.js v20+
Yarn
Supabase project with documents table and match_documents function (setup guide)
OpenAI API key

Setup

Clone the repo:

git clone https://github.com/stevez/pdf-chatbot.git
cd pdf-chatbot

Install dependencies:

yarn install

Configure environment variables:

Backend (backend/.env):

OPENAI_API_KEY=your-openai-api-key
SUPABASE_URL=your-supabase-url
SUPABASE_SERVICE_ROLE_KEY=your-supabase-service-role-key

Frontend (frontend/.env):

NEXT_PUBLIC_LANGGRAPH_API_URL=http://localhost:2024
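
A missing or empty variable tends to surface later as an opaque API error, so it can help to check the backend variables once at startup. The following is a minimal sketch of such a check, not code from this repository; `loadBackendEnv` and `BackendEnv` are hypothetical names, and only the three variable names come from the list above.

```typescript
// Variable names taken from backend/.env above.
const REQUIRED_BACKEND_VARS = [
  "OPENAI_API_KEY",
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
] as const;

type BackendVar = (typeof REQUIRED_BACKEND_VARS)[number];
type BackendEnv = Record<BackendVar, string>;

// Validate an environment object and return a fully typed copy.
// Throws a single error listing every missing or blank variable.
function loadBackendEnv(env: Record<string, string | undefined>): BackendEnv {
  const missing = REQUIRED_BACKEND_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });

  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }

  return {
    OPENAI_API_KEY: env.OPENAI_API_KEY as string,
    SUPABASE_URL: env.SUPABASE_URL as string,
    SUPABASE_SERVICE_ROLE_KEY: env.SUPABASE_SERVICE_ROLE_KEY as string,
  };
}
```

In a Node.js process this would typically be called as `loadBackendEnv(process.env)` before any client is constructed.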

Running Locally

Start the backend (LangGraph server on port 2024):

cd backend
yarn langgraph:dev

In a second terminal, start the frontend from the repository root (Next.js on port 3000):

cd frontend
yarn dev

Open http://localhost:3000, upload a PDF, and start asking questions.

Configuration

LLM model: frontend/constants/graphConfigs.ts — change queryModel
Retrieval k value: frontend/constants/graphConfigs.ts — change k
Prompts: backend/src/retrieval_graph/prompts.ts
Vector store: backend/src/shared/retrieval.ts
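
To make the effect of the `k` setting concrete, here is an in-memory sketch of top-k similarity search. It assumes cosine similarity as the ranking metric, which is an assumption for illustration; in this project the ranking is performed by the `match_documents` function in Supabase, whose exact metric is not specified here. `StoredChunk`, `cosineSimilarity`, and `topK` are hypothetical names.

```typescript
interface StoredChunk {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query, best match first.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  if (!Number.isInteger(k) || k < 1) {
    throw new Error("k must be a positive integer");
  }
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

A larger `k` passes more chunks to the model as context, which can improve recall at the cost of a longer prompt.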

Tech Stack

LangChain / LangGraph — LLM orchestration
Next.js — Frontend framework
Supabase — Vector database
OpenAI — Embeddings and chat model
Turborepo — Monorepo management
