Semantic search system powered by LangChain, PostgreSQL (pgVector), and Gemini.
This project implements a complete Retrieval-Augmented Generation (RAG) pipeline capable of ingesting a PDF document, storing embeddings in PostgreSQL with pgVector, and answering questions strictly based on the document content via CLI.
🎓 This repository contains an academic project developed during my MBA in Software Engineering with AI at Full Cycle.
Build a system capable of:
- Ingestion: Read a PDF file and store its content as vector embeddings in PostgreSQL using pgVector.
- Search: Allow users to ask questions via CLI and receive answers based only on the document content.
If the answer is not explicitly present in the document, the system responds:
"I do not have enough information to answer your question."
```mermaid
flowchart TD
    A[PDF Document] --> B[PyPDFLoader]
    B --> C[RecursiveCharacterTextSplitter]
    C --> D[Generate Embeddings]
    D --> E[(PostgreSQL + pgVector)]
    F[User Question CLI] --> G[Question Embedding]
    G --> H[Similarity Search k=10]
    H --> E
    H --> I[Retrieve Top Chunks]
    I --> J[Prompt Template]
    J --> K[LLM Gemini]
    K --> L[Answer Returned to CLI]
```
Ingestion:
- PDF is loaded
- Text is split into chunks (1000 characters, 150-character overlap)
- Each chunk is converted into an embedding
- Vectors are stored in PostgreSQL (pgVector)

Search:
- The user's question is embedded
- The 10 most similar chunks are retrieved
- A prompt template enforces a strictly context-based response
- The LLM generates the final answer
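The chunking step above (1000 characters with 150 of overlap) can be approximated in plain Python. The project itself uses LangChain's `RecursiveCharacterTextSplitter`, which splits on separators rather than fixed offsets, so this is only an illustrative sketch:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 150) -> list[str]:
    """Split text into fixed-size chunks, each sharing `overlap` characters
    with the previous chunk (simplified stand-in for the real splitter)."""
    chunks = []
    step = chunk_size - overlap  # advance 850 characters per chunk
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 2500-character document yields 3 chunks: 0-1000, 850-1850, 1700-2500
text = "".join(str(i % 10) for i in range(2500))
chunks = split_text(text)
```

The overlap ensures that a sentence falling on a chunk boundary still appears whole in at least one chunk, which improves retrieval recall.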
- Language: Python
- Framework: LangChain
- Database: PostgreSQL + pgVector
- Containerization: Docker & Docker Compose
- Embedding Model: Gemini → gemini-embedding-001
- LLM Model: Gemini → gemini-2.5-flash-lite
```
├── data/pdf
│   └── document.pdf
├── src/
│   ├── chat.py
│   ├── ingest.py
│   ├── llm_manager.py
│   ├── search.py
│   └── utils.py
├── docker-compose.yml
├── requirements.txt
├── .env.example
└── README.md
```
```bash
git clone <your-repo-url>
cd <repo-name>

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create a `.env` file based on `.env.example`.
Example:

```env
GOOGLE_API_KEY=your_google_key
GOOGLE_EMBEDDING_MODEL=models/embedding-001
GOOGLE_CHAT_MODEL=gemini-2.5-flash-lite
DATABASE_URL=postgresql+psycopg://postgres:postgres@localhost:5432/rag
PG_VECTOR_COLLECTION_NAME=company_revenue_rag
PDF_PATH=./data/pdf/document.pdf
```
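As a minimal sketch, the settings above could be read with the standard library's `os.getenv` (the actual project may use `python-dotenv` or another loader; `load_config` and its defaults are illustrative, not the project's API):

```python
import os

def load_config() -> dict:
    """Read pipeline settings from the environment, with the README's
    example values as fallbacks (illustrative only)."""
    return {
        "api_key": os.getenv("GOOGLE_API_KEY"),
        "embedding_model": os.getenv("GOOGLE_EMBEDDING_MODEL", "models/embedding-001"),
        "chat_model": os.getenv("GOOGLE_CHAT_MODEL", "gemini-2.5-flash-lite"),
        "database_url": os.getenv(
            "DATABASE_URL",
            "postgresql+psycopg://postgres:postgres@localhost:5432/rag",
        ),
        "collection": os.getenv("PG_VECTOR_COLLECTION_NAME", "company_revenue_rag"),
        "pdf_path": os.getenv("PDF_PATH", "./data/pdf/document.pdf"),
    }
```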
```bash
docker compose up -d
python src/ingest.py
```

✔ Loads PDF
✔ Splits into chunks (1000 / 150 overlap)
✔ Generates embeddings
✔ Stores vectors in PostgreSQL
```bash
python src/chat.py
```

Example:

QUESTION: What is the revenue of the company Beta IA LTDA?
ANSWER: R$ 40.733.987,34
Out-of-context example:
QUESTION: What is the capital of France?
ANSWER: I do not have enough information to answer your question.
The system strictly:
- Uses only retrieved context
- Rejects external knowledge
- Prevents hallucinations
- Avoids opinion-based answers
- Returns fixed fallback response when necessary
✔ Chunk size = 1000
✔ Overlap = 150
✔ Similarity search (k=10)
✔ PostgreSQL + pgVector storage
✔ CLI interaction
✔ Strict prompt enforcement
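The k=10 similarity search is performed by pgVector inside PostgreSQL (via its distance operators), but the underlying idea can be sketched in pure Python with cosine similarity:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 10) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(vectors, key=lambda vid: cosine(query, vectors[vid]), reverse=True)
    return ranked[:k]

# Toy 2-dimensional "embeddings" for three chunks
docs = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.7, 0.7],
    "chunk_c": [0.0, 1.0],
}
print(top_k([1.0, 0.1], docs, k=2))  # → ['chunk_a', 'chunk_b']
```

Real embeddings have hundreds of dimensions and pgVector uses indexes to avoid scoring every row, but the ranking principle is the same.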