📚 Ask-My-Documents

A production-grade Multimodal RAG (Retrieval-Augmented Generation) system for intelligent document understanding, semantic retrieval, and grounded question answering.

Overview

Ask-My-Documents is an advanced multimodal RAG pipeline designed to process complex documents such as research papers, technical PDFs, and enterprise documents. Unlike basic PDF chatbot implementations, this project focuses on retrieval quality, multimodal understanding, and production-style document ingestion pipelines.

The system combines:

Semantic chunking — title-aware, context-preserving document segmentation
OCR-aware extraction — robust handling of scanned and image-heavy documents
Table-aware parsing — structured preservation of tabular data
Image-aware understanding — vision-language model integration for multimodal content
AI-powered semantic enrichment — LLM-augmented chunk metadata
Vector retrieval pipelines — high-performance semantic search via ChromaDB
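
To make "title-aware, context-preserving" segmentation concrete, here is a minimal, dependency-free sketch. It is an illustration only, not the project's code: the project parses documents with Unstructured, while the `Element` and `Chunk` classes, the function name `chunk_by_title_sketch`, and the `max_chars` parameter below are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical parsed element: a category such as "Title" plus its text."""
    category: str
    text: str


@dataclass
class Chunk:
    """A chunk that remembers the section title it belongs to."""
    title: str
    texts: list[str] = field(default_factory=list)

    @property
    def text(self) -> str:
        return "\n".join(self.texts)


def chunk_by_title_sketch(elements: list[Element], max_chars: int = 500) -> list[Chunk]:
    """Group elements under the most recent title, splitting oversized sections."""
    chunks: list[Chunk] = []
    current = Chunk(title="")
    for el in elements:
        if el.category == "Title":
            # A new title closes the current chunk and opens a new section.
            # (A title with no body text produces no chunk in this sketch.)
            if current.texts:
                chunks.append(current)
            current = Chunk(title=el.text)
            continue
        # Start a continuation chunk under the same title if this element
        # would push the chunk past max_chars.
        if current.texts and len(current.text) + len(el.text) + 1 > max_chars:
            chunks.append(current)
            current = Chunk(title=current.title)
        current.texts.append(el.text)
    if current.texts:
        chunks.append(current)
    return chunks
```

Because every chunk carries its section title, a retrieved passage can be shown or embedded together with the heading it came from, even when a long section is split into several chunks.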

Features

Feature	Description
📄 PDF Ingestion	Parse and extract content from complex PDFs
🧠 Semantic Enrichment	AI-enhanced chunk-level understanding
🖼️ Multimodal Understanding	Vision-Language Model integration (Qwen2-VL)
📊 Table Extraction	Structure-preserving table parsing
🔎 Vector Retrieval	Semantic search with ChromaDB
🧩 Title-Aware Chunking	Context-coherent document segmentation
📚 Grounded Generation	Source-cited answer generation
🏷️ Metadata-Aware Retrieval	Rich metadata for precision filtering
🚀 Production Architecture	Modular, extensible ingestion pipeline
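
As a rough illustration of how metadata-aware retrieval and source-cited generation can fit together, the sketch below filters chunks by metadata and then builds a prompt with numbered sources. The chunk layout (`text` plus a `metadata` dict with `source`, `page`, and `type` keys) and both function names are assumptions made for this example, not the project's actual schema or API.

```python
def filter_by_metadata(chunks: list[dict], **required) -> list[dict]:
    """Keep only chunks whose metadata matches every required key/value pair."""
    return [
        c for c in chunks
        if all(c["metadata"].get(key) == value for key, value in required.items())
    ]


def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Number each chunk so the model can cite it as [n] in its answer."""
    sources = []
    for i, chunk in enumerate(chunks, start=1):
        meta = chunk["metadata"]
        sources.append(f"[{i}] ({meta['source']}, p. {meta['page']}) {chunk['text']}")
    context = "\n".join(sources)
    return (
        "Answer the question using only the numbered sources below. "
        "Cite sources as [n]. If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Numbering the sources in the prompt is one simple way to let an answer point back to specific chunks; the prompt wording here is only a starting point.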

Tech Stack

Component	Technology
Document Parsing	Unstructured
Embeddings	`BAAI/bge-small-en-v1.5`
Vector Database	ChromaDB
Multimodal Model	Qwen2-VL
Framework	LangChain
OCR	Tesseract OCR
PDF Processing	Poppler
Backend	Python 3.10+
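
For readers new to vector retrieval, the following toy sketch shows the core idea behind the embeddings and vector database rows above: documents and queries become vectors, and retrieval returns the nearest ones. It uses hand-written 2-D vectors and cosine similarity purely for illustration. Real embeddings from `BAAI/bge-small-en-v1.5` are much higher-dimensional, and the distance metric the project configures in ChromaDB is not stated here, so treat the metric and the function names as assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k vectors most similar to the query."""
    ranked = sorted(index.items(), key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

A vector database performs the same kind of nearest-neighbor lookup, but with persistent storage and indexing so it stays fast as the collection grows.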

System Architecture

Document Upload
      │
      ▼
Document Parsing (Unstructured)
      │
      ▼
Semantic Element Extraction
      │
      ▼
Title-Aware Chunking
      │
      ▼
Multimodal Content Extraction
      │
      ▼
AI Semantic Enrichment
      │
      ▼
Embeddings Generation
      │
      ▼
ChromaDB Vector Storage
      │
      ▼
Hybrid Retrieval + Re-ranking
      │
      ▼
Grounded Answer Generation
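
The "Hybrid Retrieval + Re-ranking" stage is not specified in detail above. One common way to merge a semantic (vector) ranking with a keyword ranking is reciprocal rank fusion (RRF), sketched below. This is an assumption about how such a stage could work, not a description of this project's implementation; the function name and the conventional constant `k = 60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by multiple retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # hypothetical ids from semantic search
keyword_hits = ["b", "d", "a"]  # hypothetical ids from keyword search
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

RRF needs only ranks, not comparable scores, which makes it convenient when the two retrievers report scores on different scales.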

Project Structure

ask-my-documents/
│
├── data/
├── notebooks/
├── src/
│   ├── ingestion/
│   ├── chunking/
│   ├── embeddings/
│   ├── retrieval/
│   ├── llm/
│   ├── evaluation/
│   └── utils/
│
├── vector_db/
├── app/
├── requirements.txt
└── README.md

Installation

1. Clone the Repository

git clone https://github.com/your-username/ask-my-documents.git
cd ask-my-documents

2. Create a Virtual Environment

python -m venv venv

# Windows
venv\Scripts\activate

# Linux / macOS
source venv/bin/activate

3. Install Python Dependencies

pip install -r requirements.txt

4. Install System Dependencies

Linux:

sudo apt-get install poppler-utils tesseract-ocr

Windows:

Download and install Poppler and Tesseract OCR, then add both to your system PATH.
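
To confirm that the system dependencies are reachable on your PATH, a quick check like the one below can help. It is a small sketch, assuming that `pdftoppm` (shipped with Poppler) and `tesseract` are the executables to look for; the helper name `missing_tools` is hypothetical.

```python
import shutil

# Executables assumed to indicate a working install:
# pdftoppm comes with Poppler, tesseract with Tesseract OCR.
REQUIRED_TOOLS = ("pdftoppm", "tesseract")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which) -> list[str]:
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Poppler and Tesseract are available.")
```

If a tool is reported as missing right after installation, open a new terminal so the updated PATH is picked up.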

Current Capabilities

Semantic document chunking
Multimodal preprocessing pipeline
OCR-aware extraction
Table-aware document understanding
Image-aware retrieval enrichment
Vision-language model integration
Retrieval-ready semantic indexing

Roadmap

Example Use Cases

📖 Research Paper Assistant — Query academic papers with grounded citations
🛠️ Technical Documentation QA — Instant answers from complex technical manuals
🏢 Enterprise Document Search — Retrieve insights from internal knowledge bases
📊 Table-Aware QA — Ask questions directly about tabular data
🌐 Multimodal Knowledge Retrieval — Combine text and image understanding

License

This project is licensed under the MIT License.

Author

Developed as part of an advanced RAG engineering and multimodal retrieval learning journey.
