GenAI Project

A comprehensive collection of Generative AI implementations and utilities covering core concepts including agents, RAG systems, embeddings, tokenization, and multimodal processing.

Project Overview

This repository demonstrates practical implementations of key GenAI technologies with hands-on examples and production-ready code patterns. Each module is designed to be educational yet functional, providing clear examples of modern AI development practices.

Project Structure

Agents

An AI agent system with tool-calling integration, featuring:

  • Multi-step reasoning workflow: Implements start → plan → action → observe → output pattern
  • Weather API integration: Real-time weather data retrieval
  • System command execution: Secure Linux command execution
  • OpenAI GPT-4 powered: Advanced language model integration
  • Structured JSON responses: Clean, parseable output format
  • Interactive CLI interface: User-friendly command-line interaction

Key Features:

  • Tool calling architecture with extensible tool registry
  • Step-by-step execution with intermediate feedback
  • Error handling and observation loops
  • Modular design for easy tool addition
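The start → plan → action → observe → output loop and the tool registry can be sketched roughly as follows. This is an illustrative, offline-runnable version: `fake_llm` is a scripted stand-in for the GPT-4 call the module actually makes, and the tool and function names are hypothetical, not the repository's code.

```python
import json

def get_weather(city: str) -> str:
    """Stand-in for the real WeatherAPI.com call."""
    return f"Sunny, 22C in {city}"

# Extensible tool registry: add new tools by adding entries here
TOOLS = {"get_weather": get_weather}

def fake_llm(messages):
    """Scripted stand-in for the model so the loop runs offline:
    first asks for an action, then produces the final output."""
    if not any('"observe"' in m["content"] for m in messages if m["role"] == "user"):
        return {"step": "action", "tool": "get_weather", "input": "London"}
    return {"step": "output", "content": "It is sunny and 22C in London."}

def run_agent(query: str, llm=fake_llm) -> str:
    messages = [{"role": "user", "content": query}]
    while True:
        step = llm(messages)  # plan/action/output decided as structured JSON
        messages.append({"role": "assistant", "content": json.dumps(step)})
        if step["step"] == "action":
            # Execute the requested tool and feed the observation back in
            observation = TOOLS[step["tool"]](step["input"])
            messages.append({"role": "user",
                             "content": json.dumps({"step": "observe",
                                                    "output": observation})})
        elif step["step"] == "output":
            return step["content"]
```

In the real module the `llm` call goes to OpenAI with a system prompt enforcing the step/tool/input JSON schema; swapping the stub for that call is the only change needed.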

RAG (Retrieval-Augmented Generation)

Document-based question answering system with vector search capabilities:

  • PDF document processing: Automated text extraction and chunking
  • Vector embeddings: OpenAI text-embedding-3-large integration
  • Qdrant vector database: High-performance similarity search
  • Docker containerization: Easy deployment and scaling
  • Chunking strategies: Recursive text splitting with overlap
  • Similarity search: Efficient document retrieval

Architecture:

  • Document indexing pipeline (rag/indexing.py)
  • Interactive chat interface (rag/chat.py)
  • Containerized vector database setup
  • Scalable document processing workflow

Embeddings

Text embedding generation and advanced prompt engineering utilities:

  • OpenAI embeddings: text-embedding-3-small model integration
  • Zero-shot prompting: Direct instruction-following examples
  • Few-shot prompting: Context-aware learning demonstrations
  • Chain-of-thought reasoning: Step-by-step problem solving
  • Grammar correction assistant: Automated text refinement
  • Autonomous conversation bot: Self-directed AI interactions

Prompt Engineering Techniques:

  • System prompt optimization
  • Multi-turn conversation handling
  • JSON-structured responses
  • Validation and refinement workflows
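Few-shot prompting for the grammar correction assistant can be sketched like this: worked examples placed in the message history teach the model the expected format before the real query arrives. The example sentences and model name below are illustrative, not taken from the module.

```python
FEW_SHOT_MESSAGES = [
    {"role": "system",
     "content": "You correct grammar and reply with only the fixed sentence."},
    # In-context examples ("shots") demonstrating the task:
    {"role": "user", "content": "she go to school yesterday"},
    {"role": "assistant", "content": "She went to school yesterday."},
    {"role": "user", "content": "them books is mine"},
    {"role": "assistant", "content": "Those books are mine."},
]

def build_prompt(query: str) -> list:
    """Append the real query after the worked examples."""
    return FEW_SHOT_MESSAGES + [{"role": "user", "content": query}]

# The resulting list is passed straight to
# client.chat.completions.create(model="gpt-4o-mini", messages=...)
messages = build_prompt("he don't like apples")
```

Zero-shot prompting is the same structure with the example pairs removed; chain-of-thought adds an instruction like "reason step by step" to the system message.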

Tokenization

Text tokenization utilities using tiktoken for GPT-compatible processing:

  • GPT-4 compatible tokenization: Official OpenAI tokenizer
  • Token counting and analysis: Precise token usage tracking
  • Text preprocessing: Model-ready text preparation
  • Encoding/decoding utilities: Bidirectional text processing

MultiModal-RAG

Advanced document processing with multimodal capabilities:

  • Intelligent PDF routing: Automatic processing strategy selection
  • OCR integration: Scanned document text extraction
  • Image extraction: Automated image parsing from PDFs
  • Table detection: Structured data extraction
  • Layout analysis: Document structure understanding
  • Hybrid processing: Combines Unstructured.io and Docling libraries

Processing Strategies:

  • Docling: For text-heavy documents
  • Unstructured OCR: For scanned/image-heavy documents
  • Unstructured Layout: For complex layouts with many images
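The routing decision can be sketched as a simple heuristic over document statistics. This is a hypothetical illustration of the strategy table above; the repository's actual heuristics, thresholds, and inspection helper may differ.

```python
from dataclasses import dataclass

@dataclass
class PdfStats:
    """Assumed summary produced by inspecting a PDF before processing."""
    text_chars: int   # extractable text characters
    image_count: int  # embedded images
    is_scanned: bool  # no text layer at all

def choose_strategy(stats: PdfStats) -> str:
    """Pick a processing backend based on what the PDF contains."""
    if stats.is_scanned:
        return "unstructured-ocr"      # scanned/image-heavy: OCR required
    if stats.image_count > 10:
        return "unstructured-layout"   # complex layouts with many images
    return "docling"                   # text-heavy documents
```

Keeping the decision in one function makes it easy to tune thresholds or add new backends without touching the processing code itself.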

Prerequisites

  • Python 3.13+ (specified in all pyproject.toml files)
  • OpenAI API key (for LLM and embedding services)
  • Weather API key (for Agents module - WeatherAPI.com)
  • Docker & Docker Compose (for RAG vector database)

Installation & Setup

1. Clone the Repository

git clone <repository-url>
cd GenAI

2. Module-Specific Setup

Each module is self-contained with its own dependencies:

# Navigate to desired module
cd <module-name>

# Install dependencies using uv
uv sync

# Configure environment variables
cp .env.example .env  # Edit with your API keys

3. Environment Configuration

Create a .env file in each module directory:

# Required for all modules
OPENAI_API_KEY=your_openai_api_key_here

# Required for Agents module
WEATHER_API=http://api.weatherapi.com/v1/current.json?key=YOUR_KEY

# Optional: Model configurations
OPENAI_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-large

4. RAG Module Docker Setup

cd RAG
docker-compose up -d  # Starts Qdrant vector database
python -m rag.indexing  # Index documents
python -m rag.chat     # Start chat interface

Usage Examples

Agents Module

cd Agents
python main.py
> What's the weather in London?
# Agent will plan, execute weather API call, and respond

RAG Module

cd RAG
# First, index your documents
python -m rag.indexing
# Then start chatting
python -m rag.chat
> What is set theory?

Embeddings Module

cd embeddings
python main.py
# Demonstrates various prompting techniques and embedding generation

MultiModal RAG Module

cd MultiModal-RAG
python -m rag.index
# Automatically processes PDFs with optimal strategy

Architecture Patterns

Agent Architecture

  • Tool Registry Pattern: Extensible function calling system
  • State Machine: Clear workflow transitions
  • Observer Pattern: Step-by-step execution monitoring

RAG Architecture

  • Pipeline Pattern: Document processing → Embedding → Storage → Retrieval
  • Microservices: Containerized vector database
  • Separation of Concerns: Indexing vs querying logic

Prompt Engineering

  • Template Pattern: Reusable prompt structures
  • Chain of Thought: Structured reasoning workflows
  • Validation Loops: Self-correcting AI responses

Dependencies

Core Dependencies

  • OpenAI: Official Python client for GPT models and embeddings
  • python-dotenv: Environment variable management
  • requests: HTTP client for API interactions

RAG-Specific

  • langchain-community: Document loaders and utilities
  • langchain-openai: OpenAI integration for LangChain
  • langchain-qdrant: Vector database integration
  • pypdf: PDF processing capabilities

MultiModal Processing

  • docling: Advanced document understanding
  • unstructured: Multimodal document parsing
  • python-magic: File type detection

Tokenization

  • tiktoken: Official OpenAI tokenizer

Development Guidelines

Code Organization

  • Each module is independently runnable
  • Shared patterns across modules for consistency
  • Environment-based configuration
  • Clear separation between utilities and main logic

Best Practices

  • Use type hints for better code documentation
  • Environment variable validation
  • Error handling with informative messages
  • Modular design for easy extension

Contributing

  1. Choose a module to work on
  2. Follow the existing code patterns
  3. Update relevant documentation
  4. Test your changes thoroughly
  5. Submit pull requests with clear descriptions

License

[Add your license information here]


Notes

  • Ensure API keys are properly configured before running modules
  • RAG module requires Docker for vector database
  • MultiModal-RAG processing can be resource-intensive for large PDFs
  • Monitor token usage when working with OpenAI APIs
