
fayzan101/Health-Care-Rag-System


Healthcare RAG System Backend

A comprehensive Retrieval-Augmented Generation (RAG) system built with FastAPI for healthcare document analysis and medical question answering.

🚀 Features

  • User Authentication: JWT-based authentication for patients and doctors
  • Document Management: Upload and process medical documents (PDF, TXT)
  • Vector Database: FAISS-based vector storage for document embeddings
  • RAG Pipeline: LangChain-powered question answering with source retrieval
  • Conversation History: Maintain context-aware chat sessions
  • Role-based Access: Different permissions for patients and doctors
  • Async API: High-performance asynchronous endpoints
  • CORS Support: Frontend-friendly with configurable origins

🏗️ Architecture

app/
├── __init__.py
├── config.py          # Configuration and environment variables
├── database.py        # Database connection and session management
├── models.py          # SQLAlchemy database models
├── schemas.py         # Pydantic request/response schemas
├── auth.py            # JWT authentication and security
├── services/          # Business logic layer
│   ├── __init__.py
│   ├── rag_service.py          # RAG pipeline and vector operations
│   ├── user_service.py         # User management operations
│   ├── document_service.py     # Document processing and storage
│   └── conversation_service.py # Chat and conversation management
└── routers/           # API endpoint definitions
    ├── __init__.py
    ├── auth.py        # Authentication endpoints
    ├── users.py       # User management endpoints
    ├── documents.py   # Document upload/management endpoints
    └── rag.py         # RAG and conversation endpoints

🛠️ Technology Stack

  • Backend Framework: FastAPI (Python)
  • Database: SQLite (configurable to PostgreSQL)
  • ORM: SQLAlchemy 2.0
  • Authentication: JWT with python-jose
  • Vector Database: FAISS (Facebook AI Similarity Search)
  • Document Processing: LangChain, PyPDF2
  • Embeddings: Sentence Transformers (HuggingFace)
  • Password Hashing: bcrypt
  • API Documentation: Auto-generated with FastAPI

📋 Prerequisites

  • Python 3.8+
  • Virtual environment (recommended)
  • Git

🚀 Installation

  1. Clone the repository

    git clone <repository-url>
    cd Health-Care-Rag-System
  2. Create and activate virtual environment

    python -m venv venv
    # On Windows
    venv\Scripts\activate
    # On macOS/Linux
    source venv/bin/activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Environment Configuration

    # Copy environment template
    cp env.example .env
    
    # Edit .env file with your configuration
    # Update SECRET_KEY, OPENAI_API_KEY, etc.
  5. Run the application

    python main.py

    The API will be available at http://localhost:8000

📚 API Documentation

Once running, access the interactive API documentation:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

🔐 Authentication

The system uses JWT tokens for authentication. Include the token in the Authorization header:

Authorization: Bearer <your-jwt-token>

Authentication Flow

  1. Register: POST /api/v1/auth/register
  2. Login: POST /api/v1/auth/login
  3. Use Token: Include the returned token in the Authorization header of subsequent requests
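
To make the token format concrete, here is a minimal standard-library sketch of how an HS256 JWT is signed and verified. It is an illustration only: the project lists python-jose for this job, and the function names `sign_token` and `verify_token` are hypothetical, not taken from the repository.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _hs256(signing_input: str, secret: str) -> bytes:
    """HMAC-SHA256 over the 'header.payload' string."""
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: str) -> str:
    """Build 'header.payload.signature' for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_hs256(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the claims if the signature matches and 'exp' is in the future."""
    signing_input, _, signature = token.rpartition(".")
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(_hs256(signing_input, secret), _b64url_decode(signature)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) <= current:
        raise ValueError("token expired")
    return claims
```

With the documented default of ACCESS_TOKEN_EXPIRE_MINUTES=30, a token issued at time t would carry an `exp` claim of t + 1800 seconds. In the application itself, prefer the maintained JWT library over hand-rolled signing.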

📝 API Endpoints

Authentication (/api/v1/auth)

  • POST /register - User registration
  • POST /login - User authentication
  • GET /me - Get current user info
  • POST /logout - User logout

Users (/api/v1/users)

  • GET / - List all users (doctors only)
  • GET /{user_id} - Get user by ID
  • PUT /{user_id} - Update user
  • DELETE /{user_id} - Deactivate user
  • GET /profile/me - Get own profile
  • PUT /profile/me - Update own profile
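
The "doctors only" restriction on listing users could be expressed as a small permission check. The sketch below is an assumption about how such a rule might look. The `is_doctor` flag comes from the registration payload, while the `User` class, the `is_active` flag, and the rule that non-doctors may only read their own record are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    is_doctor: bool = False
    is_active: bool = True  # hypothetical flag, suggested by "Deactivate user"


def can_list_users(current: User) -> bool:
    """Only active doctors may list all users."""
    return current.is_active and current.is_doctor


def can_read_user(current: User, target_id: int) -> bool:
    """Assumed rule: doctors may read any user, others only themselves."""
    if not current.is_active:
        return False
    return current.is_doctor or current.id == target_id
```

In a FastAPI router, a check like this would typically run inside a dependency that resolves the current user from the JWT and raises a 403 response when it fails.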

Documents (/api/v1/documents)

  • POST /upload - Upload medical document
  • GET / - List user's documents
  • GET /{document_id} - Get document details
  • DELETE /{document_id} - Delete document
  • POST /{document_id}/process - Process document manually
  • GET /stats/summary - Document statistics
  • GET /{document_id}/chunks - Get document chunks
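
Document processing splits extracted text into overlapping chunks before embedding. The following is a simplified sketch of fixed-size chunking using the documented defaults (CHUNK_SIZE=1000, CHUNK_OVERLAP=200). The function name `chunk_text` is hypothetical, and LangChain's text splitters generally prefer natural boundaries such as paragraphs, so the real output will differ.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.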

RAG (/api/v1/rag)

  • POST /ask - Ask medical question
  • GET /conversations - List conversations
  • GET /conversations/{id} - Get conversation
  • GET /conversations/{id}/messages - Get conversation messages
  • DELETE /conversations/{id} - Delete conversation
  • PUT /conversations/{id}/title - Update conversation title
  • GET /conversations/summary - Conversation summary
  • POST /conversations/new - Create new conversation
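
Context-aware answers generally combine retrieved document chunks with recent conversation messages in a single prompt. The sketch below shows one way this might be assembled. The function name `build_prompt`, the message dictionary shape, and the prompt wording are all assumptions, not the repository's actual implementation.

```python
from typing import Dict, List


def build_prompt(
    question: str,
    context_chunks: List[str],
    history: List[Dict[str, str]],
    max_messages: int = 6,
) -> str:
    """Assemble a prompt from retrieved chunks, recent history, and the new question."""
    recent = history[-max_messages:] if max_messages > 0 else []
    lines = ["Answer using only the context below.", "", "Context:"]
    # Number the chunks so an answer can cite its sources.
    lines += [f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)]
    lines += ["", "Conversation so far:"]
    lines += [f"{message['role']}: {message['content']}" for message in recent]
    lines += ["", f"user: {question}"]
    return "\n".join(lines)
```

Capping the history keeps the prompt within the model's context window. Older messages are simply dropped here, whereas a production system might summarize them instead.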

🔄 Workflow Example

1. User Registration

curl -X POST "http://localhost:8000/api/v1/auth/register" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "doctor@example.com",
    "username": "dr_smith",
    "password": "secure_password",
    "full_name": "Dr. John Smith",
    "is_doctor": true
  }'

2. User Login

curl -X POST "http://localhost:8000/api/v1/auth/login" \
  -H "Content-Type: application/json" \
  -d '{
    "username": "dr_smith",
    "password": "secure_password"
  }'
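
The same login call can be prepared from Python with only the standard library. This sketch builds the request objects without sending them. `BASE_URL` matches the local address used above, and the helper names `login_request` and `authed_request` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"


def login_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST /auth/login request with a JSON body."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authed_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the JWT in the Authorization header."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method=method,
    )
```

Pass either request to `urllib.request.urlopen` while the server is running. The name of the token field in the login response is not documented here, so check the response schema at /docs before parsing it.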

3. Upload Medical Document

curl -X POST "http://localhost:8000/api/v1/documents/upload" \
  -H "Authorization: Bearer <your-jwt-token>" \
  -F "file=@medical_report.pdf"

4. Ask Medical Question

curl -X POST "http://localhost:8000/api/v1/rag/ask" \
  -H "Authorization: Bearer <your-jwt-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What are the side effects of Metformin?"
  }'
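
Behind a question like this, a RAG pipeline embeds the question and looks up the most similar document chunks. The sketch below shows that retrieval step as a brute-force cosine-similarity search in plain Python. It is a stand-in for FAISS, not the project's code, and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (chunk index, similarity) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

This linear scan costs time proportional to the number of chunks per query. A vector index such as FAISS exists to make the same lookup fast on large collections.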

⚙️ Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | Database connection string | sqlite:///./healthcare_rag.db |
| SECRET_KEY | JWT secret key | your-secret-key-here |
| ALGORITHM | JWT algorithm | HS256 |
| ACCESS_TOKEN_EXPIRE_MINUTES | Token expiration time (minutes) | 30 |
| OPENAI_API_KEY | OpenAI API key for LLM | (empty) |
| EMBEDDING_MODEL_NAME | HuggingFace model name | all-MiniLM-L6-v2 |
| CHUNK_SIZE | Document chunk size | 1000 |
| CHUNK_OVERLAP | Chunk overlap size | 200 |
| HOST | Server host | 0.0.0.0 |
| PORT | Server port | 8000 |
| DEBUG | Debug mode | True |
| ALLOWED_ORIGINS | CORS allowed origins | ["http://localhost:3000"] |
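
As an illustration of how these variables might be read with their documented defaults, here is a minimal standard-library sketch. The `Settings` class and `load_settings` function are hypothetical names, the parsing rules for DEBUG and ALLOWED_ORIGINS are assumptions, and the repository's app/config.py may be organized differently.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    secret_key: str
    algorithm: str
    access_token_expire_minutes: int
    chunk_size: int
    chunk_overlap: int
    host: str
    port: int
    debug: bool
    allowed_origins: List[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying README defaults."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///./healthcare_rag.db"),
        secret_key=env.get("SECRET_KEY", "your-secret-key-here"),
        algorithm=env.get("ALGORITHM", "HS256"),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "30")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        # Assumption: common truthy spellings are accepted for booleans.
        debug=env.get("DEBUG", "True").strip().lower() in {"1", "true", "yes"},
        # Assumption: the origin list is stored as a JSON array string.
        allowed_origins=json.loads(env.get("ALLOWED_ORIGINS", '["http://localhost:3000"]')),
    )
```

Passing the mapping explicitly, as in `load_settings({"PORT": "9000"})`, keeps the loader easy to test without touching the real environment.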

🔧 Development

Project Structure

├── main.py                # FastAPI application entry point
├── requirements.txt       # Python dependencies
├── env.example            # Environment variables template
├── app/                   # Application package
│   ├── __init__.py
│   ├── config.py          # Configuration management
│   ├── database.py        # Database setup
│   ├── models.py          # Database models
│   ├── schemas.py         # Pydantic schemas
│   ├── auth.py            # Authentication utilities
│   ├── services/          # Business logic services
│   └── routers/           # API route handlers
├── uploads/               # File upload directory
├── vector_store/          # FAISS vector store
└── healthcare_rag.db      # SQLite database

Adding New Features

  1. New Model: Add to app/models.py
  2. New Schema: Add to app/schemas.py
  3. New Service: Create in app/services/
  4. New Endpoint: Add to appropriate router in app/routers/

Database Migrations

The system uses SQLAlchemy with automatic table creation. For production, consider using Alembic for migrations.

🚀 Deployment

Production Considerations

  1. Environment Variables: Set proper production values
  2. Database: Use PostgreSQL instead of SQLite
  3. Security: Change default secret keys
  4. CORS: Configure allowed origins properly
  5. File Storage: Use cloud storage (S3, Azure Blob) instead of local files
  6. Vector Store: Consider cloud vector databases (Pinecone, Weaviate)

Docker Deployment

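# Slim base image matching the minimum supported Python version
FROM python:3.8-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application source
COPY . .

EXPOSE 8000

CMD ["python", "main.py"]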

🧪 Testing

Manual Testing

  1. Start the application
  2. Use the interactive docs at /docs
  3. Test endpoints with sample data

Automated Testing

# Install test dependencies
pip install pytest pytest-asyncio httpx

# Run tests
pytest

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

⚠️ Disclaimer

This system is for educational and research purposes. Medical information should not be used as a substitute for professional medical advice. Always consult with qualified healthcare professionals.

🆘 Support

For issues and questions:

  1. Check the API documentation at /docs
  2. Review the logs for error messages
  3. Open an issue on the repository

🔮 Future Enhancements

  • Integration with actual LLM APIs (OpenAI GPT, Claude)
  • Support for more document formats (DOCX, images)
  • Advanced search and filtering
  • User analytics and insights
  • Multi-tenant architecture
  • Real-time notifications
  • Mobile app support
  • HIPAA compliance features

