BioMistral Medical RAG Chatbot

Project Overview

Goal: Develop a chatbot to answer user queries about health concerns, symptoms, treatments, and more.
Purpose: Serve as a reliable and accessible resource for medical information and advice.
Learning Context: This project was completed as part of the learning stages of the GUVI SAWIT.AI Women-Only, Gen AI Learning Challenge.

Key Components

Llama Library: Loads and runs the pre-trained LLM that generates responses.
Retriever: Gathers context-relevant information to enhance LLM responses.
Prompt Template: Combines retriever results and user queries for accurate response generation.

Technologies Used

LangChain: For building the chatbot framework (text splitting and the RAG chain).
Hugging Face Transformers: For implementing the LLM and embedding model.
PyPDF2: For loading and parsing PDF documents.

Implementation

Retrieval Augmented Generation (RAG) Chain:
- Integrates the retriever, LLM, and prompt template.
- Facilitates dynamic retrieval and generation of medical information.

Development Steps

1. Environment Setup

Set up the development environment on Google Colab.
Install the required Python packages (LangChain, Sentence Transformers, Hugging Face Transformers, etc.).
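
A quick way to confirm the environment is ready is to check which modules can be imported before running the notebook. The sketch below is only an illustration: the helper `missing_packages` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions based on the tools this README mentions (in particular, `llama_cpp` assumes the "Llama library" is llama-cpp-python, and `chromadb` assumes the Chroma client package).

```python
import importlib.util
from typing import Iterable, List

# Assumed import names for the tools named in this README; adjust to match
# the packages actually installed in the notebook.
REQUIRED_MODULES = [
    "langchain",
    "sentence_transformers",
    "transformers",
    "chromadb",
    "PyPDF2",
    "docx",       # provided by the python-docx package
    "llama_cpp",  # assumption: the "Llama library" is llama-cpp-python
]


def missing_packages(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]
```

Calling `missing_packages(REQUIRED_MODULES)` in a Colab cell returns the names that still need to be installed, or an empty list when everything is available.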

2. Document Preprocessing

Import medical documents into the project environment.
Extract text using libraries such as PyPDF2 for PDFs and docx for Word files.
Use LangChain’s text splitter to chunk the text into manageable segments for retrieval.
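
To illustrate what chunking does, here is a minimal plain-Python sketch of fixed-size splitting with overlap. It is a stand-in for LangChain's text splitter, not a reproduction of it. The function name `split_text` and the default `chunk_size` and `chunk_overlap` values are assumptions chosen for illustration, not the settings used in the notebook.

```python
from typing import List


def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into character chunks of at most chunk_size.

    Consecutive chunks share chunk_overlap characters so that a sentence cut
    at a boundary still appears intact in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades a little storage for better recall: context that straddles a chunk boundary remains retrievable from at least one chunk.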

3. Creating Embeddings and Vector Store

Generate embeddings for text chunks using a pre-trained embedding model from Hugging Face.
Create a Chroma vector store for efficient storage and retrieval of embeddings.
Index text chunks with their corresponding embeddings for similarity search.
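
The idea behind indexing and similarity search can be sketched without any external libraries. In the example below, a toy word-count embedding stands in for the Hugging Face embedding model, and a small in-memory class stands in for the Chroma vector store. The names `embed`, `cosine`, and `ToyVectorStore` are hypothetical and are not part of either library.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return {word: float(n) for word, n in Counter(re.findall(r"[a-z0-9]+", text.lower())).items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """In-memory stand-in for a vector store: keeps (chunk, embedding) pairs."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, Dict[str, float]]] = []

    def add_texts(self, texts: List[str]) -> None:
        """Embed each chunk and index it alongside its text."""
        for text in texts:
            self._entries.append((text, embed(text)))

    def similarity_search(self, query: str, k: int = 2) -> List[str]:
        """Return the k chunks whose embeddings are most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

A real embedding model captures meaning rather than exact word overlap, so it can match a query to a chunk even when they share no words. The store-then-rank structure is the same.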

4. LLM Integration

Load a pre-trained LLM using the Llama library for generating informative responses.
Design a prompt template combining retrieved context and user queries.
Construct a Retrieval Augmented Generation (RAG) chain using LangChain’s chain classes to integrate the retriever, LLM, and prompt template.
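
The way the three pieces fit together can be sketched in plain Python. This is not LangChain's API: `PROMPT_TEMPLATE`, `build_prompt`, and `make_rag_chain` are hypothetical names, the prompt wording is an assumption, and the retriever and LLM are passed in as ordinary callables so that any implementation can be plugged in.

```python
from typing import Callable, List

# Assumed prompt wording; the notebook's actual template may differ.
PROMPT_TEMPLATE = (
    "Use the context below to answer the question. "
    "If the context is not sufficient, say so.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(context_chunks: List[str], question: str) -> str:
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


def make_rag_chain(
    retriever: Callable[[str], List[str]],
    llm: Callable[[str], str],
) -> Callable[[str], str]:
    """Compose retriever -> prompt template -> LLM into a single callable."""

    def chain(question: str) -> str:
        chunks = retriever(question)             # 1. fetch relevant context
        prompt = build_prompt(chunks, question)  # 2. combine context and query
        return llm(prompt)                       # 3. generate the answer

    return chain
```

With a retriever such as a vector store's similarity search and an LLM wrapper that maps a prompt string to generated text, `make_rag_chain(retriever, llm)("What are common symptoms of anemia?")` runs the full retrieve-then-generate flow.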

Testing

Tested the chatbot with a variety of medical prompts.
