A fully local, multimodal AI chatbot powered by Mistral 7B & LLaVA.
Chat with text, images, audio, and PDFs — all running on your machine.
| Capability | Description |
|---|---|
| 💬 Text Chat | Conversational AI with sliding window memory |
| 🖼️ Image Understanding | Upload images and ask questions — powered by LLaVA 1.5 + CLIP |
| 🎙️ Voice Input | Record mic audio or upload audio files, transcribed via Whisper |
| 📄 PDF Chat | Upload PDFs and chat with their content using RAG + ChromaDB |
| 🗂️ Session Management | Persistent SQLite-backed chat history with rename & delete |
| 🎨 Premium UI | Modern tabbed sidebar with glassmorphism & icons |
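
The sliding window memory listed in the table can be illustrated with a short sketch. This is not the project's own code (which may well rely on a LangChain memory class); the `SlidingWindowMemory` class and its method names are hypothetical, and the window of two exchanges is assumed from the `chat_memory_length` default in config.yaml.

```python
from collections import deque


class SlidingWindowMemory:
    """Hypothetical helper: keeps only the most recent `k` exchanges."""

    def __init__(self, k: int = 2):
        # A deque with maxlen silently drops the oldest item when full.
        self.exchanges = deque(maxlen=k)

    def add(self, user_msg: str, ai_msg: str) -> None:
        """Record one user/assistant exchange."""
        self.exchanges.append((user_msg, ai_msg))

    def as_prompt_history(self) -> str:
        """Render the retained exchanges as plain text for a prompt."""
        lines = []
        for user_msg, ai_msg in self.exchanges:
            lines.append(f"User: {user_msg}")
            lines.append(f"Assistant: {ai_msg}")
        return "\n".join(lines)


memory = SlidingWindowMemory(k=2)
memory.add("Hi", "Hello!")
memory.add("What is RAG?", "Retrieval-augmented generation.")
memory.add("And LLaVA?", "A vision-language model.")

# Only the last two exchanges remain; the first one has been dropped.
print(memory.as_prompt_history())
```

A small window keeps the prompt short, which matters with a 2048-token context, at the cost of the model forgetting earlier turns.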
LLaVa-Mistral-7b-Chatbot/
├── app.py # Streamlit UI & entry point
├── llm_chains.py # LLM chain builders (normal + PDF RAG)
├── INSTRUCTIONS.md # Detailed architecture & file guide 📖
├── prompt_templates.py # Mistral instruction-tuned prompt templates
├── image_handler.py # LLaVA multimodal image processing
├── audio_handler.py # Whisper audio transcription
├── pdf_handler.py # PDF parsing, chunking & vector DB ingestion
├── database_operations.py # SQLite session persistence
├── html_templates.py # Custom Streamlit HTML/CSS components
├── utils.py # Config loader & shared utilities
├── config.yaml # Central configuration file
└── requirements.txt
git clone https://github.com/YOUR_USERNAME/LLaVa-Mistral-7b-Chatbot.git
cd LLaVa-Mistral-7b-Chatbot
python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate
pip install -r requirements.txt

The project expects GGUF model files at the paths defined in config.yaml:
ctransformers:
  model_path:
    small: "./models/mistral-7b-instruct-v0.1.Q3_K_M.gguf"
    large: "./models/mistral-7b-instruct-v0.1.Q5_K_M.gguf"
llava_model:
  llava_model_path: "./models/llava/ggml-model-q5_k.gguf"
  clip_model_path: "./models/llava/Llama-3-Update-3.0-mmproj-model-f16.gguf"

Create a models/ directory and place the files there, or update the paths in config.yaml to match where your models are stored.
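
Before launching the app, it can help to confirm that every model file is where the configuration says it is. The sketch below is an illustration only: `find_missing_models` is a hypothetical helper, not part of the repository, and the dictionary simply mirrors the config.yaml excerpt above rather than loading the real file.

```python
from pathlib import Path


def find_missing_models(config: dict, base_dir: str = ".") -> list:
    """Return every .gguf path in a nested config dict that is not on disk."""
    missing = []
    for value in config.values():
        if isinstance(value, dict):
            # Recurse into nested sections such as ctransformers.model_path.
            missing.extend(find_missing_models(value, base_dir))
        elif isinstance(value, str) and value.endswith(".gguf"):
            if not (Path(base_dir) / value).is_file():
                missing.append(value)
    return missing


# Assumption: this dict mirrors the config.yaml excerpt. In practice you
# would load the file with yaml.safe_load() from the pyyaml package.
config = {
    "ctransformers": {
        "model_path": {
            "small": "./models/mistral-7b-instruct-v0.1.Q3_K_M.gguf",
            "large": "./models/mistral-7b-instruct-v0.1.Q5_K_M.gguf",
        }
    },
    "llava_model": {
        "llava_model_path": "./models/llava/ggml-model-q5_k.gguf",
        "clip_model_path": "./models/llava/Llama-3-Update-3.0-mmproj-model-f16.gguf",
    },
}

for path in find_missing_models(config):
    print(f"Missing model file: {path}")
```

Running this from the project root prints one line per missing file and nothing when all four models are in place.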
streamlit run app.py

All settings are in config.yaml:
| Key | Default | Description |
|---|---|---|
| model_path.small | Q3_K_M gguf | Lighter Mistral model |
| model_path.large | Q5_K_M gguf | Higher quality Mistral model |
| max_new_tokens | 100 | Max tokens per response |
| temperature | 0.1 | Sampling temperature |
| context_length | 2048 | Model context window |
| gpu_layers | 0 | Layers offloaded to GPU (0 = CPU only) |
| threads | -1 | CPU threads (-1 = auto) |
| chat_memory_length | 2 | Past exchanges kept in memory |
| number_of_retrieved_documents | 3 | Chunks retrieved during PDF RAG |
| chunk_size | 1024 | PDF text chunk size (characters) |
| overlap | 50 | Chunk overlap |
| whisper_model | openai/whisper-small | Whisper variant for transcription |
| embeddings_path | BAAI/bge-large-en-v1.5 | Embedding model for vector search |
| chromadb_path | chroma_db | Local ChromaDB storage path |
| chat_sessions_database_path | ./chat_sessions/chat_sessions.db | SQLite DB path |
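
To make the interaction between chunk_size and overlap concrete, here is a minimal plain-Python sketch of fixed-size character chunking with the defaults 1024 and 50. It is an assumption about the general technique, not the splitter used in pdf_handler.py (which may split on separators instead); `chunk_text` is a hypothetical name.

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 50) -> list:
    """Split text into chunks of up to chunk_size characters.

    Each chunk starts `chunk_size - overlap` characters after the previous
    one, so neighbouring chunks share `overlap` characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "x" * 2500
print([len(chunk) for chunk in chunk_text(sample)])  # [1024, 1024, 552]
```

With these defaults a 2500-character document yields three chunks, and all three would be candidates for retrieval when number_of_retrieved_documents is 3.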
streamlit
streamlit-mic-recorder
langchain
langchain-community
ctransformers
llama-cpp-python
transformers
librosa
pypdfium2
chromadb
sentence-transformers
InstructorEmbedding
pyyaml
You can download the GGUF models directly from Hugging Face:
- Mistral 7B Instruct: TheBloke/Mistral-7B-Instruct-v0.1-GGUF
- LLaVA 1.5 + CLIP: mys/ggml_llava-v1.5-7b
If your database fails to initialize or SQLite errors occur, ensure that the path defined in config.yaml exists, or delete chat_sessions/chat_sessions.db to let the app recreate it automatically.
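
One common cause of such errors is that SQLite creates a missing database file but not a missing parent directory. The sketch below shows one way to guard against that; `open_session_db` is a hypothetical helper, not a function from database_operations.py, and the temporary directory is used only so the example leaves no files behind.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_session_db(db_path: str) -> sqlite3.Connection:
    """Create the parent directory if needed, then open the database."""
    path = Path(db_path)
    # sqlite3.connect() creates the file but fails if the folder is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(str(path))


# Demonstration in a throwaway directory; in the app the path would come
# from chat_sessions_database_path in config.yaml.
with tempfile.TemporaryDirectory() as tmp:
    conn = open_session_db(f"{tmp}/chat_sessions/chat_sessions.db")
    conn.execute("SELECT 1")
    conn.close()
```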
To run model inference on GPU, adjust the gpu_layers parameter in config.yaml to offload layers to your GPU (e.g., CUDA or Apple Metal). Make sure you have installed llama-cpp-python compiled with GPU support.
- Fork the repository
- Create a feature branch: git checkout -b feat/your-feature
- Commit your changes: git commit -m "feat: add your feature"
- Push and open a Pull Request