Intelligent Image Recognition with OmniRL Vision-Language Understanding
visuAI is an advanced image recognition system that combines:
- Frontend: Angular app with TensorFlow.js (MobileNet) for fast classification
- Backend: FastAPI + OmniRL for intelligent descriptions and Q&A
- Install dependencies:
    npm install
- Run development server:
    ng serve
  Visit http://localhost:4200
- Navigate to backend:
    cd backend
- Create virtual environment:
    python -m venv venv
    venv\Scripts\activate # Windows
- Install dependencies:
    pip install -r requirements.txt
- Configure environment:
    copy .env.example .env
- Run server:
    python main.py
  API available at http://localhost:8000
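Once the backend is started, one way to confirm it is reachable is to request its interactive docs page at `/docs`. The sketch below uses only the Python standard library; the helper names `docs_url` and `backend_is_up` are hypothetical and are not part of the project.

```python
from urllib.error import URLError
from urllib.request import urlopen


def docs_url(base_url: str = "http://localhost:8000") -> str:
    """Build the docs URL from a backend base URL."""
    return base_url.rstrip("/") + "/docs"


def backend_is_up(base_url: str = "http://localhost:8000",
                  timeout: float = 2.0) -> bool:
    """Return True if the docs page answers with HTTP 200."""
    try:
        with urlopen(docs_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status:
        # treat all of these as "backend not running".
        return False
```

Calling `backend_is_up()` after `python main.py` should return `True` if the server is listening on the default port; it returns `False` rather than raising when nothing is listening.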
Image Upload → MobileNet (Browser) → Predictions
                                          ↓
                                   FastAPI Backend
                                          ↓
                                    OmniRL Model
                                          ↓
                                  Description + Q&A ← User
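The flow in the diagram can be sketched in plain Python. This is an illustration only: the `Prediction` and `AnalysisRequest` shapes, their field names, and `handle_request` are assumptions, and the OmniRL model is replaced by a stub that performs no inference.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    label: str          # class name reported by MobileNet in the browser
    probability: float  # confidence in [0, 1]


@dataclass
class AnalysisRequest:
    predictions: List[Prediction]
    question: Optional[str] = None  # set only when the user asks something


def stub_model(predictions: List[Prediction], question: Optional[str]) -> str:
    """Stand-in for the OmniRL model; it only echoes the top label."""
    top = max(predictions, key=lambda p: p.probability)
    if question is None:
        return f"The image most likely shows: {top.label}."
    return f"(stub) Asked {question!r}; top label is {top.label}."


def handle_request(request: AnalysisRequest) -> dict:
    """Backend step: forward browser predictions and the question to the model."""
    if not request.predictions:
        raise ValueError("at least one prediction is required")
    return {"answer": stub_model(request.predictions, request.question)}
```

Keeping the model behind a single function like `stub_model` is one way to let the backend run in mock mode while the real model is still being trained.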
- Fast Classification: TensorFlow.js runs in browser (no server needed)
- Real-time Results: Instant prediction probabilities
- Modern UI: Angular Material design
- Smart Descriptions: Converts predictions to natural language
- Visual Q&A: Answer questions about images
- Caching: Fast responses for repeated queries
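As a rough illustration of the description and caching features, the sketch below turns a ranked list of predictions into one sentence and memoizes the result with `functools.lru_cache`. The sentence template, the cache size, and the function name `describe` are assumptions; the project's actual descriptions come from the OmniRL backend.

```python
from functools import lru_cache
from typing import Tuple

# (label, probability) pairs; tuples are hashable, so they can be cache keys.
Predictions = Tuple[Tuple[str, float], ...]


@lru_cache(maxsize=256)
def describe(predictions: Predictions) -> str:
    """Turn classifier output into one sentence (template is an assumption)."""
    if not predictions:
        return "No confident prediction was produced."
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    label, prob = ranked[0]
    sentence = f"This looks like a {label} ({prob:.0%} confidence)"
    if len(ranked) > 1:
        # Mention the runner-up so the user sees the nearest alternative.
        sentence += f", possibly a {ranked[1][0]}"
    return sentence + "."
```

Repeated calls with the same predictions are served from the cache, which `describe.cache_info()` reports as hits.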
- Frontend: See Angular docs
- Backend API: http://localhost:8000/docs (when running)
- Implementation Plan: See project artifacts
- Frontend: Angular 18, TensorFlow.js, Material UI
- Backend: Python, FastAPI, PyTorch (OmniRL)
- ML Models: MobileNet (classification), Qwen2.5-VL-3B (VQA)
visuAI-tensorflow/
├── src/                 # Angular frontend
│   ├── app/
│   │   ├── components/
│   │   └── services/
│   └── assets/
├── backend/             # FastAPI backend
│   ├── main.py
│   ├── models/
│   ├── services/
│   └── training/
└── ...
✅ Phase 1: Backend structure complete (mock mode)
⏳ Phase 2: OmniRL training in progress
⏳ Phase 3: Frontend integration
MIT License
Contributions welcome! This is an experimental project for vision-language integration.
