🚀 TrainFlowVision

TrainFlowVision is a full-stack active learning platform for building, reviewing, training, improving, and managing computer vision models.

It combines an Angular frontend (FE), a FastAPI backend (BE), PostgreSQL persistence, Docker-based development, and YOLO-based object detection workflows. The platform is designed for real-world image annotation, model training, experiment tracking, model rollback, and future deployment to drones, cloud GPUs, or edge AI devices.

TrainFlowVision started as an object detection training tool, but the long-term goal is to make it a general-purpose computer vision training platform where users can upload data, review predictions, correct labels, train models, compare results, restore older models, and continuously improve detection quality through active learning.

🎯 Why TrainFlowVision

Training a computer vision model is not only about running a training command.

A real ML workflow needs:

Clean dataset handling
Annotation review
Dataset versioning
Training history
Model metrics
Model registry
Model restore
User preferences
Compute visibility
A clear way to understand whether the model is improving or getting worse

TrainFlowVision focuses on the complete feedback loop:

Upload data
Review predictions
Fix annotations
Train or refine the model
Track metrics
Compare history
Restore older models
Improve through active learning

The goal is to move beyond a simple training script and build the system around the model.

✨ Core Features

Image upload and Label Studio YOLO ZIP import
Interactive annotation review dashboard
Editable bounding boxes for model-detected and manually created annotations
Move, resize, delete, and undo support for annotation boxes
Last selected class memory for faster labeling
Confidence display for model predictions
Class-based annotation colors
Consistent annotation styling between model and human-created boxes
User preferences for annotation and UI behavior
Active learning workflow for continuous model improvement
Local training with CPU or CUDA GPU support
Live compute status and training logs
PostgreSQL-backed dataset and training history
Neural History panel for training run tracking
Detailed training run modal with dataset, config, metrics, and model output
Model rollback and restore support
Swagger API documentation through FastAPI
Docker-based local development setup

🧠 Active Learning Workflow

TrainFlowVision is designed around human-in-the-loop model improvement.

The model can make predictions, but the user stays in control by reviewing and correcting detections before the data is used for future training.

The workflow supports:

Uploading new images
Importing labeled datasets from Label Studio
Reviewing model-detected boxes
Adding missing boxes manually
Correcting wrong boxes
Moving boxes
Resizing boxes
Removing false positives
Saving reviewed labels
Training or refining the model
Tracking whether model quality improves over time

This makes the platform useful for practical ML workflows where model predictions are helpful, but still need human correction.
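
Reviewed labels and Label Studio YOLO exports both rely on the standard YOLO detection label line, `class_id x_center y_center width height`, with coordinates normalized to the image size. The sketch below shows one way to convert between that format and pixel corners for an editor. The `PixelBox` type and both function names are hypothetical and are not taken from the TrainFlowVision codebase.

```python
from dataclasses import dataclass


@dataclass
class PixelBox:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def yolo_line_to_pixels(line: str, img_w: int, img_h: int) -> PixelBox:
    """Parse 'class cx cy w h' (normalized 0..1) into pixel corners."""
    class_id, cx, cy, w, h = line.split()
    cx_px, cy_px = float(cx) * img_w, float(cy) * img_h
    w_px, h_px = float(w) * img_w, float(h) * img_h
    return PixelBox(
        int(class_id),
        cx_px - w_px / 2,
        cy_px - h_px / 2,
        cx_px + w_px / 2,
        cy_px + h_px / 2,
    )


def pixels_to_yolo_line(box: PixelBox, img_w: int, img_h: int) -> str:
    """Convert pixel corners back into a normalized YOLO label line."""
    cx = (box.x_min + box.x_max) / 2 / img_w
    cy = (box.y_min + box.y_max) / 2 / img_h
    w = (box.x_max - box.x_min) / img_w
    h = (box.y_max - box.y_min) / img_h
    return f"{box.class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

A box edited in the review UI could be saved by passing its pixel corners through `pixels_to_yolo_line` with the original image dimensions.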

🖼️ Annotation Review

The annotation review UI allows users to work with both model predictions and manual annotations.

Supported annotation actions:

Create new boxes
Move existing boxes
Resize boxes
Delete boxes
Change class labels
Undo recent changes with Ctrl + Z
View model confidence on detected boxes
Keep class colors consistent across manual and model-detected boxes
Show source indicators for human-created or model-detected annotations
Use visual editing states while moving or resizing boxes

This is important because real-world annotations are rarely perfect on the first try. The system is designed to help users quickly correct close detections instead of deleting and redrawing everything.
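
The undo behavior can be pictured as a snapshot stack: every edit saves the previous box list, and Ctrl + Z pops the last snapshot. The following is a minimal, language-neutral sketch written in Python. The real editor lives in the Angular FE, and the `AnnotationHistory` class and box dictionary fields are assumptions made for illustration.

```python
import copy
from typing import Callable


class AnnotationHistory:
    """Snapshot-based undo for a list of box dicts (hypothetical sketch)."""

    def __init__(self, boxes: list[dict]):
        self.boxes = boxes
        self._undo_stack: list[list[dict]] = []

    def apply(self, edit: Callable[[list[dict]], None]) -> None:
        """Save a snapshot, then run an in-place edit (move, resize, delete)."""
        self._undo_stack.append(copy.deepcopy(self.boxes))
        edit(self.boxes)

    def undo(self) -> bool:
        """Restore the most recent snapshot; return False if nothing to undo."""
        if not self._undo_stack:
            return False
        self.boxes = self._undo_stack.pop()
        return True


history = AnnotationHistory(
    [{"id": 1, "x": 10, "y": 10, "w": 50, "h": 40, "source": "model"}]
)
history.apply(lambda boxes: boxes[0].update(x=30))  # move the box
history.apply(lambda boxes: boxes.clear())          # delete the box
history.undo()                                      # the deleted box is back
```

Full snapshots are simple and hard to get wrong. A command-based history would use less memory on images with many boxes, at the cost of more code.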

📊 Neural History

Neural History tracks training runs and helps users understand how the model changes over time.

The history panel is designed to show compact run information first, then provide deeper details when a user opens a run.

Useful details can include:

Training run status
Dataset version
Dataset source
Image count
Label count
Class information
Training configuration
Device used
Precision
Recall
mAP50
mAP50-95
Loss values
Best model path
Last model path
Active model status
Restore availability
Failed run errors when available

The goal is to help answer:

Which model is best?
Which dataset trained it?
What settings were used?
Did the model improve or get worse?
Can I safely restore an older model?

Neural History turns training into a traceable experiment workflow instead of a black box.
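
As a rough illustration of the "did the model improve or get worse?" question, two runs can be compared on a single headline metric. The sketch below uses mAP50-95 with a small tolerance. The `RunMetrics` type, the `compare_runs` function, and the tolerance value are assumptions, not the platform's actual comparison logic.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Headline metrics for one training run (hypothetical record shape)."""
    run_id: str
    precision: float
    recall: float
    map50: float
    map50_95: float


def compare_runs(previous: RunMetrics, candidate: RunMetrics,
                 tolerance: float = 0.005) -> str:
    """Classify the candidate run relative to the previous run.

    Differences within the tolerance are treated as noise (an assumed
    threshold, not a recommended value).
    """
    delta = candidate.map50_95 - previous.map50_95
    if delta > tolerance:
        return "improved"
    if delta < -tolerance:
        return "regressed"
    return "unchanged"


baseline = RunMetrics("run-1", precision=0.80, recall=0.70, map50=0.75, map50_95=0.41)
refined = RunMetrics("run-2", precision=0.83, recall=0.74, map50=0.79, map50_95=0.45)
print(compare_runs(baseline, refined))  # improved
```

A single metric hides per-class changes, so a fuller comparison would also look at precision and recall per class.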

🏗️ Architecture Overview

TrainFlowVision is designed as a modular full-stack ML platform, not just a single training script.

The architecture separates FE, BE, database, ML orchestration, dataset management, model history, and future deployment concerns.

User
  |
  v
Angular FE
  |
  v
FastAPI BE
  |
  +--> PostgreSQL
  |      |
  |      +--> Projects
  |      +--> Dataset versions
  |      +--> Training runs
  |      +--> Model registry
  |      +--> Training metrics
  |      +--> User preferences
  |
  +--> File Storage
  |      |
  |      +--> Uploaded images
  |      +--> Label files
  |      +--> Dataset exports
  |      +--> Trained model files
  |
  +--> ML Orchestration Layer
         |
         +--> Prediction
         +--> Annotation review
         +--> Training
         +--> Evaluation
         +--> Model restore
         +--> Future deployment

The goal of this architecture is to keep the platform easy to extend. Training, prediction, dataset management, and model tracking are separated so the system can grow from a local development tool into a production-style ML platform.

🧩 FE Layer

The Angular FE provides the interactive product experience.

It includes:

Dataset upload flow
Label Studio ZIP import flow
Image review dashboard
Annotation editor
Bounding box editing
Class selection
Undo behavior
User preferences
Training controls
Compute status display
Live training logs
Neural History panel
Training run details modal
Model restore actions

The FE is designed to support fast human review. This matters because active learning depends on quick correction of model mistakes.

⚙️ BE Layer

The FastAPI BE acts as the central orchestration layer.

It handles:

Project APIs
Upload processing
Label Studio dataset parsing
Image and label validation
Dataset version creation
Training run creation
Training execution
Prediction execution
Model registry updates
Compute capability detection
Training history APIs
User preference APIs
Swagger API documentation

The BE is designed so ML workflows are not hidden inside one script. Each important action can be tracked, inspected, debugged, and later moved to a worker or cloud GPU service.

🗄️ Database Layer

PostgreSQL is used as the source of truth for platform state.

It can store:

Projects
Dataset versions
Image metadata
Label metadata
Training runs
Training metrics
Model registry
Model restore history
User preferences
Future audit logs
Future deployment records

This allows the system to answer important ML engineering questions:

Which dataset trained this model?
Which model is currently active?
What changed between two training runs?
Which run produced the best model?
Can an older model be restored safely?
Did the model improve or get worse?

🧠 ML Orchestration Layer

The ML layer currently focuses on YOLO-based object detection, but the architecture is designed to support more than one model type over time.

Current ML capabilities include:

YOLO-based object detection
Local training
Local prediction
CPU fallback
CUDA GPU support
Active learning review loop
Model restore support
Training metrics tracking
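
CPU fallback and CUDA support come down to a device check before training or prediction starts. A minimal sketch, assuming PyTorch is the library that reports GPU availability, is shown below. The `detect_device` function name is hypothetical.

```python
def detect_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu".

    PyTorch is imported lazily so the check still works, and falls back
    to CPU, on machines where it is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Training would run on: {detect_device()}")
```

The same value could also back the compute status shown in the UI, so users can see which device a run will use before it starts.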

Future ML support can include:

Any YOLO detection model
YOLOv8, YOLOv9, YOLOv10, YOLO11, or future YOLO versions
Oriented bounding box models
Segmentation models
Classification models
Custom PyTorch models
ONNX export
TensorRT optimization
Edge device deployment
Drone-ready inference pipelines

The goal is to keep the ML backend flexible so the platform is not locked to one specific model version.

📦 Dataset Versioning

TrainFlowVision is designed around the idea that dataset changes matter as much as code changes.

Every upload, review, correction, or Label Studio import can become part of a dataset version.

A dataset version can track:

Source type
Image count
Label count
Class names
Class distribution
Reviewed images
Skipped images
Rejected detections
Created date
Notes

This makes training reproducible and easier to debug.
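
To make this concrete, the sketch below derives a few of the tracked fields from YOLO label text: image count, label count, class distribution, and a content fingerprint that changes whenever any label changes. The `summarize_labels` function, the fingerprint scheme, and the example class names are assumptions for illustration. It also assumes one label file per image.

```python
import hashlib
from collections import Counter


def summarize_labels(label_files: dict[str, str], class_names: list[str]) -> dict:
    """Summarize YOLO label files given as {file name: file text}."""
    counts: Counter = Counter()
    digest = hashlib.sha256()
    label_count = 0
    # Sort by name so the fingerprint does not depend on dict ordering.
    for name in sorted(label_files):
        text = label_files[name]
        digest.update(name.encode("utf-8"))
        digest.update(text.encode("utf-8"))
        for line in text.splitlines():
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            counts[class_names[int(parts[0])]] += 1
            label_count += 1
    return {
        "image_count": len(label_files),  # assumes one label file per image
        "label_count": label_count,
        "class_distribution": dict(counts),
        "fingerprint": digest.hexdigest()[:12],
    }


labels = {
    "img_001.txt": "0 0.5 0.5 0.2 0.2\n1 0.3 0.3 0.1 0.1\n",
    "img_002.txt": "0 0.4 0.4 0.2 0.2\n",
}
summary = summarize_labels(labels, ["bottle", "bag"])
print(summary["class_distribution"])  # {'bottle': 2, 'bag': 1}
```

Storing the fingerprint with each training run would make it possible to check later that a run used exactly the labels recorded for its dataset version.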

📚 Model Registry

The model registry connects a trained model file to the training run that created it.

A model record can include:

Model path
Dataset version
Training run id
Task type
Model type
Active model flag
Metric summary
Created date
Restore status

This is important because in real ML systems, the model file alone is not enough. The system also needs to know the dataset, settings, metrics, and history behind that model.
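
A restore operation can be thought of as moving the active flag from one registry record to another while keeping every record. The in-memory sketch below illustrates that idea. The `ModelRecord` and `ModelRegistry` classes, their field names, and the example paths are hypothetical, and a real implementation would persist these records in PostgreSQL.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelRecord:
    """One trained model linked to the run that produced it (hypothetical)."""
    model_path: str
    training_run_id: int
    dataset_version: str
    metric_summary: dict = field(default_factory=dict)
    is_active: bool = False


class ModelRegistry:
    """In-memory stand-in for a database-backed registry."""

    def __init__(self) -> None:
        self._records: dict[int, ModelRecord] = {}

    def register(self, record: ModelRecord) -> None:
        self._records[record.training_run_id] = record

    def activate(self, training_run_id: int) -> ModelRecord:
        """Make one model active and clear the flag on all others."""
        if training_run_id not in self._records:
            raise KeyError(f"No model registered for run {training_run_id}")
        for record in self._records.values():
            record.is_active = record.training_run_id == training_run_id
        return self._records[training_run_id]

    def active(self) -> Optional[ModelRecord]:
        return next((r for r in self._records.values() if r.is_active), None)


registry = ModelRegistry()
registry.register(ModelRecord("runs/1/best.pt", 1, "v1", {"map50_95": 0.41}))
registry.register(ModelRecord("runs/2/best.pt", 2, "v2", {"map50_95": 0.38}))
registry.activate(2)  # newest model goes live
registry.activate(1)  # restore the older, better model
```

Keeping the flag change in one method makes it harder to end up with two active models at once.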

🌍 Real-World Use Cases

TrainFlowVision is designed for practical computer vision workflows such as:

Drone-based garbage detection near water bodies
Pothole and road damage detection
Plant and flower detection from top-view images
Field inspection and infrastructure monitoring
Custom object detection projects
Human-in-the-loop active learning workflows
Edge AI and drone-ready model experimentation
Construction site inspection
Agriculture monitoring
Environmental cleanup detection
Industrial visual inspection

🛠️ Tech Stack

FE

Angular
TypeScript
SCSS
Interactive annotation UI

BE

FastAPI
Python
SQLAlchemy
PostgreSQL
Docker

ML

YOLO-based object detection
PyTorch
Ultralytics
CUDA GPU support
Active learning workflow

Development

Docker Compose
Swagger API docs
Local GPU or CPU execution
PostgreSQL through Docker

⚡ Quick Start

Windows

Double-click:

start_dev.bat

Or run it from a terminal:

start_dev.bat

Linux or macOS

chmod +x start_dev.sh
./start_dev.sh

Manual Development Setup

See:

QUICKSTART.md

🌐 Access Points

Frontend:

http://localhost:4200

Backend API:

http://localhost:8000

Swagger API Docs:

http://localhost:8000/docs

📚 Documentation

Quick Start Guide: QUICKSTART.md
Development Walkthrough: walkthrough.md

📝 Requirements

Python 3.11
Node.js 18 or higher
Docker and Docker Compose
PostgreSQL through Docker
NVIDIA GPU with CUDA support (optional)
CPU fallback when CUDA is not available

🧪 Current Project Direction

TrainFlowVision is currently focused on building a reliable end-to-end active learning workflow.

The current engineering focus is:

Better annotation review experience
Reliable bounding box editing
Undo support for annotation actions
Consistent class color behavior
User preferences
Persistent training history
Training run details modal
Database-backed model tracking
CUDA and CPU compute visibility
Stable Label Studio ZIP import
Dataset and training flow consistency
Model restore support

The project is moving toward a more complete ML platform where every model improvement can be tracked, explained, reproduced, and restored.

🔮 Future Platform Roadmap

TrainFlowVision is being designed to grow into a general-purpose computer vision training and deployment platform.

🧠 Model Support

Future model support can include:

Multiple YOLO versions
Custom YOLO weights
Object detection
Oriented bounding boxes
Image segmentation
Image classification
Custom PyTorch models
ONNX export
TensorRT optimized models
Edge-ready model packaging

📊 Experiment Tracking

Future experiment tracking can include:

Training run comparison
Metric trend charts
Per-class precision and recall
Class imbalance warnings
Dataset quality score
Best model recommendation
Model regression detection
Training configuration diff
Previous model vs new model comparison

🗃️ Dataset Management

Future dataset tools can include:

Dataset version diff
Image duplicate detection
Label quality validation
Class distribution charts
Train, validation, test split management
Dataset export
Dataset rollback
Bad image detection
Low-confidence sample queue
Active learning sample prioritization
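
One possible shape for a low-confidence sample queue is least-confidence ordering: images whose weakest detection has the lowest confidence are reviewed first. The sketch below is an assumption about how such a queue could work, not a description of a planned implementation. Treating images with no detections as most uncertain is also an assumption.

```python
def prioritize_for_review(predictions: dict[str, list[float]],
                          limit: int = 10) -> list[str]:
    """Order images by uncertainty, most uncertain first.

    `predictions` maps an image name to the confidence scores of its
    detections. Uncertainty is 1 minus the lowest confidence. Images with
    no detections get the maximum uncertainty of 1.0 (assumed policy).
    """
    def uncertainty(confidences: list[float]) -> float:
        return 1.0 if not confidences else 1.0 - min(confidences)

    # Sort by descending uncertainty, then by name for a stable order.
    ranked = sorted(predictions, key=lambda name: (-uncertainty(predictions[name]), name))
    return ranked[:limit]


queue = prioritize_for_review(
    {"a.jpg": [0.90, 0.95], "b.jpg": [0.35, 0.80], "c.jpg": []},
    limit=2,
)
print(queue)  # ['c.jpg', 'b.jpg']
```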

🖼️ Annotation and Review

Future annotation features can include:

Video frame annotation
Smart object selection
Color-based object region selection
Polygon annotation
Segmentation masks
Bulk class update
Keyboard-first review flow
Review assignment
Annotation audit trail
Human vs model source tracking

⚡ Training and Compute

Future training features can include:

Background training queue
Cloud GPU training
Training cancellation
Training pause and resume
Multi-GPU support
Scheduled training
Training presets
Hyperparameter management
Automatic best model selection
Training cost tracking

🚁 Drone and Edge AI

Future drone and edge workflows can include:

Drone image sequence support
Top-view detection optimization
Offline inference package
Jetson deployment
ONNX or TensorRT export
Real-time inference API
Video stream inference
Field inspection dashboard
GPS metadata support
Detection heatmaps

👥 Multi-User Platform

Future platform features can include:

User authentication
Role-based access
Multi-project support
Team review workflow
User preferences per account
Project-level settings
Dataset ownership
Model approval flow
Audit logs

🚀 Future Deployment Architecture

The current system is local-first, but it is being designed with future deployment in mind.

A future production-style architecture could include:

Angular FE
  |
FastAPI API Gateway
  |
  +--> PostgreSQL
  +--> Object Storage
  +--> Redis Queue
  +--> Training Worker
  +--> Inference Worker
  +--> Model Registry
  +--> Cloud GPU Provider
  +--> Edge Device Export

Future deployment options could include:

Local GPU workstation
Cloud GPU training
RunPod or Vast.ai training workers
AWS, Azure, or Google Cloud GPU instances
Dockerized inference service
Jetson or edge device deployment
Drone-mounted inference module
Remote training queue
Background job processing
Multi-user project management

The platform is being built step by step so it can grow from a local development project into a production-style ML platform.

💡 Engineering Value

TrainFlowVision demonstrates the ability to build an applied AI product from end to end.

It shows experience across:

Full-stack architecture
Angular FE development
FastAPI BE development
PostgreSQL data modeling
Docker-based development
ML workflow orchestration
YOLO-based computer vision
Human-in-the-loop annotation review
Active learning workflow design
GPU-aware training infrastructure
Experiment tracking
Model registry design
Future cloud and edge deployment planning

The project is not only about training a model. It is about building the system around the model.

That system includes data, review, training, metrics, history, restore, preferences, and future deployment. This is the difference between a small ML demo and a real ML engineering platform.

📌 Project Summary

TrainFlowVision is a practical full-stack ML engineering project focused on real-world computer vision workflows.

It connects:

Data upload
Annotation review
Model prediction
Human correction
Training
Metrics
History
Model restore
Future deployment

The system is built to grow from a local active learning platform into a future-ready computer vision product for drone, edge, and cloud-based ML workflows.
