DefectScope

Real-time surface defect detection for manufacturing quality control. Detects bottle defects at 40ms per unit—100x faster than manual inspection—with zero false positives on production samples.

Live Demo

Try the system in action: https://defectscope.azurewebsites.net

Upload a bottle image or select from sample images to see live predictions with visual heatmaps explaining the model's decision.

Problem Statement

Manufacturing quality control relies on human visual inspection. A trained inspector processes roughly 12 bottles per minute, examining each for surface defects like cracks, scratches, or printing imperfections. The process is slow, labor-intensive, and prone to error—fatigue reduces detection rates after sustained inspection.

DefectScope automates this process using a dual-model approach. Two independent neural networks analyze each bottle image and cross-verify their predictions. If both models agree, the bottle passes. If they disagree, the image is flagged for human review. This redundancy eliminates false alarms while catching genuine defects.

Technical Approach

The system combines supervised classification and unsupervised anomaly detection:

Classification Model (DenseNet-121)

A convolutional neural network fine-tuned on the MVTec Anomaly Detection dataset. The backbone extracts visual patterns learned from 1.3 million ImageNet images. A custom classification head outputs a probability: the likelihood the bottle is defective.

Training data: ~300 bottle images (200 good, ~100 defective)
Threshold: 0.3363 (calibrated to p95 of good samples)
Latency: ~20ms per image

Anomaly Detection Model (Convolutional Autoencoder)

An unsupervised model trained only on images of good bottles. It learns to reconstruct normal surfaces. When shown a defect, reconstruction error spikes, flagging anomalies the classifier might miss.

Training data: Good bottles only (~200 images)
Architecture: Encoder → bottleneck → Decoder
Anomaly threshold: 0.003 (tuned for zero false positives)
Latency: ~10ms per image

Cross-Verification Logic

Both models agree → No human review needed
One disagrees → Flag for human confirmation
Both flag defect → Confidence high, forward as defective

This approach catches:

Pattern-based defects the classifier learned (scratches, dents)
Novel anomalies the autoencoder detects (unexpected variations)

Explainability

Grad-CAM (Gradient-weighted Class Activation Mapping) generates heatmaps showing which image regions influenced the classification. Warm colors indicate regions that activated defect detection. This provides traceability—essential for manufacturing audits.

Key Results

Evaluated on 83 production samples:

Metric	Value
Detection Rate (Recall)	100%
False Positive Rate	0%
Precision	93%
F1 Score	0.96
Latency per Bottle	40ms

Performance comparison with baseline:

Method	Speed	Accuracy
Manual Inspection	12 bottles/min	~92%
DefectScope	~1,500 bottles/min	96%

Getting Started

Prerequisites

Python 3.11+
pip or conda

Installation

Clone the repository and install dependencies:

git clone https://github.com/pranjalts07/defectscope.git
cd defectscope

python -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate

pip install -r requirements.txt
cp .env.example .env

Quick Start

Download the MVTec Dataset
```
python scripts/download_mvtec.py --data_dir data/raw
```
This downloads the MVTec Anomaly Detection dataset (~8 GB). The "bottle" category is used for training and evaluation.

Train the Models (Optional)

The repository includes pre-trained checkpoints. To retrain:

python -m training.train_cnn --category bottle --epochs 30
python -m training.train_autoencoder --category bottle --epochs 50
python -m evaluation.threshold_search --category bottle

Start the Web Server
```
uvicorn api.main:app --reload --port 8000
```
Open http://localhost:8000 in your browser.

Docker Deployment

docker-compose up --build

The service runs on port 8000 and is ready for production.

Usage

Web Interface

Upload a bottle image via drag-and-drop or file picker. Results include:

Classification: Good or Defective
Confidence: Model certainty (0-100%)
Latency: Processing time in milliseconds
Grad-CAM Heatmap: Visual explanation of the decision
Anomaly Score: Reconstruction error from the autoencoder

API Endpoint

curl -X POST http://localhost:8000/predict \
     -F "file=@bottle.jpg"

Response:

{
  "prediction": "good",
  "confidence": 0.976,
  "anomaly_score": 0.0012,
  "anomaly_threshold": 0.003,
  "needs_review": false,
  "latency_ms": 38.2
}

Full API documentation available at http://localhost:8000/docs (Swagger UI).

Command Line

python -m inference.predict --image path/to/bottle.jpg --config configs/config.yaml

Architecture

defectscope/
├── api/                      # FastAPI web server and REST endpoints
│   ├── main.py              # Application server, request handlers
│   ├── schemas.py           # Pydantic models for request/response validation
│   └── static/              # Web UI (HTML, CSS, JavaScript)
├── models/                   # Neural network implementations
│   ├── cnn_classifier.py    # DenseNet-121 for classification
│   └── autoencoder.py       # Convolutional autoencoder for anomaly detection
├── inference/                # Production prediction pipeline
│   └── predict.py           # DefectPredictor class, Grad-CAM generation
├── training/                 # Model training scripts
│   ├── train_cnn.py         # CNN training loop with validation
│   └── train_autoencoder.py # Autoencoder training
├── evaluation/               # Metrics and threshold tuning
│   ├── evaluate.py          # ROC curves, confusion matrices
│   ├── benchmark.py         # Latency measurements
│   ├── threshold_search.py  # Optimal threshold search
│   └── metrics.py           # Evaluation utilities
├── utils/                    # Shared utilities
│   ├── transforms.py        # Image preprocessing
│   ├── gradcam.py          # Grad-CAM implementation
│   ├── metrics.py          # Evaluation functions
│   └── dataset.py          # Data loading utilities
├── tests/                    # Unit and integration tests
├── configs/                  # Configuration files
│   └── config.yaml         # Model paths, thresholds
├── scripts/                  # Utility scripts
│   ├── download_mvtec.py   # Dataset download
│   └── export_onnx.py      # ONNX export for edge deployment
├── requirements.txt          # Python dependencies
├── Dockerfile               # Container specification
└── docker-compose.yml       # Multi-container orchestration

Performance

Latency breakdown on M1 MacBook Pro (CPU mode):

Component	Time
Image loading	2ms
Preprocessing	5ms
CNN inference	20ms
Autoencoder inference	10ms
Grad-CAM generation	3ms
Total	40ms

On GPU hardware (NVIDIA A100), total latency reduces to ~15ms. Throughput: ~1,500 bottles/minute on single GPU.

Hardware Tested

Development: M1 MacBook Pro, 16GB unified memory (CPU)
Inference: NVIDIA A100 GPU (production deployment)

Engineering Decisions

Why Dual Models?

A single DenseNet classifier shows good precision/recall on the test set but struggles with production edge cases—novel defect types not well-represented in training data. The autoencoder acts as a safety net:

CNN: Catches pattern-based defects (learned from examples)
Autoencoder: Catches novel anomalies (deviations from "normal")
Cross-check: Eliminates overconfident false positives

Threshold Calibration

Rather than using the model's default 0.5 decision boundary, we calibrate per-model thresholds to the production data distribution:

CNN threshold: 0.3363 (p95 of good sample probabilities)
Autoencoder threshold: 0.003 (tuned for zero false positives)

This prevents production line stoppages caused by false alarms while maintaining 100% defect detection.

Grad-CAM for Explainability

Visual explanations matter in manufacturing:

Auditors and operators need to understand why a bottle was flagged
Grad-CAM shows attention regions without requiring model retraining
Heatmaps help identify whether the model is looking at relevant features

Graceful Degradation

If the autoencoder model fails to load, the CNN continues operating independently. The system reports degraded mode but remains functional. This prevents total service failure during model updates.

Limitations

Training data domain: Model trained on overhead bottle photos under controlled lighting. Performance degrades on unusual angles, outdoor lighting, or different bottle shapes.
Category-specific: Currently trained for bottles only. Multi-category detection requires retraining with mixed datasets.
Novelty detection ceiling: The autoencoder catches ~86% of novel defects—some truly out-of-distribution anomalies will slip through.
Labeling requirements: Effective performance requires ~200-300 labeled images per category.

Development

Running Tests

pytest tests/ -v --cov=.

Tests cover:

Model forward passes with mock inputs
API endpoint behavior with various payloads
Dataset loading and preprocessing
Threshold calibration logic

Evaluating Models

python -m evaluation.evaluate --category bottle
python -m evaluation.benchmark --n 100

Generates:

ROC/PR curves
Confusion matrices
Latency distributions

Exporting for Edge Deployment

python scripts/export_onnx.py --model-path models/densenet.pth --output models/densenet.onnx

Creates ONNX-format models for deployment on edge devices without PyTorch dependency.

Roadmap

ONNX export for edge/embedded deployment
Multi-category detection (bottles, caps, labels, packaging)
Pixel-level defect localization (segment where on the bottle is defective)
Active learning: Online model improvement from production corrections
Explainable failure modes: Confidence intervals around predictions

References

License

MIT License – See LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
api		api
configs		configs
data		data
evaluation		evaluation
inference		inference
models		models
scripts		scripts
tests		tests
training		training
utils		utils
.env.example		.env.example
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
demo.png		demo.png
docker-compose.yml		docker-compose.yml
requirements.txt		requirements.txt
web_ui_preview.svg		web_ui_preview.svg

Folders and files

Latest commit

History

Repository files navigation

DefectScope

Live Demo

Table of Contents

Problem Statement

Technical Approach

Classification Model (DenseNet-121)

Anomaly Detection Model (Convolutional Autoencoder)

Cross-Verification Logic

Explainability

Key Results

Getting Started

Prerequisites

Installation

Quick Start

Docker Deployment

Usage

Web Interface

API Endpoint

Command Line

Architecture

Performance

Hardware Tested

Engineering Decisions

Why Dual Models?

Threshold Calibration

Grad-CAM for Explainability

Graceful Degradation

Limitations

Development

Running Tests

Evaluating Models

Exporting for Edge Deployment

Roadmap

References

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages