Enterprise-grade deepfake detection across image, video, and audio.
DeepSafe is a modular platform that combines multiple state-of-the-art deepfake detection models into a single ensemble system. Each model runs in its own Docker container. A central API gateway orchestrates requests, dispatches them to model services, and fuses results using voting, averaging, or a trained stacking meta-learner.
Add a new model in minutes, retrain the ensemble in one command.
```mermaid
graph TD
    Browser["Browser :8888"] --> Nginx["Nginx + React SPA"]
    Nginx -->|"/api/*"| Gateway["FastAPI Gateway :8000"]
    Gateway --> NPR["NPR :5001<br/>(Image)"]
    Gateway --> UFD["UniversalFakeDetect :5004<br/>(Image)"]
    Gateway --> CEV["CrossEfficientViT :7001<br/>(Video)"]
    NPR --> Ensemble["Meta-Learner<br/>(Stacking / Voting / Average)"]
    UFD --> Ensemble
    CEV --> Ensemble
    Ensemble --> Verdict["Verdict + Confidence"]
```
Every model container exposes two endpoints: `GET /health` and `POST /predict`. The gateway reads model registrations from a single config file and handles the rest.
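The registry's exact schema isn't reproduced in this README; purely as a hypothetical illustration, an entry in `config/deepsafe_config.json` might pair a model's name with its media type and service port (all field names here are assumptions, not the actual schema):

```json
{
  "models": [
    {
      "name": "npr_deepfakedetection",
      "media_type": "image",
      "port": 5001,
      "enabled": true
    }
  ]
}
```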
```bash
git clone https://github.com/siddharthksah/DeepSafe.git
cd DeepSafe
make start
```

| URL | Description |
|---|---|
| http://localhost:8888 | Web dashboard |
| http://localhost:8000/docs | API documentation (Swagger) |
| Model | Type | Port | Description | Source |
|---|---|---|---|---|
| NPR Deepfake | Image | 5001 | Neural Pattern Recognition for subtle artifact detection | Paper / GitHub |
| UniversalFakeDetect | Image | 5004 | CLIP-based detector that generalizes across generators | Paper / GitHub |
| Cross-Efficient ViT | Video | 7001 | EfficientNet + Vision Transformer for video analysis | Paper / GitHub |
All model weights are mirrored on HuggingFace for resilience.
The API gateway supports three fusion strategies:
| Method | How it works |
|---|---|
| Voting | Majority vote across model predictions |
| Average | Mean of model probability scores |
| Stacking | Trained meta-learner (LightGBM/XGBoost/etc.) on model outputs |
Stacking uses pre-trained artifacts in `api/meta_model_artifacts/`. Retrain anytime with `make retrain MEDIA_TYPE=image`.
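All three strategies reduce the per-model fake probabilities to a single verdict. A minimal, self-contained sketch of the voting and averaging paths (function names are illustrative, not the gateway's actual API; stacking would instead feed the same probability vector to a fitted meta-learner):

```python
def fuse_average(probs, threshold=0.5):
    """Average fusion: mean of per-model fake probabilities."""
    score = sum(probs) / len(probs)
    return ("fake" if score >= threshold else "real"), score

def fuse_voting(probs, threshold=0.5):
    """Voting fusion: majority vote over thresholded per-model predictions."""
    votes = [p >= threshold for p in probs]
    fake_votes = sum(votes)
    verdict = "fake" if fake_votes > len(votes) / 2 else "real"
    return verdict, fake_votes / len(votes)

# Example: one model is certain (1.0), one leans fake (0.85), one leans real (0.2)
probs = [1.0, 0.85, 0.2]
```

With these inputs, both strategies return a "fake" verdict, but with different confidence scores (mean probability vs. vote fraction), which is why the API reports the method used alongside the score.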
DeepSafe ships with an SDK that handles HTTP serving, health checks, lazy loading, and thread safety. You write only the inference logic.
```bash
# Register model (updates config + compose + creates scaffold)
make add-model NAME=my_detector MEDIA_TYPE=image PORT=5008

# Implement your logic in models/image/my_detector/detector.py

# Build, start, and retrain ensemble
make start
make retrain MEDIA_TYPE=image
```

```python
# models/image/my_detector/detector.py
import torch

from deepsafe_sdk import ImageModel, PredictionResult


class MyDetector(ImageModel):
    def load(self):
        self.model = torch.load(self.weights_path("weights/model.pth"))
        self.model.eval()

    def predict(self, input_data: str, threshold: float) -> PredictionResult:
        image = self.decode_image(input_data)
        prob = self.run_inference(image)
        return self.make_result(probability=prob, threshold=threshold)
```

See docs/adding-a-model.md for the full guide.
After adding or removing models, retrain the stacking meta-learner:
```bash
# Full pipeline: run inference on benchmark dataset + train meta-learner
make retrain MEDIA_TYPE=image

# With Optuna hyperparameter search
make retrain MEDIA_TYPE=image OPTIMIZER=optuna TRIALS=100

# Re-train from existing features (skip inference)
make eval MEDIA_TYPE=image
```

The pipeline health-checks all active models, generates a feature matrix from the benchmark dataset, trains 8 classifiers (LogReg, RF, GBM, SVM, KNN, NB, XGBoost, LightGBM), and deploys the best performer.
A balanced multi-modal benchmark is available on HuggingFace:
| Modality | Real | Fake | Total | Generators |
|---|---|---|---|---|
| Images | 2,000 | 2,000 | 4,000 | DALL-E 2/3, Midjourney v5/6/7, Stable Diffusion 1.x/2/3/XL, Flux, GPT Image, Grok, Imagen 3/4, and 20+ more |
| Audio | 1,000 | 1,000 | 2,000 | HiFiGAN, MelGAN, WaveGlow, Tacotron, ASVspoof attacks, Neural Codec, and 20+ more |
| Video | 100 | 100 | 200 | Sora, Gen-2, Moonvalley, LaVie, ModelScope, LAVDF manipulations, and 10+ more |
```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"media_type": "image", "image_data": "<base64>", "ensemble_method": "voting"}'
```

```bash
curl -X POST http://localhost:8000/detect \
  -F "file=@photo.jpg" \
  -F "ensemble_method=stacking"
```

```json
{
  "verdict": "fake",
  "confidence_in_verdict": 0.92,
  "ensemble_score_is_fake": 0.92,
  "ensemble_method_used": "stacking",
  "model_results": {
    "npr_deepfakedetection": {"probability": 1.0, "class": "fake"},
    "universalfakedetect": {"probability": 0.85, "class": "fake"}
  }
}
```
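The `image_data` field carries the raw file bytes as base64. A minimal Python sketch of building that JSON body (field names taken from the curl example above; the helper name is ours, not part of DeepSafe):

```python
import base64
import json

def build_predict_payload(image_bytes: bytes, ensemble_method: str = "voting") -> str:
    """Encode raw image bytes into the JSON body expected by POST /predict."""
    return json.dumps({
        "media_type": "image",
        "image_data": base64.b64encode(image_bytes).decode("ascii"),
        "ensemble_method": ensemble_method,
    })

# Build a payload from file bytes (a real client would read them from disk)
payload = build_predict_payload(b"\x89PNG...", ensemble_method="stacking")
```

The resulting string can be sent with any HTTP client as the request body, with `Content-Type: application/json`.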
```bash
# Register
curl -X POST http://localhost:8000/register -d "username=user&password=pass"

# Login (returns JWT)
curl -X POST http://localhost:8000/token -d "username=user&password=pass"

# Use token for protected endpoints (/history, /users/me)
curl -H "Authorization: Bearer <token>" http://localhost:8000/history
```

```bash
make help      # Show all commands
make start     # Start all services
make stop      # Stop all services
make health    # Check model service health
make test      # Run system tests
make lint      # Run linters (black + flake8)
make add-model # Register a new model
make retrain   # Retrain ensemble meta-learner
make eval      # Re-train from existing features
make clean     # Remove containers and caches
```
```text
DeepSafe/
├── api/                           # FastAPI gateway
│   ├── main.py                    # API routes, ensemble logic, auth
│   ├── database.py                # SQLAlchemy models (analysis history)
│   └── meta_model_artifacts/      # Deployed meta-learner weights
├── sdk/                           # Model SDK (shared base classes)
│   └── deepsafe_sdk/              # ImageModel, VideoModel, AudioModel
├── models/
│   ├── image/
│   │   ├── npr_deepfakedetection/
│   │   └── universalfakedetect/
│   └── video/
│       └── cross_efficient_vit/
├── frontend/                      # React + Tailwind dashboard
├── config/
│   └── deepsafe_config.json       # Model registry
├── scripts/
│   ├── add_model.py               # Model integration CLI
│   ├── retrain_pipeline.py        # Ensemble retraining pipeline
│   └── health_check.py            # Service health checker
├── meta_feature_generator.py      # Meta-feature dataset builder
├── train_meta_learner_advanced.py # Meta-learner training suite
├── docker-compose.yml
└── Makefile
```
We welcome contributions; see CONTRIBUTING.md and the Code of Conduct.
The easiest way to contribute is to add a new detection model: see docs/adding-a-model.md.
MIT License. See LICENSE.
DeepSafe integrates the following open-source research:
- NPR Deepfake: Chuangchuang Tan et al. (GitHub)
- UniversalFakeDetect: Utkarsh Ojha, Yuheng Li, Yong Jae Lee (GitHub)
- Cross-Efficient ViT: Davide Coccomini et al. (GitHub)
- CLIP: Alec Radford et al., OpenAI (GitHub)
```bibtex
@misc{deepsafe,
  author    = {Siddharth Kumar},
  title     = {DeepSafe: Enterprise-Grade Deepfake Detection Platform},
  year      = {2025},
  publisher = {GitHub},
  url       = {https://github.com/siddharthksah/DeepSafe}
}
```


