Enterprise-grade deepfake detection across image, video, and audio.
DeepSafe is a modular platform that combines multiple state-of-the-art deepfake detection models into a single ensemble system. Each model runs in its own Docker container. A central API gateway orchestrates requests, dispatches them to model services, and fuses results using voting, averaging, or a trained stacking meta-learner.
Add a new model in minutes, retrain the ensemble in one command.
```mermaid
graph TD
    Browser["Browser :8888"] --> Nginx["Nginx + React SPA"]
    Nginx -->|"/api/*"| Gateway["FastAPI Gateway :8000"]
    Gateway --> NPR["NPR :5001<br/>(Image)"]
    Gateway --> UFD["UniversalFakeDetect :5004<br/>(Image)"]
    Gateway --> CEV["CrossEfficientViT :7001<br/>(Video)"]
    NPR --> Ensemble["Meta-Learner<br/>(Stacking / Voting / Average)"]
    UFD --> Ensemble
    CEV --> Ensemble
    Ensemble --> Verdict["Verdict + Confidence"]
```
Every model container exposes two endpoints: `GET /health` and `POST /predict`. The gateway reads model registrations from a single config file and handles the rest.
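The registry's exact schema isn't reproduced in this README; purely as a hypothetical illustration, an entry in `config/deepsafe_config.json` might pair a model's name with its media type and service port (all field names here are assumptions, not the actual schema):

```json
{
  "models": [
    {
      "name": "npr_deepfakedetection",
      "media_type": "image",
      "port": 5001,
      "enabled": true
    }
  ]
}
```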
```bash
git clone https://github.com/siddharthksah/DeepSafe.git
cd DeepSafe
make start
```

| URL | Description |
|---|---|
| http://localhost:8888 | Web dashboard |
| http://localhost:8000/docs | API documentation (Swagger) |
| Model | Type | Port | Description | Source |
|---|---|---|---|---|
| NPR Deepfake | Image | 5001 | Neural Pattern Recognition for subtle artifact detection | Paper / GitHub |
| UniversalFakeDetect | Image | 5004 | CLIP-based detector that generalizes across generators | Paper / GitHub |
| Cross-Efficient ViT | Video | 7001 | EfficientNet + Vision Transformer for video analysis | Paper / GitHub |
All model weights are mirrored on HuggingFace for resilience.
The API gateway supports three fusion strategies:
| Method | How it works |
|---|---|
| Voting | Majority vote across model predictions |
| Average | Mean of model probability scores |
| Stacking | Trained meta-learner (LightGBM/XGBoost/etc.) on model outputs |
Stacking uses pre-trained artifacts in `api/meta_model_artifacts/`. Retrain anytime with `make retrain MEDIA_TYPE=image`.
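All three strategies reduce the per-model fake probabilities to a single verdict. A minimal, self-contained sketch of the voting and averaging paths (function names are illustrative, not the gateway's actual API; stacking would instead feed the same probability vector to a fitted meta-learner):

```python
def fuse_average(probs, threshold=0.5):
    """Average fusion: mean of per-model fake probabilities."""
    score = sum(probs) / len(probs)
    return ("fake" if score >= threshold else "real"), score

def fuse_voting(probs, threshold=0.5):
    """Voting fusion: majority vote over thresholded per-model predictions."""
    votes = [p >= threshold for p in probs]
    fake_votes = sum(votes)
    verdict = "fake" if fake_votes > len(votes) / 2 else "real"
    return verdict, fake_votes / len(votes)

# Example: one model is certain (1.0), one leans fake (0.85), one leans real (0.2)
probs = [1.0, 0.85, 0.2]
```

With these inputs, both strategies return a "fake" verdict, but with different confidence scores (mean probability vs. vote fraction), which is why the API reports the method used alongside the score.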
DeepSafe ships with an SDK that handles HTTP serving, health checks, lazy loading, and thread safety. You write only the inference logic.
```bash
# Register model (updates config + compose + creates scaffold)
make add-model NAME=my_detector MEDIA_TYPE=image PORT=5008

# Implement your logic in models/image/my_detector/detector.py

# Build, start, and retrain ensemble
make start
make retrain MEDIA_TYPE=image
```

```python
# models/image/my_detector/detector.py
import torch

from deepsafe_sdk import ImageModel, PredictionResult


class MyDetector(ImageModel):
    def load(self):
        self.model = torch.load(self.weights_path("weights/model.pth"))
        self.model.eval()

    def predict(self, input_data: str, threshold: float) -> PredictionResult:
        image = self.decode_image(input_data)
        prob = self.run_inference(image)
        return self.make_result(probability=prob, threshold=threshold)
```

See docs/adding-a-model.md for the full guide.
After adding or removing models, retrain the stacking meta-learner:
```bash
# Full pipeline: run inference on benchmark dataset + train meta-learner
make retrain MEDIA_TYPE=image

# With Optuna hyperparameter search
make retrain MEDIA_TYPE=image OPTIMIZER=optuna TRIALS=100

# Re-train from existing features (skip inference)
make eval MEDIA_TYPE=image
```

The pipeline health-checks all active models, generates a feature matrix from the benchmark dataset, trains 8 classifiers (LogReg, RF, GBM, SVM, KNN, NB, XGBoost, LightGBM), and deploys the best performer.
A balanced multi-modal benchmark is available on HuggingFace:
| Modality | Real | Fake | Total | Generators |
|---|---|---|---|---|
| Images | 2,000 | 2,000 | 4,000 | DALL-E 2/3, Midjourney v5/6/7, Stable Diffusion 1.x/2/3/XL, Flux, GPT Image, Grok, Imagen 3/4, and 20+ more |
| Audio | 1,000 | 1,000 | 2,000 | HiFiGAN, MelGAN, WaveGlow, Tacotron, ASVspoof attacks, Neural Codec, and 20+ more |
| Video | 100 | 100 | 200 | Sora, Gen-2, Moonvalley, LaVie, ModelScope, LAVDF manipulations, and 10+ more |
```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"media_type": "image", "image_data": "<base64>", "ensemble_method": "voting"}'
```

```bash
curl -X POST http://localhost:8000/detect \
  -F "file=@photo.jpg" \
  -F "ensemble_method=stacking"
```

```json
{
  "verdict": "fake",
  "confidence_in_verdict": 0.92,
  "ensemble_score_is_fake": 0.92,
  "ensemble_method_used": "stacking",
  "model_results": {
    "npr_deepfakedetection": {"probability": 1.0, "class": "fake"},
    "universalfakedetect": {"probability": 0.85, "class": "fake"}
  }
}
```
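The `image_data` field carries the raw file bytes as base64. A minimal Python sketch of building that JSON body (field names taken from the curl example above; the helper name is ours, not part of DeepSafe):

```python
import base64
import json

def build_predict_payload(image_bytes: bytes, ensemble_method: str = "voting") -> str:
    """Encode raw image bytes into the JSON body expected by POST /predict."""
    return json.dumps({
        "media_type": "image",
        "image_data": base64.b64encode(image_bytes).decode("ascii"),
        "ensemble_method": ensemble_method,
    })

# Build a payload from file bytes (a real client would read them from disk)
payload = build_predict_payload(b"\x89PNG...", ensemble_method="stacking")
```

The resulting string can be sent with any HTTP client as the request body, with `Content-Type: application/json`.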
```bash
# Register
curl -X POST http://localhost:8000/register -d "username=user&password=pass"

# Login (returns JWT)
curl -X POST http://localhost:8000/token -d "username=user&password=pass"

# Use token for protected endpoints (/history, /users/me)
curl -H "Authorization: Bearer <token>" http://localhost:8000/history
```

```bash
make help      # Show all commands
make start     # Start all services
make stop      # Stop all services
make health    # Check model service health
make test      # Run system tests
make lint      # Run linters (black + flake8)
make add-model # Register a new model
make retrain   # Retrain ensemble meta-learner
make eval      # Re-train from existing features
make clean     # Remove containers and caches
```
```text
DeepSafe/
├── api/                           # FastAPI gateway
│   ├── main.py                    # API routes, ensemble logic, auth
│   ├── database.py                # SQLAlchemy models (analysis history)
│   └── meta_model_artifacts/      # Deployed meta-learner weights
├── sdk/                           # Model SDK (shared base classes)
│   └── deepsafe_sdk/              # ImageModel, VideoModel, AudioModel
├── models/
│   ├── image/
│   │   ├── npr_deepfakedetection/
│   │   └── universalfakedetect/
│   └── video/
│       └── cross_efficient_vit/
├── frontend/                      # React + Tailwind dashboard
├── config/
│   └── deepsafe_config.json       # Model registry
├── scripts/
│   ├── add_model.py               # Model integration CLI
│   ├── retrain_pipeline.py        # Ensemble retraining pipeline
│   └── health_check.py            # Service health checker
├── meta_feature_generator.py      # Meta-feature dataset builder
├── train_meta_learner_advanced.py # Meta-learner training suite
├── docker-compose.yml
└── Makefile
```
We welcome contributions; see CONTRIBUTING.md and the Code of Conduct.
The easiest way to contribute is to add a new detection model: see docs/adding-a-model.md.
MIT License. See LICENSE.
DeepSafe integrates the following open-source research:
- NPR Deepfake: Chuangchuang Tan et al. (GitHub)
- UniversalFakeDetect: Utkarsh Ojha, Yuheng Li, Yong Jae Lee (GitHub)
- Cross-Efficient ViT: Davide Coccomini et al. (GitHub)
- CLIP: Alec Radford et al., OpenAI (GitHub)
```bibtex
@misc{deepsafe,
  author    = {Siddharth Kumar},
  title     = {DeepSafe: Enterprise-Grade Deepfake Detection Platform},
  year      = {2025},
  publisher = {GitHub},
  url       = {https://github.com/siddharthksah/DeepSafe}
}
```


