🎙️ Audio Call Analyser — Multilingual Voice Deepfake Detector

Real-time AI-powered voice authenticity verification using a fusion deep learning architecture

📌 Problem Statement

AI-generated voice clones and deepfake audio are making phone scams increasingly sophisticated, and traditional fraud detection systems often cannot distinguish genuine human voices from synthetic speech. Audio Call Analyser addresses this with a production-ready API that classifies audio as HUMAN or AI_GENERATED and returns a confidence score for each prediction, with support for five languages: English, Hindi, Tamil, Telugu, and Malayalam.

🧠 Model Architecture

The system uses a Multi-Path Fusion Architecture that combines three complementary feature extraction pipelines for robust detection:

graph LR
    A[Raw Audio Input] --> B[Preprocessing<br/>Resampling + Normalization]
    B --> C1[Mel-Spectrogram<br/>128 Mels]
    B --> C2[Wav2Vec 2.0<br/>SSL Embeddings]
    B --> C3[Acoustic Features<br/>ZCR + RMS + Flatness]
    C1 --> D[1D-CNN Encoder]
    C2 --> E[768-dim Vector]
    C3 --> F[Feature Vector]
    D --> G[Concatenation Layer]
    E --> G
    F --> G
    G --> H[Dense + Dropout]
    H --> I{HUMAN or<br/>AI_GENERATED}
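
The last two stages of the diagram (concatenation, then a dense classification head) can be illustrated without any ML framework. The sketch below is a toy stand-in, not the repository's model: the function names, the single dense unit, the 0.5 threshold, and the definition of confidence as the probability of the predicted label are all assumptions made for illustration. Dropout is omitted because it is inactive at inference time.

```python
import math


def fuse(spectral_vec, ssl_vec, acoustic_vec):
    """Concatenate the three per-path vectors into one fused feature vector."""
    return list(spectral_vec) + list(ssl_vec) + list(acoustic_vec)


def classify(fused, weights, bias=0.0, threshold=0.5):
    """Toy head: one dense unit followed by a sigmoid (hypothetical, for illustration)."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    prob_ai = 1.0 / (1.0 + math.exp(-logit))
    label = "AI_GENERATED" if prob_ai >= threshold else "HUMAN"
    # Assumption: confidence is the probability assigned to the predicted label.
    confidence = prob_ai if label == "AI_GENERATED" else 1.0 - prob_ai
    return label, round(confidence, 4)


# Example shapes: a hypothetical 4-dim CNN output, the 768-dim Wav2Vec 2.0
# vector, and the three acoustic features (ZCR, RMS, flatness).
fused = fuse([0.0] * 4, [0.0] * 768, [0.0] * 3)
label, confidence = classify(fused, weights=[0.0] * len(fused), bias=-2.0)
```

Concatenation keeps each path's representation intact and leaves it to the dense layer to learn how to weight them, which is the simplest form of feature-level fusion.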

| Pipeline | Technique | Output |
|---|---|---|
| Time-Frequency | Mel-spectrogram (128 mels) | Frequency-domain patterns |
| Self-Supervised | `facebook/wav2vec2-base` embeddings | 768-dim contextual representations |
| Acoustic | ZCR, RMS Energy, Spectral Flatness | Low-level signal characteristics |
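
To make the acoustic path concrete, here is a minimal pure-Python sketch of the three features it names, computed on a single frame of samples. It uses the standard textbook definitions and a naive DFT so that it needs no dependencies; the repository itself may compute these differently (for example with an audio library), and the function names here are hypothetical.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (needs >= 2 samples)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean / arithmetic mean of the power spectrum.

    Close to 0 for tonal signals, closer to 1 for noise-like signals.
    Uses a naive O(n^2) DFT, so keep frames short.
    """
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2 + eps)  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def acoustic_features(frame):
    """Return [ZCR, RMS, flatness] as a small feature vector."""
    return [zero_crossing_rate(frame), rms_energy(frame), spectral_flatness(frame)]


# A pure tone: 4 cycles of a sine wave across a 64-sample frame.
sine = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
features = acoustic_features(sine)
```

A pure tone concentrates its energy in one frequency bin, so its flatness is close to zero, while a noise-like frame spreads energy across bins and scores higher.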

✨ Key Features

🌐 Multilingual Support — English, Hindi, Tamil, Telugu, Malayalam
🧬 Fusion Model — Combines spectral, SSL, and acoustic pipelines for higher accuracy
⚡ CPU-Optimized — No GPU required; runs on free-tier cloud instances
🐳 Docker Ready — One-command deployment with containerization
🔌 REST API — Simple JSON-based request/response for easy integration
🔒 API Key Auth — Built-in x-api-key header authentication

🚀 Quick Start

Local Setup

# 1. Clone the repository
git clone https://github.com/coolss21/audio_call_analyser.git
cd audio_call_analyser

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Start the API server
python app.py
# Server starts at http://localhost:5000

🐳 Docker Setup

docker build -t audio-call-analyser .
docker run -p 5000:5000 audio-call-analyser

🔌 API Reference

`POST /detect` or `POST /api/voice-detection`

Headers:

Content-Type: application/json
x-api-key: secret123

Request:

{
  "language": "English",
  "audioFormat": "mp3",
  "audioBase64": "<base64_encoded_audio>"
}

Response:

{
  "status": "success",
  "classification": "HUMAN",
  "confidenceScore": 0.8524
}
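
A minimal client for the request and response shown above, using only the Python standard library. The endpoint path, headers, and JSON field names come from this README; the helper names, the default base URL (taken from the Quick Start), and the 30-second timeout are assumptions for illustration.

```python
import base64
import json
import urllib.request


def build_detect_request(audio_bytes, language="English", audio_format="mp3",
                         base_url="http://localhost:5000", api_key="secret123"):
    """Build a POST /detect request carrying base64-encoded audio."""
    payload = {
        "language": language,
        "audioFormat": audio_format,
        "audioBase64": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url=f"{base_url}/detect",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def detect(audio_bytes, **kwargs):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_detect_request(audio_bytes, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With the server running locally, a call might look like `detect(open("sample.mp3", "rb").read())`, where `sample.mp3` is a placeholder for your own file; the returned dict should contain the `status`, `classification`, and `confidenceScore` fields shown in the response above.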

`GET /health`

Returns API health status and model readiness.

📂 Project Structure

audio_call_analyser/
├── app.py                          # Flask REST API server
├── deepfake_detector.py            # Core fusion model & inference logic
├── deepfake_model_multilingual.pt  # Pre-trained model weights
├── src/
│   ├── config.py                   # Model & API configuration
│   ├── detector.py                 # Detection pipeline orchestrator
│   ├── models/                     # Model architecture definitions
│   └── processors/                 # Audio preprocessing utilities
├── requirements.txt                # Python dependencies
├── Dockerfile                      # Container configuration
├── .github/workflows/ci.yml       # Automated CI pipeline
└── LICENSE                         # MIT License

🧪 Testing

# Run syntax & import validation
python -m py_compile app.py
python -m py_compile deepfake_detector.py

# Run the test client (if available)
python test_api.py

🛡️ Security Considerations

API key authentication via x-api-key header
Audio data processed in-memory (no disk persistence)
Base64 input validation to prevent injection attacks
Rate limiting recommended for production deployments
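
As a rough illustration of the input validation and key check listed above, the sketch below decodes the `audioBase64` field strictly and compares API keys in constant time. It is not the repository's code: the function names and the 10 MB size cap are hypothetical choices.

```python
import base64
import binascii
import hmac

MAX_AUDIO_BYTES = 10 * 1024 * 1024  # hypothetical cap, not taken from the project


def decode_audio(audio_b64, max_bytes=MAX_AUDIO_BYTES):
    """Strictly decode base64 audio, rejecting malformed or oversized input."""
    if not isinstance(audio_b64, str) or not audio_b64:
        raise ValueError("audioBase64 must be a non-empty string")
    # Base64 expands data by about 4/3, so check the size before decoding.
    if len(audio_b64) > (max_bytes * 4) // 3 + 4:
        raise ValueError("audio payload too large")
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(audio_b64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("audioBase64 is not valid base64") from exc


def api_key_matches(provided, expected):
    """Compare keys in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

Checking the encoded length first keeps an oversized request from being decoded into memory at all, which matters when audio is processed in-memory.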

📄 License

This project is licensed under the MIT License.

Built for the AI Voice Detection Hackathon — Defending authenticity in the age of synthetic media.
