PolyVox - Redefining Voice in Media

PolyVox is an innovative voice cloning and multilingual speech synthesis system developed by Code Red. It aims to revolutionize how dubbing is done by preserving the original actor’s voice and emotional delivery, even when translated into different languages.

Track - Entertainment

📌 Problem Statement

In traditional dubbing, actors' voices are replaced by different voice artists for each language. This leads to a loss of vocal identity, emotional disconnect, and a reduction in viewer immersion. As content becomes more global, these limitations hinder the reach and impact of films, shows, and digital media.

Solution

PolyVox solves this by using AI-powered voice cloning and cross-lingual TTS, allowing actors to speak naturally in multiple languages while preserving their unique vocal characteristics and performance nuances.

It produces emotionally aligned, natural-sounding speech in the target language using only a short sample of the actor’s original voice.

🔄 Workflow

🎧 Audio Extraction: Extract clean speech audio from the source video using FFmpeg.

🧠 Speech-to-Text (STT): Transcribe the original dialogue using OpenAI's Whisper.

🌍 Translation: Translate the transcribed text into the target language using Google Translate.

🗣️ Voice Cloning & TTS: Use models like Xtts-v2, Tortoise-v2, or ChatterBox to synthesize the translated speech in the actor’s original voice.
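The four steps above can be sketched as a small pipeline. This is a minimal illustration, not the repository's actual code: `build_extract_command`, `whisper_transcribe`, and `DubbingPipeline` are hypothetical names, the 16 kHz mono WAV settings and the Whisper `"base"` model size are assumptions, and the sketch reuses the extracted audio as the voice reference. The FFmpeg command only drops the video stream; it does not denoise or separate speech from background sound.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


def build_extract_command(video_path: str, wav_path: str, sample_rate: int = 16000) -> List[str]:
    """Return an FFmpeg command that drops the video stream and writes a mono PCM WAV."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it already exists
        "-i", video_path,          # input video
        "-vn",                     # no video stream in the output
        "-acodec", "pcm_s16le",    # 16-bit PCM audio
        "-ar", str(sample_rate),   # resample (16 kHz is an assumption)
        "-ac", "1",                # downmix to mono
        wav_path,
    ]


def whisper_transcribe(wav_path: str, model_size: str = "base") -> str:
    """Transcribe audio with the openai-whisper package (assumed to be installed)."""
    import whisper  # imported lazily so the rest of the sketch runs without it

    model = whisper.load_model(model_size)
    return model.transcribe(wav_path)["text"].strip()


@dataclass
class DubbingPipeline:
    """Chains the four workflow steps; each stage is a plain callable."""
    extract_audio: Callable[[str, str], None]          # (video_path, wav_path)
    transcribe: Callable[[str], str]                   # wav_path -> source text
    translate: Callable[[str, str], str]               # (text, target_lang) -> text
    synthesize: Callable[[str, str, str, str], None]   # (text, reference_wav, lang, out_path)

    def run(self, video_path: str, target_lang: str, out_path: str) -> str:
        wav_path = str(Path(video_path).with_suffix(".wav"))
        self.extract_audio(video_path, wav_path)
        source_text = self.transcribe(wav_path)
        translated = self.translate(source_text, target_lang)
        # In this sketch the extracted audio doubles as the voice reference.
        self.synthesize(translated, wav_path, target_lang, out_path)
        return translated
```

A real `extract_audio` stage could call `subprocess.run(build_extract_command(video, wav), check=True)`. Keeping every stage as an injected callable makes it easy to swap one TTS model for another, or to test the flow with stubs.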

Tech Stack

| Component | Technology Used |
| --- | --- |
| Audio Processing | FFmpeg |
| Speech-to-Text | OpenAI Whisper |
| Translation | Google Translate API |
| Voice Cloning & TTS | Xtts-v2, Tortoise-v2, ChatterBox |
| Backend API | Python, FastAPI |
| Frontend | React.js |

Implementation

Translation: Translates your text into the chosen target language
Voice Cloning: Clones the voice from your reference audio
Speech Generation: Creates speech in the cloned voice with translated text
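The cloning and speech generation steps could look roughly like the sketch below. It assumes the Coqui `TTS` package with its XTTS-v2 model, which is only one of the models listed in this README. `clone_and_speak` is a hypothetical helper name, and the repository's own scripts may be organized differently.

```python
from pathlib import Path


def clone_and_speak(text: str, reference_wav: str, language: str, out_path: str) -> str:
    """Synthesize `text` in the voice of `reference_wav` (assumes Coqui TTS with XTTS-v2)."""
    if not text.strip():
        raise ValueError("text to synthesize is empty")
    if not Path(reference_wav).is_file():
        raise FileNotFoundError(f"reference audio not found: {reference_wav}")

    # Imported lazily: the package is large and the model is downloaded on first use.
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,                  # already translated text
        speaker_wav=reference_wav,  # short sample of the voice to clone
        language=language,          # target language code, e.g. "es"
        file_path=out_path,
    )
    return out_path
```

For example, `clone_and_speak("Hola, ¿cómo estás?", "reference_audio.wav", "es", "outputs/hola.wav")` would write the Spanish line in the reference speaker's voice, provided the model supports that language code.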

Check out PolyVox here!

Installation

Step 1: Clone the Repository

git clone https://github.com/yourusername/voice-cloning.git
cd voice-cloning

Step 2: Install FFmpeg

Download FFmpeg
Extract and add to system PATH
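To confirm that FFmpeg is actually reachable from PATH before running the pipeline, a quick check along these lines may help. `find_ffmpeg` and `ffmpeg_version` are hypothetical helper names, not part of the repository.

```python
import shutil
import subprocess
from typing import Callable, Optional


def find_ffmpeg(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the resolved FFmpeg path, or raise if it is not on PATH."""
    path = which("ffmpeg")
    if path is None:
        raise RuntimeError("ffmpeg not found on PATH; install it and reopen the terminal")
    return path


def ffmpeg_version(path: str) -> str:
    """Return the first line printed by `ffmpeg -version`."""
    result = subprocess.run([path, "-version"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()[0]
```

Calling `print(ffmpeg_version(find_ffmpeg()))` prints the installed version banner, or raises a clear error if the PATH change has not taken effect yet.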

Step 3: Create Virtual Environment

python -m venv .venv
.venv\Scripts\activate

Step 4: Install Dependencies

pip install -r requirements.txt

Step 5: Run

run_fastapi.bat

🔭 Future Scope

PolyVox can evolve into a fully automated multilingual dubbing solution by integrating advanced lip-syncing technologies, enabling synchronized visuals alongside voice cloning. Future improvements may include emotional tone and prosody control for more expressive and natural-sounding speech, support for low-resource languages to increase inclusivity, and real-time or on-device deployment for interactive applications like gaming and AR/VR. Additionally, offering PolyVox as a cloud-based API or SaaS platform can streamline adoption across film, OTT, and media production pipelines.

Code Red

Lana Anvar
Sutharya
Ganga
Lakshmikha Rejith
