BrunoSantos751/TranslatedSubGen


TranslatedSubGen

TranslatedSubGen is a modular tool that:

  • Extracts audio from a video
  • Automatically transcribes using Whisper
  • Translates to another language using the DeepL API
  • Generates a synchronized .srt subtitle file

⚙️ Requirements

  • Python 3.8+
  • FFmpeg installed and in PATH (required by moviepy and whisper)

📦 Installation

  1. Clone the repository or download the files:
git clone https://github.com/BrunoSantos751/TranslatedSubGen.git
cd TranslatedSubGen
  2. Install the necessary Python libraries using pip:
pip install -r requirements.txt
  3. Copy the example environment file and configure your keys:
cp .env.example .env

Edit your .env file with your configuration:

DEEPL_AUTH_KEY=your_deepl_api_key_here
SOURCE_LANG=KO
TARGET_LANG=PT-BR

🛠️ .env Configuration

Variable | Purpose
DEEPL_AUTH_KEY | Your DeepL API key
SOURCE_LANG | Language code of the video's audio (e.g., JA, EN, KO)
TARGET_LANG | Desired subtitle language (e.g., PT-BR, EN, FR)
WHISPER_MODEL | Whisper model size: tiny, base, small, medium, or large
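As an illustration of how these variables could be read and validated, here is a minimal sketch. It assumes the values from .env are already present in the process environment (for example, loaded by a package such as python-dotenv). The `Settings` class, the `load_settings` function, and the "base" default are hypothetical and may differ from what config.py actually does.

```python
import os
from dataclasses import dataclass

VALID_WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


@dataclass(frozen=True)
class Settings:
    deepl_auth_key: str
    source_lang: str
    target_lang: str
    whisper_model: str


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env

    def required(name):
        value = env.get(name, "").strip()
        if not value:
            raise ValueError(f"{name} is not set")
        return value

    # Assumption: fall back to "base" when WHISPER_MODEL is missing.
    model = env.get("WHISPER_MODEL", "base").strip().lower()
    if model not in VALID_WHISPER_MODELS:
        raise ValueError(f"WHISPER_MODEL must be one of {VALID_WHISPER_MODELS}")

    return Settings(
        deepl_auth_key=required("DEEPL_AUTH_KEY"),
        source_lang=required("SOURCE_LANG").upper(),
        target_lang=required("TARGET_LANG").upper(),
        whisper_model=model,
    )
```

Failing early on a missing key or an unknown model name avoids discovering the problem only after a long transcription has finished.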

▶️ How to use

Name your video file target_video.mp4, place it in the same folder as the scripts, and run:

python main.py

This will:

  1. Extract audio from the video
  2. Generate base transcription using Whisper
  3. Format, merge and adjust the duration of subtitle blocks
  4. Translate the transcript using DeepL API
  5. Create a synchronized .srt subtitle file
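Step 4 can be sketched as a function that translates every segment in one batch while leaving the timings untouched. This is an illustration rather than the code in translation.py. It assumes segments are dictionaries with "start", "end", and "text" keys (the shape Whisper returns), and that `translator` behaves like `deepl.Translator` from the official deepl package, whose `translate_text` accepts a list of strings and returns results with a `.text` attribute. The name `translate_segments` is hypothetical.

```python
def translate_segments(translator, segments, source_lang, target_lang):
    """Translate segment texts in one request, preserving start/end times."""
    texts = [seg["text"] for seg in segments]
    if not texts:
        return []  # nothing to send to the API
    results = translator.translate_text(
        texts, source_lang=source_lang, target_lang=target_lang
    )
    # Results come back in the same order as the input texts.
    return [{**seg, "text": result.text} for seg, result in zip(segments, results)]
```

With the official package installed, the translator would be created as `deepl.Translator(auth_key)` and passed in. Sending all texts in a single call keeps the number of API requests low.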

🖨️ Output Example

The result is a properly formatted target_video_subtittle.srt file containing synchronized translations with word wrap adjustments for optimal readability:

1
00:00:00,000 --> 00:00:14,560
Olá a todos! Bem-vindos a esta prática auditiva em inglês. Este
vídeo é para iniciantes em inglês

2
00:00:14,560 --> 00:00:26,120
Alunos. Você pode ouvir e aprender inglês comigo. Você está
pronto? Vamos começar!

3
00:00:26,120 --> 00:00:35,480
Vamos falar sobre a vida cotidiana. O que você faz todos os dias?
Vou lhe contar sobre meu dia, de manhã,
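
Each block in the example above consists of an index, a time range written as HH:MM:SS,mmm, and the subtitle text. The sketch below shows one way to produce that layout from segments with "start" and "end" times in seconds. The names `format_timestamp` and `to_srt` are hypothetical and not necessarily those used in subtitle.py.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Render segments as SRT text, with blocks separated by a blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

For example, `format_timestamp(14.56)` returns `"00:00:14,560"`, matching the end time of the first block above.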

📁 File Structure

File | Purpose
main.py | Main orchestrator script
transcription.py | Audio extraction and Whisper integration
translation.py | DeepL API translation logic
subtitle.py | Subtitle processing and SRT generation
config.py | Environment and global configuration
requirements.txt | List of Python dependencies
.env.example | Example configuration variables
.env | Your local configuration, including the API key
target_video.mp4 | Input video (rename your file to this name)
target_video_subtittle.srt | Generated translated subtitle file

🧠 Main Features

  • ✅ Uses Whisper (OpenAI) for automatic speech transcription
  • ✅ Uses DeepL API for high-quality translations
  • ✅ Subtitle optimization:
    • Merges short adjacent segments
    • Sets minimum and maximum subtitle durations
    • Limits line and subtitle character length
  • ✅ Uses .env to keep API keys and languages configurable and secure
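
The subtitle optimizations listed above can be illustrated with a short sketch. It merges a segment into the previous one when the previous one is shorter than a minimum duration and the combined block stays within a maximum, then wraps text to a line length. The function names and the thresholds (2 seconds, 8 seconds, 65 characters) are assumptions chosen for illustration and are not taken from subtitle.py.

```python
import textwrap


def merge_short_segments(segments, min_duration=2.0, max_duration=8.0):
    """Merge a segment into its predecessor when the predecessor is too short."""
    merged = []
    for seg in segments:
        if merged:
            prev = merged[-1]
            prev_is_short = prev["end"] - prev["start"] < min_duration
            fits = seg["end"] - prev["start"] <= max_duration
            if prev_is_short and fits:
                merged[-1] = {
                    "start": prev["start"],
                    "end": seg["end"],
                    "text": f"{prev['text'].strip()} {seg['text'].strip()}",
                }
                continue
        merged.append(dict(seg))  # copy so the input list is not mutated
    return merged


def wrap_text(text, max_line_chars=65):
    """Break subtitle text into lines no longer than max_line_chars."""
    return "\n".join(textwrap.wrap(text, width=max_line_chars))
```

Merging before translation gives the translator longer, more complete sentences to work with, which is one reason to run this step ahead of the DeepL call.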

📝 Notes

  • Whisper transcription time depends on model size and video length. Use smaller models for faster processing if you do not have a dedicated GPU.
  • If you use the DeepL API Free tier, requests go to api-free.deepl.com instead of api.deepl.com. The Python library selects the correct endpoint automatically from your key (Free keys end in :fx), so you only need to provide your API key.

💡 License

This project is open for personal use. Feel free to adapt it as needed.

