Ultra-fast text-to-speech (CLI + optional desktop GUI): smart chunking, parallel generation, clipboard input, optional streaming playback, and a --check-deps sanity check for ffmpeg and players. Auto-optimized by default. Languages: Georgian (🇬🇪), Russian (🇷🇺), English (🇬🇧).
✨ Simplified UX: Auto-optimization is now enabled by default. Just specify --lang and go!
- 🚀 Ultra-Fast Generation: 6-15 seconds for 1000 words (vs 25+ seconds traditional)
- 🔊 Streaming Playback: Audio starts playing while still generating (NEW!)
- 🧠 Smart Chunking: Automatic text splitting for optimal performance
- ⚡ Parallel Processing: Multi-threaded generation with up to 8 workers
- 📋 Clipboard Integration: Direct clipboard-to-speech workflow
- 🎯 Auto-Optimization: Turbo mode automatically optimizes all settings
- 🎵 High-Quality Voices: Premium neural voices for all languages
- 📁 File Support: Process text files directly
- 🔄 Real-time Playback: Automatic audio playback with system player
- Dependency check: python -m TTS_ka --check-deps reports ffmpeg, streaming players (VLC/mpv/ffplay), and Python packages; exits with code 1 if critical pieces are missing.
- Optional GUI: TTS_ka-gui (tkinter) with a Speak tab (paste or UTF-8 file path), a Config tab (JSON path, defaults, Save/Reload), and, on Windows, a Windows shell tab (install/uninstall the Explorer context menu via extras/windows/context_menu/Install-TTS_ka-ContextMenu.ps1 when that file is available next to the repo).
- Native global hotkeys (Windows, no AutoHotkey): pip install "TTS_ka[hotkeys]", then run TTS_ka-hotkeys or enable hotkeys on the GUI Windows shell tab. Defaults: Ctrl+Alt+1–4 map to en / ru / ka / ka-m; override in your JSON config under "hotkeys" (pynput combo string → language code; JSON null removes a default). See extras/tts_config.example.json. Each press runs python -m TTS_ka clipboard --lang … in a new process (pynput is an optional extra).
- Speakable text cleanup: Before TTS, the pipeline rewrites noisy input so the voice does not read raw syntax: fenced and inline code, URLs, shebang lines, HTML-like tags, file extensions (for example .ts → “TypeScript”), common IT acronyms (HTTPS, JSON, API, …), math symbols (for example ⇒ → “implies”), and very long digit runs. Implemented in TTS_ka.not_reading (replace_not_readable).
- Ctrl+C: Cancels generation and stops active streaming playback (including VLC) without waiting for the full join timeout.
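The hotkey override rule (a combo maps to a language code, and JSON null removes a default) could be sketched as below. This is an illustration only, not the package's code: `DEFAULT_HOTKEYS`, `merge_hotkeys`, and the exact combo spelling are assumptions; check extras/tts_config.example.json for the real format.

```python
from typing import Dict, Optional

# Assumed spelling of the default combos in pynput style; the real keys may differ.
DEFAULT_HOTKEYS: Dict[str, str] = {
    "<ctrl>+<alt>+1": "en",
    "<ctrl>+<alt>+2": "ru",
    "<ctrl>+<alt>+3": "ka",
    "<ctrl>+<alt>+4": "ka-m",
}


def merge_hotkeys(overrides: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Apply user overrides on top of the defaults.

    A value of None (JSON null) removes the combo; any other value
    adds or replaces the mapping.
    """
    merged = dict(DEFAULT_HOTKEYS)
    for combo, lang in overrides.items():
        if lang is None:
            merged.pop(combo, None)  # null removes a default binding
        else:
            merged[combo] = lang
    return merged
```

For example, `merge_hotkeys({"<ctrl>+<alt>+1": None})` would leave only the ru, ka, and ka-m bindings.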
# Install from PyPI (recommended)
pip install TTS_ka
# Or install from source
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .

Verify ffmpeg is on your PATH (required for merging chunks and reliable MP3 handling). Then:
python -m TTS_ka --check-deps

You should see [OK] for edge-tts, pydub, and ffmpeg. A streaming player (VLC, mpv, …) is optional unless you use --stream.
Optional desktop window (paste → Speak):
TTS_ka-gui
# or: python -m TTS_ka.gui

On Debian/Ubuntu, install Tk if needed: sudo apt install python3-tk.
The GUI picks a system font that supports Georgian and Cyrillic (prioritising Segoe UI / Sylfaen on Windows and Noto Sans / Noto Sans Georgian on Linux). Symbol-only fonts such as Noto Sans Symbols 2 are avoided: they often lack Mkhedruli letters, which would show as ? in the text box.
# Ultra-fast generation with auto-optimization (default behavior)
python -m TTS_ka "Hello, how are you today?" --lang en
# Georgian text with automatic optimization
python -m TTS_ka "გამარჯობა, როგორ ხართ?" --lang ka
# Russian text with smart chunking
python -m TTS_ka "Привет, как дела?" --lang ru

# Copy any text, then run (fastest workflow):
python -m TTS_ka clipboard --lang en
# For different languages:
python -m TTS_ka clipboard --lang ka # Georgian
python -m TTS_ka clipboard --lang ru # Russian

# Process text files directly (auto-optimized)
python -m TTS_ka document.txt --lang en
# Long files with custom settings
python -m TTS_ka large_file.txt --chunk-seconds 30 --parallel 6 --lang ru

$ pip install TTS_ka
$ python -m TTS_ka --check-deps
TTS_ka dependency check
========================================
[OK] edge-tts import ok (…)
[OK] pydub import ok (…)
[OK] ffmpeg ffmpeg version …
[opt] soundfile optional … # faster merges if installed
[OK] streaming player first match: vlc # [opt] if none — only needed for --stream
$ python -m TTS_ka "Hello from TTS_ka" --lang en
OPTIMIZED MODE - English
…
⚡ Completed in …s (direct)
$ python -m TTS_ka clipboard --lang ka # after copying Georgian text to the clipboard
…
$ TTS_ka-gui # optional: paste text in the window and click Speak
(Timings and exact log lines depend on your machine and network.)
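Because --check-deps exits with code 1 when critical pieces are missing, a script can gate on it before generating audio. A minimal sketch, assuming TTS_ka is installed for the interpreter you pass in; the helper names `check_deps_command` and `deps_ok` are hypothetical:

```python
import subprocess
import sys
from typing import List


def check_deps_command(python: str = sys.executable) -> List[str]:
    """Build the documented dependency-check command line."""
    return [python, "-m", "TTS_ka", "--check-deps"]


def deps_ok(python: str = sys.executable) -> bool:
    """Return True when the dependency check exits with status 0."""
    result = subprocess.run(
        check_deps_command(python), capture_output=True, text=True
    )
    return result.returncode == 0
```

A caller could then skip synthesis and print setup instructions when `deps_ok()` returns False.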
python -m TTS_ka [TEXT_SOURCE] [OPTIONS]
- Direct text: "Your text here"
- Clipboard: clipboard (copy text first)
- File path: file.txt, document.md, etc.
| Option | Description | Examples |
|---|---|---|
| --lang | ka Georgian (female), ka-m Georgian (male), ru, en | --lang ka |
| -o, --output | Output MP3 path (default data.mp3) | -o speech.mp3 |
| --stream | 🆕 Enable streaming playback (audio starts while generating) | --stream |
| --chunk-seconds | Chunk size in seconds (0=auto, 20-60 optimal) | --chunk-seconds 30 |
| --parallel | Workers (0=auto, 2-8 recommended) | --parallel 6 |
| --no-play | Skip automatic audio playback | --no-play |
| --no-gui | With --stream: headless VLC (dummy UI). Default is one GUI window on Windows. | --stream --no-gui |
| --no-turbo | Disable auto-optimization (legacy mode) | --no-turbo |
| --help-full | Show comprehensive help with examples | --help-full |
| -V, --version | Print version, Python, platform, and PyPI package metadata | --version |
| --check-deps | Print dependency status (ffmpeg, players, Python stack); exit code 1 if critical deps missing | --check-deps |
| Kind of input | What you hear instead |
|---|---|
| ```code``` / `inline` | Short phrases like “omitted fenced code block” / “omitted inline code snippet” |
| https://… / www.… | “omitted hyperlink” |
| #!/usr/bin/env python | “omitted script shebang line” |
| <div>…</div>-style tags | “omitted markup tag” |
| file.ts, app.py | Spoken language or format name (TypeScript, Python, …) |
| API, HTTPS, JSON, … | Letter-by-letter or expanded forms (A P I, H T T P S, …) |
| =>, ≤, ∞, … | Words (“implies”, “less than or equal to”, “infinity”, …) |
| 7+ digit numbers | “a large number” |
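A few of the substitutions in the table can be approximated with regular expressions. The sketch below is not the package's replace_not_readable implementation; `make_speakable` and its patterns are hypothetical and cover only code spans, hyperlinks, and long digit runs.

```python
import re

TICKS = "`" * 3  # three backticks, built indirectly to keep this example fence-safe

FENCED_CODE = re.compile(TICKS + r".*?" + TICKS, re.DOTALL)
INLINE_CODE = re.compile(r"`[^`\n]+`")
HYPERLINK = re.compile(r"(?:https?://|www\.)\S+")
LONG_NUMBER = re.compile(r"\d{7,}")


def make_speakable(text: str) -> str:
    """Replace hard-to-read spans with short spoken placeholders."""
    # Fenced blocks go first so their contents are not matched as inline code.
    text = FENCED_CODE.sub("omitted fenced code block", text)
    text = INLINE_CODE.sub("omitted inline code snippet", text)
    # Links are replaced before digit runs so digits inside URLs are not touched twice.
    text = HYPERLINK.sub("omitted hyperlink", text)
    text = LONG_NUMBER.sub("a large number", text)
    return text
```

The order of the substitutions matters: running the inline pattern before the fenced one would split a fenced block into fragments.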
Chunk playback order matches document order even when chunks finish generating in parallel.
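One common way to get this ordering guarantee in Python is `ThreadPoolExecutor.map`, which yields results in input order regardless of which worker finishes first. This is a sketch of the idea, not the package's scheduler; `fake_synthesize` stands in for a real TTS call.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_synthesize(chunk: str) -> str:
    """Stand-in for a TTS request; sleeps a random time to shuffle completion order."""
    time.sleep(random.uniform(0.0, 0.02))
    return f"audio({chunk})"


def synthesize_in_order(chunks: List[str], workers: int = 4) -> List[str]:
    """Generate chunks in parallel but return results in document order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the order of `chunks`, even if later chunks finish first.
        return list(pool.map(fake_synthesize, chunks))
```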
- Traditional TTS: 25-40 seconds
- TTS_ka Direct: 15-25 seconds
- TTS_ka Turbo: 8-15 seconds
- TTS_ka Chunked: 6-12 seconds ⚡
- TTS_ka Streaming: 🔊 2-3 seconds to first audio (NEW!)
The new streaming feature starts playing audio within 2-3 seconds while the rest continues generating in the background. This provides an 85-90% reduction in perceived wait time!
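The 85-90% figure is consistent with simple arithmetic if the non-streaming wait is about 20 seconds, which is an assumed value inside the 10-30+ second range quoted in this README. A small check, with `perceived_wait_reduction` as a hypothetical helper:

```python
def perceived_wait_reduction(first_audio_s: float, full_wait_s: float) -> float:
    """Percentage of the full wait that streaming removes before audio starts."""
    return 100.0 * (1.0 - first_audio_s / full_wait_s)


# With an assumed 20 s full wait, first audio at 2-3 s gives roughly 90% and 85%.
print(round(perceived_wait_reduction(2.0, 20.0), 1))
print(round(perceived_wait_reduction(3.0, 20.0), 1))
```

For shorter texts the reduction is smaller: with a 10 second full wait and first audio at 2.5 seconds it is 75%.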
Quick Usage:
# Basic streaming - audio starts almost instantly!
python -m TTS_ka "Your long text..." --lang en --stream
# From file with streaming
python -m TTS_ka article.txt --lang ka --stream
# Clipboard with streaming (fastest workflow)
python -m TTS_ka clipboard --stream

How It Works:
- Text is split into chunks (if needed)
- Chunks generate in parallel (2-8 workers)
- First chunk plays quickly (~2-3 seconds); with VLC (default on Windows), one window builds a playlist in text order as chunks finish (--no-gui uses a headless session). Set TTS_KA_VLC_RC=0 to fall back to launching VLC once per chunk instead of one remote-control session.
- Remaining chunks continue generating in background
- Final merged audio file is saved
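The first step, splitting text into chunks of roughly --chunk-seconds each, can be illustrated as follows. This is a simplified sketch, not the package's chunker: it assumes a speaking rate of 150 words per minute and splits on word boundaries only, whereas a real splitter would likely prefer sentence boundaries.

```python
from typing import List


def split_into_chunks(
    text: str, chunk_seconds: int = 30, words_per_minute: int = 150
) -> List[str]:
    """Split text into chunks that each take about chunk_seconds to speak."""
    words = text.split()
    # Assumed rate: words_per_minute / 60 words per second of speech.
    words_per_chunk = max(1, chunk_seconds * words_per_minute // 60)
    return [
        " ".join(words[start:start + words_per_chunk])
        for start in range(0, len(words), words_per_chunk)
    ]
```

Under these assumptions a 200-word text with 30-second chunks yields three chunks of 75, 75, and 50 words.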
Performance:
- Without streaming: Wait 10-30+ seconds for all audio
- With streaming: Hear audio in 2-3 seconds ⚡
- Platform support: Windows, Linux, macOS
Advanced Streaming:
# Custom chunking for optimal streaming
python -m TTS_ka longtext.txt --stream --chunk-seconds 25 --parallel 6
# Streaming without final playback
python -m TTS_ka text.txt --stream --no-play

# 1. Quick phrases (instant generation)
python -m TTS_ka "Thank you very much!" --lang en
# ⚡ Completed in 2.3s (optimized)
# 2. Medium text (paragraph)
python -m TTS_ka "Lorem ipsum dolor sit amet..." --lang en
# ⚡ Completed in 5.7s (direct)
# 3. Long document (chunked processing)
python -m TTS_ka large_document.txt --lang en
# Strategy: chunked generation, 6 workers
# ⚡ Completed in 12.4s (chunked)
# 4. Clipboard workflow (daily usage)
python -m TTS_ka clipboard --lang ka
# OPTIMIZED MODE - Georgian
# Processing: 45 words, 287 characters
# ⚡ Completed in 4.1s

| Language | Code | Voice Quality | Speed | Example |
|---|---|---|---|---|
| Georgian 🇬🇪 | ka | Neural (Eka, female) | Fast | --lang ka |
| Georgian 🇬🇪 | ka-m | Neural (Giorgi, male) | Fast | --lang ka-m |
| Russian 🇷🇺 | ru | High Quality | Very Fast | --lang ru |
| English 🇬🇧 | en | Premium Neural | Maximum | --lang en |
- Georgian (female): ka-GE-EkaNeural (--lang ka)
- Georgian (male): ka-GE-GiorgiNeural (--lang ka-m)
- Russian: ru-RU-SvetlanaNeural - High-quality female voice
- English: en-GB-SoniaNeural - British English neural voice
# Manual chunking for very long texts
python -m TTS_ka book_chapter.txt --chunk-seconds 45 --parallel 4 --lang en
# Maximum parallelization (for powerful systems)
python -m TTS_ka large_text.txt --parallel 8 --lang ru
# Batch processing (no audio playback)
python -m TTS_ka document.txt --no-play --lang ka
# Legacy mode (disable auto-optimization)
python -m TTS_ka "text" --no-turbo --lang en

# Create alias for daily use
alias speak='python -m TTS_ka clipboard --lang en'
# Windows batch file (speak.bat)
@echo off
python -m TTS_ka clipboard --lang en
# Read web articles (with browser copy)
# 1. Copy article text
# 2. Run: python -m TTS_ka clipboard --lang en

- Python: 3.9+ (required: async CLI, httpx, and PEP 639 build metadata)
- OS: Windows, macOS, Linux
- Memory: 256MB+ available RAM
- Network: Internet connection for voice synthesis
Required (same as pip install TTS_ka):
pip install "edge-tts>=7.2.7" # Core TTS engine
pip install "pydub>=0.25.1" # Audio processing
pip install "tqdm>=4.65.0" # Progress bars
pip install "httpx>=0.28.1" # Async HTTP (CLI)

System Requirements:
- FFmpeg: Required for audio processing
- Windows: Download from ffmpeg.org
- macOS: brew install ffmpeg
- Ubuntu: sudo apt install ffmpeg
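To confirm from Python that FFmpeg is reachable on PATH, the standard library's `shutil.which` is enough. A minimal sketch; `find_tool` is a hypothetical helper name:

```python
import shutil
from typing import Optional


def find_tool(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    print(f"ffmpeg found at {path}" if path else "ffmpeg is not on PATH")
```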
# Method 1: PyPI installation (simplest)
pip install TTS_ka
# Method 2: Development installation
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e .
# Method 3: Manual dependencies
pip install "edge-tts>=7.2.7" pydub tqdm "httpx>=0.28.1"
# Verify installation
python -m TTS_ka "Installation successful!" --turbo --lang en

Bundled scripts live under extras/autohotkey/: a commented template (TTS_ka_hotkeys.ahk) and a Startup installer (Install-TTS_ka-Hotkeys.ps1). Defaults match the old readme: Alt+E / Alt+R / Alt+X for English, Russian, Georgian (clipboard).
- Install AutoHotkey v2 (64-bit is typical).
- From the repository root, run PowerShell:
powershell -ExecutionPolicy Bypass -File .\extras\autohotkey\Install-TTS_ka-Hotkeys.ps1

This copies TTS_ka_hotkeys.ahk into your user Startup folder and launches it. Re-run the same command after you edit the script in the repo to refresh the Startup copy.
Options:
| Flag | Meaning |
|---|---|
| -WhatIf | Print paths only; no copy/start |
| -NoStart | Copy to Startup but do not launch now |
| -Uninstall | Remove the script from Startup |
- Confirm Python works in a new Command Prompt: python -m TTS_ka --version (use the same python / py you set in g_Python inside the .ahk file).
- Copy extras/autohotkey/TTS_ka_hotkeys.ahk anywhere (e.g. %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\).
- Double-click the .ahk file (or right-click → Run with AutoHotkey).
Open TTS_ka_hotkeys.ahk in a text editor. At the top, set g_Python, g_CopyFirst (send Ctrl+C before TTS), g_ExtraFlags (e.g. --stream), and g_CmdKeepOpen. Further down, many hotkeys and variants are commented with ; — delete the semicolon on the lines you want.
- Copy (or highlight and set g_CopyFirst := true) your text
- Alt+E / Alt+R / Alt+X → speech in that language
- Right-click the green H tray icon → Reload / Exit
Inside Chrome, Edge, Word, etc., Windows does not let third parties add a “Read” item to the native right‑click menu for a text selection (that menu is drawn by each app). Two supported options:
- AutoHotkey (in-app): with TTS_ka_hotkeys.ahk loaded, select text, then either press the Menu / Apps key (next to Right Ctrl) or Ctrl+Alt+right‑click; a small language menu appears at the cursor (the script sends Ctrl+C first). Comment out those lines in the script if they clash with other tools.
- Explorer / Desktop context menu: after Ctrl+C, right‑click empty space in a folder window or on the desktop, then Read with TTS_ka → choose a language (nested menu). Installer:
powershell -ExecutionPolicy Bypass -File .\extras\windows\context_menu\Install-TTS_ka-ContextMenu.ps1

| Flag | Meaning |
|---|---|
| -FlatMenu | One top-level item per language instead of a submenu |
| -Languages @('en','ru') | Subset of languages (PowerShell array) |
| -IncludeTextFiles | Add “read this file” on .txt right‑click |
| -Uninstall | Remove TTS_ka menu entries |
On Windows 11, classic shell entries may appear under Show more options.
1. "No module named 'edge_tts'"
pip install "edge-tts>=7.2.7"

2. "FFmpeg not found"
# Windows: Download and add to PATH
# macOS: brew install ffmpeg
# Linux: sudo apt install ffmpeg

3. Slow generation
# Auto-optimization is enabled by default
python -m TTS_ka "text" --lang en
# Reduce parallel workers if network issues
python -m TTS_ka "text" --parallel 2 --lang en
# Use legacy mode only if needed
python -m TTS_ka "text" --no-turbo --lang en

4. Empty clipboard
# Ensure text is copied first
# Then run: python -m TTS_ka clipboard --turbo --lang en

5. 403 / Invalid response status (HTTP or edge-tts)
# Microsoft rotates access; upgrade edge-tts (includes updated websocket tokens)
pip install -U "edge-tts>=7.2.7"
# Optional: skip the unofficial Bing HTTP path and use edge-tts only
# Windows CMD (keep the set line free of trailing text; CMD would store it as part of the value):
set TTS_KA_SKIP_HTTP=1
# macOS / Linux:
# export TTS_KA_SKIP_HTTP=1
# Optional: log when the app falls back from HTTP to edge-tts (off by default)
set TTS_KA_VERBOSE=1
# If many parallel chunks still fail, reduce workers
python -m TTS_ka "your long text" --lang en --parallel 2

6. Streaming / VLC (Windows)
- Default: one VLC window with a growing playlist (TCP remote control).
- TTS_KA_VLC_RC=0: disable that mode and use one VLC process per chunk (legacy).
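When TTS_ka is launched from a Python script, the environment variables from items 5 and 6 can be set per call instead of globally. A sketch under that assumption; `tts_env` is a hypothetical helper, and only the variable names come from this README:

```python
import os
from typing import Dict, Mapping, Optional


def tts_env(
    skip_http: bool = False,
    verbose: bool = False,
    vlc_rc: bool = True,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the TTS_ka switches applied."""
    env = dict(os.environ if base is None else base)
    if skip_http:
        env["TTS_KA_SKIP_HTTP"] = "1"  # use edge-tts only
    if verbose:
        env["TTS_KA_VERBOSE"] = "1"  # log HTTP-to-edge-tts fallbacks
    if not vlc_rc:
        env["TTS_KA_VLC_RC"] = "0"  # one VLC process per chunk (legacy)
    return env
```

The result can be passed to `subprocess.run([...], env=tts_env(skip_http=True))` so that other programs started from the same shell are unaffected.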
7. Ctrl+C
Press Ctrl+C to cancel synthesis and stop streaming playback; partial part files are cleaned up.
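The cleanup behaviour described here follows a familiar pattern: catch KeyboardInterrupt around the generation loop and delete whatever part files exist. The sketch below shows the pattern only; `generate_with_cleanup`, the `part_N.mp3` naming, and the sequential loop are assumptions, not the package's internals.

```python
import os
from typing import Callable, List


def generate_with_cleanup(
    chunks: List[str],
    synthesize: Callable[[str, str], None],
    workdir: str,
) -> List[str]:
    """Write one part file per chunk; on Ctrl+C remove the parts and return []."""
    part_files: List[str] = []
    try:
        for index, chunk in enumerate(chunks):
            path = os.path.join(workdir, f"part_{index}.mp3")
            part_files.append(path)
            synthesize(chunk, path)  # caller-supplied function that writes the audio
        return part_files
    except KeyboardInterrupt:
        # Remove every part that was created before the interrupt arrived.
        for path in part_files:
            if os.path.exists(path):
                os.remove(path)
        return []
```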
For Maximum Speed:
# Use these exact settings for best performance (auto-optimized by default)
python -m TTS_ka clipboard --chunk-seconds 30 --parallel 6 --lang en

For Systems with Limited Resources:
# Reduce workers and use fewer, larger chunks
python -m TTS_ka text --parallel 2 --chunk-seconds 60 --lang en

| Words | Direct Mode | Turbo Mode | Chunked (6 workers) |
|---|---|---|---|
| 10-50 | 2-4s | 1-3s | 2-4s |
| 100-300 | 8-12s | 5-8s | 4-6s |
| 500-1000 | 18-25s | 12-15s | 8-12s |
| 1000+ | 30-45s | 18-25s | 10-18s |
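The word-count tiers this README uses (under 100, 100 to 500, and 500 or more words) can be written as a small selection function. This is an illustration of the guidance, not the tool's auto-optimizer; `pick_strategy` and its return labels are hypothetical.

```python
def pick_strategy(word_count: int) -> str:
    """Map a word count to the generation mode suggested in this README."""
    if word_count < 100:
        return "direct"  # short text: single request
    if word_count < 500:
        return "auto-optimized"  # medium text: let the defaults decide
    return "chunked"  # long text: split and generate in parallel


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())
```

For example, `pick_strategy(count_words("short text"))` selects the direct mode, while a 1000-word document selects chunked generation.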
# Short text (< 100 words): Direct generation (auto-optimized)
python -m TTS_ka "short text" --lang en
# Medium text (100-500 words): Auto-optimized mode
python -m TTS_ka medium_text.txt --lang en
# Long text (500+ words): Chunked processing (auto-detected)
python -m TTS_ka long_text.txt --chunk-seconds 30 --parallel 6 --lang en

1. Article Reading
# Copy web article → instant speech
python -m TTS_ka clipboard --lang en

2. Document Processing
# Process research papers, books, etc.
python -m TTS_ka research_paper.pdf.txt --lang en

3. Language Learning
# Practice pronunciation with different languages
python -m TTS_ka "სწავლობდი ქართულს" --lang ka
python -m TTS_ka "Learning Russian язык" --lang ru

4. Accessibility
# Screen reader alternative
python -m TTS_ka clipboard --no-play --lang en -o audio_file.mp3

# Process multiple files
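for file in *.txt; do
  # Write one MP3 per input file so runs do not overwrite the default data.mp3
  python -m TTS_ka "$file" --no-play --lang en -o "${file%.txt}.mp3"
done
# Windows batch processing
for %f in (*.txt) do python -m TTS_ka "%f" --no-play --lang en -o "%~nf.mp3"

# Set default language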
export TTS_DEFAULT_LANG=ka
# Set default mode
export TTS_DEFAULT_MODE=turbo
# Custom output directory
export TTS_OUTPUT_DIR=/path/to/audio/files

Create ~/.tts_config.json:
{
"default_lang": "en",
"turbo_mode": true,
"chunk_seconds": 30,
"parallel_workers": 6,
"auto_play": true
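}

#!/usr/bin/env python3
import subprocess
import sys


def text_to_speech(text, lang="en", turbo=True):
    """Convert text to speech by invoking the TTS_ka CLI."""
    # Use the current interpreter so the call also works inside virtual environments.
    cmd = [sys.executable, "-m", "TTS_ka", text, "--lang", lang]
    if turbo:
        cmd.append("--turbo")
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(cmd, check=True)


# Usage
text_to_speech("Hello world!", "en")
text_to_speech("გამარჯობა!", "ka")

# URL to speech (with curl + TTS_ka)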
curl -s "https://example.com/article" | \
python -m TTS_ka /dev/stdin --turbo --lang en

# Generate audio on remote server
ssh user@server "python -m TTS_ka 'Remote generation' --turbo --no-play"
# Download and play locally
scp user@server:data.mp3 ./remote_audio.mp3

FROM python:3.9
RUN pip install TTS_ka
RUN apt-get update && apt-get install -y ffmpeg
ENTRYPOINT ["python", "-m", "TTS_ka"]

# Docker usage
docker run tts_container "Hello Docker!" --turbo --lang en

- Auto-optimization is enabled by default - no flags needed!
- Use clipboard workflow for fastest daily usage
- Chunk long texts with --chunk-seconds 30
- Optimize workers with --parallel set to 4-6 for most systems
- Pre-install FFmpeg for best audio processing
- Georgian text: Use --lang ka for best quality
- Mixed languages: Process separately for optimal results
- Technical text: Use shorter chunks (--chunk-seconds 20)
- Clean input: Remove extra whitespace and formatting
- Create aliases for frequent commands
- Use hotkeys (AutoHotkey on Windows)
- Batch process large document collections
- Test settings with small text first
- Text files: .txt, .md, .rst
- Code files: .py, .js, .html (extracts text)
- Clipboard: Any copied text
- Direct input: Command-line strings
- Audio: MP3 (high quality, compressed)
- Bitrate: 128kbps (optimal size/quality balance)
- Sample Rate: 24kHz (neural voice quality)
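The 128 kbps bitrate gives a quick way to estimate output size: 128,000 bits per second is 16,000 bytes per second. The sketch below assumes a constant bitrate and ignores container overhead, so treat the result as approximate; `estimated_mp3_bytes` is a hypothetical helper.

```python
def estimated_mp3_bytes(duration_seconds: float, bitrate_kbps: int = 128) -> int:
    """Approximate MP3 size in bytes for a constant-bitrate file."""
    # kilobits/second * 1000 = bits/second; divide by 8 for bytes/second.
    return int(duration_seconds * bitrate_kbps * 1000 / 8)


# One minute of audio at 128 kbps is about 960,000 bytes (roughly 0.96 MB).
print(estimated_mp3_bytes(60))  # 960000
```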
# Update to latest version
pip install --upgrade TTS_ka
# Check current version
python -m TTS_ka --version
# Update dependencies
pip install --upgrade edge-tts pydub tqdm httpx

# Test installation
python -m TTS_ka "System check" --turbo --lang en
# Verify FFmpeg
ffmpeg -version
# Check Python version
python --version # Should be 3.9+

We welcome contributions! See our GitHub repository for:
- Bug reports and feature requests
- Code contributions and pull requests
- Documentation improvements
- Language support additions
git clone https://github.com/DavidTbilisi/TTS.git
cd TTS
pip install -e ".[dev]"
pytest # Run tests

- Documentation: Use --help-full for comprehensive help
- Issues: Report bugs on GitHub Issues
- Discussions: Join GitHub Discussions
# Check system compatibility
python -m TTS_ka --help-full
# Test with minimal command
python -m TTS_ka "test" --turbo --lang en
# Verify FFmpeg installation
ffmpeg -version

License: MIT License - see LICENSE file
Credits:
- Edge-TTS: Microsoft's edge-tts library for voice synthesis
- PyDub: Audio processing and manipulation
- FFmpeg: Audio encoding and format conversion
Author: David Chincharashvili (davidchincharashvili@gmail.com)
⭐ Star this project on GitHub if you find it useful!
🐛 Report issues to help improve the tool
🤝 Contribute to make it even better