- Hero-perspective board
- Kokoro TTS narration (hexgrad/Kokoro-82M, voice af_nicole, American English)
- Word-level karaoke subtitles (ASS, hard-burned)
- Continuous perimeter progress bar
- Final-board outro with fade to black
On a fresh machine you need to:
- Install dependencies:
  uv sync
- (Optional but recommended) Preload Kokoro into the project-local HF cache:
  uv run python -m cgc.tools.bootstrap_models
This downloads hexgrad/Kokoro-82M once into:
cgc/.hf_cache/models--hexgrad--Kokoro-82M/...
After that, Hugging Face will reuse the cached model for all runs.
You can also skip the bootstrap step; the first --no-fake-tts pipeline run will trigger the same download automatically.
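To see whether the download has already happened, you can look for the model folder in the project-local cache. The sketch below assumes the standard Hugging Face Hub cache layout (models--<org>--<name>/snapshots/<revision>/). The helper name is_model_cached is hypothetical and not part of CGC.

```python
from pathlib import Path


def is_model_cached(cache_root: str, repo_id: str) -> bool:
    """Return True if the cache holds at least one snapshot of repo_id.

    Assumes the standard Hugging Face Hub cache layout; this helper is
    illustrative and not part of CGC.
    """
    folder = "models--" + repo_id.replace("/", "--")
    snapshots = Path(cache_root) / folder / "snapshots"
    # A completed download leaves one sub-directory per revision.
    return snapshots.is_dir() and any(p.is_dir() for p in snapshots.iterdir())


if __name__ == "__main__":
    print(is_model_cached("cgc/.hf_cache", "hexgrad/Kokoro-82M"))
```

If this prints False, either run the bootstrap step or expect the first --no-fake-tts run to download the model.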
- Python 3.11+
- uv installed
- ffmpeg available on your PATH
- Internet access at least once to download Kokoro (either via bootstrap or the first real-TTS run)
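The requirements above can be checked before a first run. This is a minimal sketch using only the standard library; check_prerequisites is a hypothetical helper, not something CGC ships, and it does not test internet access.

```python
import shutil
import sys


def check_prerequisites(which=shutil.which, version=sys.version_info):
    """Map each requirement to True/False.

    `which` and `version` are injectable so the check can be exercised
    without depending on the real environment.
    """
    return {
        "python>=3.11": tuple(version[:2]) >= (3, 11),
        "uv": which("uv") is not None,
        "ffmpeg": which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```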
Models and alignment assets are cached under cgc/.hf_cache via Hugging Face Hub.
Unless stated otherwise, the examples below use --device cpu.
For quick visual checks:
uv run python -m cgc.pipeline scripts/game.yaml --device cpu
Recommended normal run:
uv run python -m cgc.pipeline scripts/game.yaml --device cpu --no-fake-tts
- TTS: Kokoro KPipeline
  - Repo: hexgrad/Kokoro-82M
  - Voice: af_nicole (see available voices)
  - Lang: "a" (American English)
  - Speed: 1.0
- Alignment: fake (no WhisperX).
When you want real word timings:
uv run python -m cgc.pipeline scripts/game.yaml --device cpu --no-fake-tts --no-fake-alignment
WhisperX models are cached under cgc/.hf_cache/alignment_models.
To build a video, CGC needs a YAML story script under scripts/.
Flow:
Lichess game URL → YAML story script → scripts/ → pipeline → MP4
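Output files are named after a <game_id>. How CGC derives it is not shown here; the sketch below is one plausible approach, assuming the ID is the first path segment of the URL cut to the 8 characters Lichess uses for game IDs. The function name lichess_game_id is hypothetical.

```python
from urllib.parse import urlparse


def lichess_game_id(url: str) -> str:
    """Extract an 8-character game ID from a Lichess game URL.

    Assumption: the ID is the first path segment; extra characters
    (a player token) and trailing segments such as /black are ignored.
    """
    parsed = urlparse(url)
    if parsed.netloc not in {"lichess.org", "www.lichess.org"}:
        raise ValueError(f"not a Lichess URL: {url!r}")
    segments = [s for s in parsed.path.split("/") if s]
    if not segments or len(segments[0]) < 8:
        raise ValueError(f"no game ID in URL: {url!r}")
    return segments[0][:8]
```

For instance, lichess_game_id("https://lichess.org/1ORTExZg/black") returns "1ORTExZg".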
Example: scripts/game.yaml
version: 1
source:
  type: lichess_url
  value: "https://lichess.org/anonymousGameId"
meta:
  voice: cinematic-bullet
  perspective: black
cards:
  - id: intro-1
    type: text
    duration: 2.8
    lines:
      - "They were outrated."
      - "The clock was against them."
      - "Nobody expected an upset."
  - id: opening-1
    type: board
    ply: 1
    role: opening
    duration: 2.1
    highlight:
      mode: last_move
    lines:
      - "A sharp opening choice."
      - "No room for quiet play."
      - "From move one, both sides were fighting."
  # ... midgame cards ...
  - id: finish-1
    type: board
    ply: 66
    role: finish
    duration: 2.1
    highlight:
      mode: last_move
    lines:
      - "One final precise move."
      - "And everything collapsed."
  - id: outro-1
    type: text
    duration: 2.8
    lines:
      - "One game."
      - "One chance."
      - "Complete domination."
Notes:
- Put scripts under scripts/, e.g. scripts/game.yaml, scripts/other_game.yaml.
- source.value must be a full Lichess game URL.
- meta.voice is a style label (currently mapped to Kokoro’s af_nicole in code).
- meta.perspective controls hero side (white or black) and board flipping.
- cards:
  - type: board + ply show specific positions; highlight.mode: last_move is supported.
  - type: text is narration-only:
    - intro-* ids show the starting position.
    - outro-* ids show the final position.
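The rules in these notes can be turned into a quick sanity check on a script after it has been parsed into a dict (for example with PyYAML's yaml.safe_load). This is a sketch, not CGC's own validation: validate_script is a hypothetical helper, and treating board and text as the only card types and requiring non-empty lines are assumptions.

```python
def validate_script(data: dict) -> list:
    """Return a list of human-readable problems (empty if none found)."""
    errors = []
    url = data.get("source", {}).get("value", "")
    if not url.startswith("https://lichess.org/"):
        errors.append("source.value must be a full Lichess game URL")
    if data.get("meta", {}).get("perspective") not in ("white", "black"):
        errors.append("meta.perspective must be 'white' or 'black'")
    for i, card in enumerate(data.get("cards", [])):
        label = card.get("id", f"cards[{i}]")
        kind = card.get("type")
        # Assumption: board and text are the only card types.
        if kind not in ("board", "text"):
            errors.append(f"{label}: type must be 'board' or 'text'")
        if kind == "board" and not isinstance(card.get("ply"), int):
            errors.append(f"{label}: board cards need an integer ply")
        # Assumption: every card carries at least one narration line.
        if not card.get("lines"):
            errors.append(f"{label}: no narration lines")
    return errors
```

Returning a list of problems rather than raising on the first one makes it easier to fix a hand-written script in a single pass.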
To render a different script:
uv run python -m cgc.pipeline scripts/other_game.yaml --device cpu --no-fake-tts
After a run you’ll see:
- Story JSON: output/story/<game_id>.json
- Frames: output/frames/<game_id>_XX_<scene_id>.png
- Subtitles (ASS): output/subtitles/<game_id>.ass
- Audio clips: audio/clips/<game_id>/...
- Merged audio: audio/merged/<game_id>.wav
- Final video: output/video/<game_id>.mp4
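To confirm that a run produced everything, the single-file artifacts listed above can be checked from Python. This is a small sketch with hypothetical helper names; frames and audio clips are left out because they expand to several files per game.

```python
from pathlib import Path


def expected_outputs(game_id: str, root: str = ".") -> dict:
    """Paths of the single-file artifacts a run is expected to leave behind."""
    base = Path(root)
    return {
        "story": base / "output" / "story" / f"{game_id}.json",
        "subtitles": base / "output" / "subtitles" / f"{game_id}.ass",
        "merged_audio": base / "audio" / "merged" / f"{game_id}.wav",
        "video": base / "output" / "video" / f"{game_id}.mp4",
    }


def missing_outputs(game_id: str, root: str = ".") -> list:
    """Names of expected artifacts that do not exist yet."""
    return [name for name, path in expected_outputs(game_id, root).items()
            if not path.is_file()]
```

An empty list from missing_outputs means all four files are present under the given project root.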
CGC runs fully on CPU, but TTS + alignment are much faster on a CUDA GPU.
This project uses uv for environment management. To install a CUDA-enabled PyTorch build:
- Go to the official PyTorch “Get Started” page: https://pytorch.org/get-started/locally/
- Select:
  - PyTorch build: Stable
  - Your OS
  - Package: pip
  - Compute platform: your CUDA version (e.g. CUDA 12.6)
- Copy the recommended install command. For CUDA 12.6 it looks like:
  pip3 install torch --index-url https://download.pytorch.org/whl/cu126
- Translate that into pyproject.toml + uv:
  [project]
  name = "cgc"
  version = "0.1.0"
  requires-python = ">=3.11"
  dependencies = [
      "python-chess",
      "requests",
      "pyyaml",
      "numpy",
      "soundfile",
      "imageio[ffmpeg]",
      "cairosvg",
      "kokoro>=0.9.2",
      "misaki[en]",
      "whisperx==3.7.1",
      "torch",
  ]

  [tool.uv]
  index = [
      { name = "pytorch-cu126", url = "https://download.pytorch.org/whl/cu126", explicit = true },
  ]

  [tool.uv.sources]
  torch = { index = "pytorch-cu126" }
- Sync and verify:
  uv sync
  uv run python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
  You should see something like:
  2.11.0+cu126 True
CPU pipeline:
uv run python -m cgc.pipeline scripts/1ORTExZg.yaml --device cpu --no-fake-tts --no-fake-alignment
CUDA pipeline:
uv run python -m cgc.pipeline scripts/1ORTExZg.yaml --device cuda --no-fake-tts --no-fake-alignment
Quick benchmark (CPU vs CUDA):
uv run python scripts/benchmark.py scripts/1ORTExZg.yaml
On an RTX-class GPU we observed approximately:
- CPU: 83.8s
- CUDA: 38.8s
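Expressed as a ratio, those two timings work out as follows. The arithmetic uses only the figures reported above.

```python
cpu_seconds = 83.8   # reported CPU run
cuda_seconds = 38.8  # reported CUDA run

speedup = cpu_seconds / cuda_seconds
time_saved_pct = (1 - cuda_seconds / cpu_seconds) * 100

print(f"speedup: {speedup:.2f}x, time saved: {time_saved_pct:.1f}%")
# prints: speedup: 2.16x, time saved: 53.7%
```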
The pipeline has a --device flag that controls where Kokoro TTS and WhisperX run:
- cpu → run all models on the CPU
- cuda → run all models on the GPU (requires a CUDA-enabled PyTorch build)
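A wrapper script can validate the device choice before launching a long run. The sketch below is an assumption about how such a check could look, not CGC's own code; resolve_device is a hypothetical name, and torch is imported only when cuda is requested.

```python
def resolve_device(requested: str) -> str:
    """Validate a --device value, checking CUDA availability for 'cuda'."""
    if requested not in ("cpu", "cuda"):
        raise ValueError(f"unknown device: {requested!r}")
    if requested == "cuda":
        try:
            import torch  # imported lazily so CPU-only checks need no torch
        except ImportError as exc:
            raise RuntimeError("PyTorch is not installed") from exc
        if not torch.cuda.is_available():
            raise RuntimeError("CUDA requested but this PyTorch build cannot use it")
    return requested
```

Failing early here is a design choice for the sketch; how the pipeline itself reacts to --device cuda on a CPU-only build is not documented in this README.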
Examples:
# Force CPU (safe on any machine)
uv run python -m cgc.pipeline scripts/game.yaml --device cpu --no-fake-tts --no-fake-alignment
# Force GPU (fastest if torch.cuda.is_available() is True)
uv run python -m cgc.pipeline scripts/game.yaml --device cuda --no-fake-tts --no-fake-alignment
If you only care about real audio and are OK with fake alignment, just omit --no-fake-alignment in either command.
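When scripting several renders, the flag combinations shown above can be assembled programmatically. This is a sketch with a hypothetical helper, build_command; it only uses flags that appear in this README and returns an argument list suitable for subprocess.run.

```python
def build_command(script, device="cpu", real_tts=True, real_alignment=False):
    """Build the pipeline command line as a list of arguments."""
    cmd = ["uv", "run", "python", "-m", "cgc.pipeline", script, "--device", device]
    if real_tts:
        cmd.append("--no-fake-tts")        # real Kokoro audio
    if real_alignment:
        cmd.append("--no-fake-alignment")  # real WhisperX word timings
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("scripts/game.yaml", device="cuda", real_alignment=True)))
```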
CGC uses the Kokoro-82M text-to-speech model by hexgrad (Apache-2.0). See THIRD_PARTY_NOTICES.md for details.
Once models and voices are cached, you can force a fully offline run with:
uv run python -m cgc.pipeline scripts/game.yaml --device cpu --no-fake-tts --no-fake-alignment --offline
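What --offline does internally is not described here. For comparison, a common way to keep Hugging Face Hub off the network is to set its documented environment variables before starting a process. The sketch below assumes the project-local cache path used in this README; offline_env is a hypothetical helper.

```python
import os


def offline_env(cache_dir="cgc/.hf_cache", base=None):
    """Copy of the environment with Hugging Face Hub network access disabled."""
    env = dict(os.environ if base is None else base)
    env["HF_HUB_OFFLINE"] = "1"      # huggingface_hub: never hit the network
    env["HF_HUB_CACHE"] = cache_dir  # where models--<org>--<name> folders live
    return env
```

The returned dict can be passed as env= to subprocess.run when launching the pipeline command.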
Thank you for reading :)
