Podcast automation pipeline that turns raw recordings into publish-ready shorts, longform video, and an RSS podcast feed. Supports single-camera or multi-camera setups with external multi-track audio (Zoom H6E or similar).
Cascade runs a 14-agent pipeline:
- Ingest — Copy media from SD card(s) to SSD, validate with ffprobe, sync external audio
- Stitch — Concatenate clips via ffmpeg stream-copy
- Audio Analysis — Detect true stereo vs identical/mono channels
- Speaker Cut — Segment speakers via per-channel RMS energy (supports N-speaker multi-track)
- Transcribe — Deepgram Nova-3 with diarization + SRT generation
- Clip Miner — Claude identifies top 10 short-form candidates
- Longform Render — 16:9 speaker-cropped video with hardware encoding
- Shorts Render — 9:16 shorts with burned-in subtitles
- Metadata Gen — Per-platform titles, descriptions, hashtags, schedule
- Thumbnail Gen — AI-generated caricature artwork via OpenAI
- QA — Validate all outputs (durations, file sizes, formats)
- Podcast Feed — Extract audio, generate RSS, upload to Cloudflare R2
- Publish — Distribute to YouTube, TikTok, Instagram, and more
- Backup — rsync episode to external HDD
Agents run in parallel where possible (transcribe runs alongside audio analysis + speaker cut).
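For intuition, the speaker-cut idea above can be sketched in a few lines: window each speaker's track, pick the loudest channel per window by RMS energy, and merge consecutive wins into segments. This is a minimal illustration, not Cascade's actual `speaker_cut.py` implementation, and it assumes one clean audio track per speaker.

```python
import math

def rms(window):
    """Root-mean-square energy of one window of samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))

def segment_speakers(channels, window_size=4800):
    """Assign each window to the loudest channel, then merge runs.

    channels: list of per-speaker sample lists (one track per speaker).
    Returns (speaker_index, start_window, end_window) tuples.
    """
    n_windows = min(len(ch) for ch in channels) // window_size
    winners = []
    for w in range(n_windows):
        energies = [rms(ch[w * window_size:(w + 1) * window_size])
                    for ch in channels]
        winners.append(max(range(len(channels)), key=lambda i: energies[i]))
    # Merge consecutive windows won by the same channel into segments
    segments = []
    for w, spk in enumerate(winners):
        if segments and segments[-1][0] == spk:
            segments[-1] = (spk, segments[-1][1], w + 1)
        else:
            segments.append((spk, w, w + 1))
    return segments
```

A production version would also smooth out very short segments so brief interjections don't cause rapid camera cuts.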
- Python 3.11+
- ffmpeg with libass (for subtitle burning) — `brew install ffmpeg` or `brew install homebrew-ffmpeg/ffmpeg/ffmpeg --with-libass`
- uv (recommended) — `brew install uv`
git clone https://github.com/saml212/cascade.git && cd cascade
cp config/config.example.toml config/config.toml # Edit paths & podcast info
cp .env.example .env # Fill in your API keys (see below)
./start.sh # Creates venv, installs deps, opens UI
Or manually:
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp config/config.example.toml config/config.toml # Edit paths & podcast info
cp .env.example .env # Fill in API keys
| Key | Required | Purpose |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Claude — clip mining, metadata generation, chat |
| DEEPGRAM_API_KEY | Yes | Nova-3 transcription + speaker diarization |
| OPENAI_API_KEY | No | Thumbnail generation (caricature artwork) |
| YOUTUBE_CLIENT_ID | No | YouTube publishing |
| YOUTUBE_CLIENT_SECRET | No | YouTube publishing |
| TIKTOK_CLIENT_KEY | No | TikTok publishing |
| TIKTOK_CLIENT_SECRET | No | TikTok publishing |
| INSTAGRAM_ACCESS_TOKEN | No | Instagram publishing |
| FACEBOOK_PAGE_ID | No | Instagram publishing |
| CLOUDFLARE_ACCOUNT_ID | No | Podcast RSS feed (R2 storage) |
| CLOUDFLARE_API_TOKEN | No | Podcast RSS feed (R2 storage) |
| UPLOAD_POST_API_KEY | No | Upload-Post publishing |
| UPLOAD_POST_USER | No | Upload-Post publishing |
Only ANTHROPIC_API_KEY and DEEPGRAM_API_KEY are required for the core pipeline (ingest through QA). Publishing and RSS keys are only needed for those specific agents.
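A fail-fast check for the two required keys might look like this (a sketch, not part of Cascade's codebase — the helper name is hypothetical):

```python
import os

REQUIRED = ("ANTHROPIC_API_KEY", "DEEPGRAM_API_KEY")

def missing_required_keys(env=None):
    """Return the required key names that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED if not env.get(key)]

# Example: abort before starting a pipeline run
# if missing_required_keys():
#     raise SystemExit("Set ANTHROPIC_API_KEY and DEEPGRAM_API_KEY in .env")
```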
# Full pipeline from SD card
python -m agents --source-path "/path/to/media/"
# Specific agents only
python -m agents --source-path "/path/to/media/" --agents ingest stitch audio_analysis
# With a custom episode ID
python -m agents --source-path "/path/to/media/" --episode-id ep_2026-02-19_120000
./start.sh
# Opens http://localhost:8420 automatically
The web UI lets you review clips, approve/reject them, trim boundaries, chat with the AI about your episode, and trigger pipeline runs.
cascade/
├── agents/ # 14 pipeline agents (DAG-parallel execution)
│ ├── base.py # BaseAgent ABC (timing, logging, JSON I/O, config helpers)
│ ├── pipeline.py # DAG orchestrator with dependency-aware parallelism
│ ├── ingest.py → stitch.py → audio_analysis.py → speaker_cut.py
│ ├── transcribe.py (runs parallel to audio_analysis + speaker_cut)
│ ├── clip_miner.py → shorts_render.py + metadata_gen.py (parallel)
│ ├── longform_render.py (starts when speaker_cut + transcribe finish)
│ ├── thumbnail_gen.py → qa.py → podcast_feed.py → publish.py → backup.py
│ └── ...
├── lib/ # Shared utilities
│ ├── encoding.py # VideoToolbox / libx264 encoder selection + LUT support
│ ├── ffprobe.py # ffprobe wrapper
│ ├── audio_mix.py # Multi-track audio mixing with per-track volume control
│ ├── paths.py # Path resolution (external drive fallback)
│ ├── clips.py # Clip normalization
│ └── srt.py # SRT generation, parsing, and ffmpeg escaping
├── server/ # FastAPI app (port 8420)
│ ├── app.py # Entry point + static files
│ └── routes/ # API endpoints (episodes, clips, pipeline, chat, trim, etc.)
├── frontend/ # Vanilla JS SPA for clip review + chat + audio mix panel
├── config/ # config.toml — all settings
├── tests/ # pytest + Jest test suites
└── start.sh # One-command setup + launch
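The dependency-aware parallelism that `pipeline.py` provides can be sketched with `asyncio`: each agent waits on the completion events of its prerequisites, so independent agents (like transcribe and audio analysis) run concurrently. This is an illustrative sketch under assumed names, not the actual orchestrator.

```python
import asyncio

async def run_dag(deps, run):
    """Run each agent as soon as all of its dependencies have finished.

    deps: {agent_name: set of prerequisite agent names}
    run:  async callable invoked as run(agent_name)
    """
    done = {name: asyncio.Event() for name in deps}

    async def runner(name):
        for dep in deps[name]:
            await done[dep].wait()   # block until each prerequisite finishes
        await run(name)
        done[name].set()             # unblock agents that depend on us

    await asyncio.gather(*(runner(name) for name in deps))
```

With `deps = {"ingest": set(), "stitch": {"ingest"}, "transcribe": {"stitch"}, "audio_analysis": {"stitch"}}`, transcribe and audio_analysis start together as soon as stitch completes.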
By default, Cascade stores everything locally in ./episodes/ and ./work/. This works out of the box with no external drives.
For large episodes (multi-GB source files), you can point to an external SSD by editing config/config.toml:
[paths]
output_dir = "~/cascade/episodes"
work_dir = "~/cascade/work"
backup_dir = "~/cascade/backup"
If an external drive path is configured but the volume isn't mounted, Cascade automatically falls back to local storage.
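The fallback logic can be as simple as checking whether the configured path's volume is reachable before using it — a minimal sketch (the function name is hypothetical; see `lib/paths.py` for the real behavior):

```python
from pathlib import Path

def resolve_output_dir(configured, local_fallback="./episodes"):
    """Use the configured path if its volume is reachable, else fall back locally."""
    configured = Path(configured).expanduser()
    # An unmounted external volume shows up as a missing parent directory
    if configured.exists() or configured.parent.exists():
        return configured
    return Path(local_fallback).resolve()
```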
All settings live in config/config.toml. Key sections:
- [paths] — Output directory, work directory, backup drive (local fallback if drive missing)
- [processing] — CRF, resolution, clip duration limits, hardware acceleration
- [transcription] — Deepgram model, language, diarization settings
- [clip_mining] — LLM model, temperature, clip count
- [schedule] — Shorts posting cadence, peak days, timezone
- [platforms.*] — Per-platform publishing settings
- [podcast] — RSS feed metadata (title, author, artwork)
- [podcast.links] — Link-in-bio page URLs (see below)
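As an illustration, a [processing] section could look like the following — the key names here are illustrative, not the exact schema; config.example.toml is authoritative:

```toml
[processing]
crf = 18                    # encoder quality (lower = larger files)
resolution = "1920x1080"
max_clip_seconds = 60       # upper bound for mined shorts
hardware_acceleration = true  # VideoToolbox on macOS, libx264 otherwise
```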
Cascade includes a built-in link-in-bio page generator. Fill in the [podcast.links] section of your config with your platform URLs, then generate the static HTML:
python -m links.generate
This produces links/index.html — a single-file, dark-themed page with your podcast artwork, platform links, and an embedded Spotify player. Deploy it to Cloudflare Pages, GitHub Pages, Netlify, or any static host.
Supported platforms: Spotify, Apple Podcasts, YouTube, Instagram, X, TikTok, iHeartRadio, GitHub. Empty URLs are automatically excluded.
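The empty-URL exclusion amounts to filtering before rendering — a sketch of the idea, not the actual `links.generate` code:

```python
def render_links(links):
    """Drop platforms with empty URLs and emit simple anchor tags."""
    return "\n".join(
        f'<a href="{url}">{name}</a>'
        for name, url in links.items()
        if url  # platforms with empty URLs never reach the page
    )
```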
| Service | Cost |
|---|---|
| Deepgram transcription | ~$0.50 |
| Claude clip mining | ~$0.10-0.30 |
| Claude metadata | ~$0.05-0.10 |
MIT — see LICENSE.