Add 60db TTS provider (Hindi/Indian voices) — live-API-verified by ConalMullan · Pull Request #27 · digitalsamba/claude-code-video-toolkit

ConalMullan · 2026-06-08T17:26:14Z

Adds 60db (https://60db.ai) as a third TTS provider alongside ElevenLabs and Qwen3. Builds on @manishEMS47's work in #26 (their original commit is preserved here) with fixes to make it work against the live 60db API, plus end-to-end verification.

Why 60db

Fills a real gap in the toolkit: native Hindi + Indian-accented English voices, cheaper than ElevenLabs ($0.00002/char) and faster (RTF ~0.22). Qwen3 is English/Chinese-leaning and ElevenLabs is premium-priced for Indic languages. Verified live that the default voice produces good-quality English and Hindi (Devanagari) speech.

The fix (on top of #26)

The original integration coded to 60db's documented /tts-synthesize contract (single JSON {audio_base64}, mp3). In production the endpoint actually streams newline-delimited JSON of raw 48 kHz PCM (Content-Type: application/x-ndjson) with a trailing {metadata} line — so the default path failed with "Invalid JSON response."
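To make the two response shapes concrete, here is a minimal sketch of a parser that accepts either one. It is not the code in tools/sixtydb_tts.py. The function name `parse_tts_body` and the per-chunk field name `audio` are assumptions for illustration, since the PR text only names `audio_base64` and the trailing metadata line.

```python
import base64
import json


def parse_tts_body(body: bytes, chunk_key: str = "audio"):
    """Return (audio_bytes, metadata) from either response shape.

    chunk_key is a HYPOTHETICAL field name for the base64 PCM chunks;
    the real NDJSON field name is not given in the PR text.
    """
    text = body.decode("utf-8").strip()

    # Documented shape: a single JSON object carrying audio_base64.
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        doc = None  # multi-line NDJSON is not one valid JSON document
    if isinstance(doc, dict) and "audio_base64" in doc:
        return base64.b64decode(doc["audio_base64"]), doc.get("metadata", {})

    # Observed shape: one JSON object per line, metadata on the last line.
    audio = bytearray()
    metadata = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        if "metadata" in obj:
            metadata = obj["metadata"]
        elif chunk_key in obj:
            audio.extend(base64.b64decode(obj[chunk_key]))
    return bytes(audio), metadata
```

Trying the single-document parse first keeps the documented contract working if 60db later ships it, and a multi-line stream falls through to the line-by-line path because it is not one valid JSON document.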

tools/sixtydb_tts.py:

- Rewrote _synthesize_rest to consume the NDJSON PCM stream, while still accepting the documented single-JSON shape if 60db ships it (defensive both ways).
- Added _finalize_audio: sniffs bytes for an audio container (mp3/wav/ogg/flac) and writes or transcodes them as-is; otherwise wraps raw PCM as WAV and transcodes to --output-format via ffmpeg.
- Added _derive_pcm_sample_rate: infers the rate from byte-count ÷ metadata.audio_sec instead of hardcoding it.
- Surfaces 60db metadata.warnings; routed _synthesize_stream through the same PCM finalizer and flagged that /tts-stream currently returns HTTP 500 upstream.
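The container-sniffing and PCM-wrapping steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's `_finalize_audio`: the helper names are hypothetical, the PCM is assumed to be 16-bit mono, and the ffmpeg command is only built, never run.

```python
import io
import wave


def sniff_container(data: bytes):
    """Return 'mp3', 'wav', 'ogg', 'flac', or None for headerless bytes."""
    if data[:3] == b"ID3":
        return "mp3"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"fLaC":
        return "flac"
    # A bare MPEG frame sync (0xFF 0xEx) is deliberately not checked:
    # a 16-bit PCM sample of -1 is the bytes FF FF and would match it.
    return None


def pcm_to_wav(pcm: bytes, sample_rate: int,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM in a WAV header (assumes 16-bit mono by default)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


def ffmpeg_cmd(src: str, dst: str):
    """Build (not run) an ffmpeg call; the output format follows dst's extension."""
    return ["ffmpeg", "-y", "-i", src, dst]
```

A caller would write the bytes unchanged when `sniff_container` returns the requested format, and otherwise wrap them with `pcm_to_wav` and pass the command from `ffmpeg_cmd` to `subprocess.run`.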

Verified live ✅

- sixtydb_tts.py → valid MP3 (ID3v2.4, 48 kHz mono), EN + Hindi
- voiceover.py --provider 60db --scene-dir → correct per-scene MP3s + the JSON shape sync_timing.py consumes
- Compiles clean (Python 3.9 compatible), dry-run works

Not verified

- redub.py --tts-provider 60db: needs an ElevenLabs key (Scribe STT) + a video; the delegation logic is straightforward but untested end-to-end.
- websocket transport: matches the docs and is not needed for batch voiceover (minor wss:// vs documented ws:// discrepancy to confirm).

Closes #26.

🤖 Generated with Claude Code

The original integration coded to 60db's documented /tts-synthesize contract (single JSON object with `audio_base64` in the requested container format). In production the endpoint instead streams newline-delimited JSON of raw 16-bit mono PCM (Content-Type: application/x-ndjson) with a trailing `{metadata}` line, so the default voiceover path failed with "Invalid JSON response".

Changes (tools/sixtydb_tts.py):
- Rewrite _synthesize_rest to consume the NDJSON PCM stream, while still accepting the documented single-JSON shape if 60db ships it.
- Add _finalize_audio: sniff bytes for an audio container (mp3/wav/ogg/flac) and write/transcode as-is, else wrap raw PCM as WAV and transcode to the requested --output-format via ffmpeg.
- Add _derive_pcm_sample_rate: infer the rate from byte-count and the metadata audio_sec (snap to nearest standard rate) instead of hardcoding.
- Surface 60db metadata warnings; route _synthesize_stream through the same PCM-aware finalizer and flag that /tts-stream currently 500s upstream.

Verified live: sixtydb_tts.py and `voiceover.py --provider 60db --scene-dir` both produce valid 48kHz MP3s for English and Hindi.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
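The sample-rate inference described in the commit message comes down to one division followed by a snap to the nearest standard rate. A minimal sketch follows, assuming 16-bit mono PCM as stated above. The function name, the list of candidate rates, and the 48 kHz fallback are assumptions for illustration, not taken from the PR's `_derive_pcm_sample_rate`.

```python
# Candidate rates are an ASSUMPTION; the PR does not list which ones it snaps to.
STANDARD_RATES = (8000, 16000, 22050, 24000, 32000, 44100, 48000)


def estimate_sample_rate(num_bytes, audio_sec,
                         channels=1, sample_width=2, fallback=48000):
    """Infer the PCM sample rate from payload size and reported duration.

    rate = bytes / (bytes_per_sample * channels * seconds), then snapped to
    the nearest standard rate to absorb rounding in the reported duration.
    The fallback value is hypothetical (48 kHz is what the PR observed live).
    """
    if not audio_sec or audio_sec <= 0:
        return fallback
    raw = num_bytes / (sample_width * channels * audio_sec)
    return min(STANDARD_RATES, key=lambda r: abs(r - raw))
```

For example, 240000 bytes of 16-bit mono audio reported as 2.5 seconds gives 240000 / (2 × 1 × 2.5) = 48000 Hz. A slightly short payload of 239000 bytes gives 47800 Hz, which still snaps to 48000.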

+    finally:
+        try:
+            ws.close()
+        except Exception:


ConalMullan · 2026-06-08T17:36:37Z

@manishEMS47 — heads up: I built directly on your #26 work here (your original commit is preserved with your authorship) and added one fix so it works against the live 60db API. The only issue was that the code followed 60db's documented /tts-synthesize response shape (single JSON audio_base64), but in production the endpoint streams NDJSON of raw 48 kHz PCM — so I made the parsing handle both and transcode to the requested format. Tested end-to-end (English + Hindi → valid MP3, plus voiceover.py --provider 60db).

Would love a quick look if you have a moment — happy to adjust anything. If I don't hear back in a day or two I'll go ahead and merge so it doesn't stall. Thanks again for adding this — the Hindi/Indian-voice support fills a real gap for us. 🙌

ConalMullan · 2026-06-17T21:56:27Z

Hi @manishEMS47 — wanted to loop you back in before we take this further.

Quick status: I built on your #26 work to get 60db running against the live API. The main thing was that the production /tts-synthesize endpoint doesn't match the documented contract — instead of a single JSON {audio_base64} mp3, it streams newline-delimited JSON of raw 48 kHz PCM with a trailing metadata line. So the original path failed with "Invalid JSON response." I rewrote the synth path to consume the NDJSON PCM stream (while still accepting the documented single-JSON shape if 60db ships it later), added container-sniffing + PCM→WAV finalization, and infer the sample rate from the metadata. It's verified end-to-end now — clean MP3 output in both English and Hindi, and per-scene voiceover that plugs into the rest of the toolkit.

Once this merges, 60db becomes a first-class TTS provider in the toolkit alongside ElevenLabs and Qwen3. Native Hindi/Indic voice support is a genuine gap it fills, so we're keen to ship it well.

Before we do, there are a few things only your side can confirm, and I'd rather get your input than guess:

- /tts-stream returns HTTP 500 upstream right now: is that a known issue, and is a fix coming? Batch voiceover doesn't need it, but I've flagged it in the code.
- wss:// vs ws://: the docs say ws:// but the live endpoint looks like wss://. Which is canonical?
- redub.py --tts-provider 60db: the delegation logic is straightforward but I couldn't test it end-to-end (needs an ElevenLabs Scribe key + a video). If you're able to run it once, that'd close the last gap.

No rush, but I'd like to have you back in the loop before merging — it's your integration as much as ours, and a quick confirmation on the above (especially the stream 500) would let us ship it with confidence. Happy to jump on anything if it's easier.

Thanks again for kicking this off!

manishEMS47 and others added 2 commits June 8, 2026 16:16

Added 60dB integration

b6ade7b

ConalMullan mentioned this pull request Jun 8, 2026

Added 60dB integration #26

Closed

github-code-quality Bot found potential problems Jun 8, 2026


Comment thread tools/sixtydb_tts.py

    finally:
        try:
            ws.close()
        except Exception:

ConalMullan wants to merge 2 commits into main from feat/sixtydb-tts-integration
