feat(tts): extend Doubao controls and caption rendering #71

Open: RayJiang4S wants to merge 2 commits into calesthio:main from RayJiang4S:codex/doubao-tts-controls

@RayJiang4S

Summary

This builds on the merged Doubao Speech provider by adding the controls we needed while producing longer Mandarin narration:

  • expose advanced Doubao request parameters such as model, emotion, emotion_scale, loudness_rate, mute-cut controls, pitch post-processing, and raw escape hatches for newly documented fields
  • document a production workflow for reducing theatrical delivery without rewriting approved narration
  • add caption overlay sizing options for Remotion explainers
  • preserve newlines in text cards and avoid fully transparent entrance frames
  • align Remotion metadata duration with rounded sequence frame math so paused playback lands on the final visual frame
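
The duration alignment in the last bullet can be illustrated with a small arithmetic sketch. It is written in Python for brevity and is not the code from this PR; the helper names (`js_round`, `sequence_frames`, `composition_frames`, `naive_total_frames`) and the sample scene lengths are hypothetical. It assumes each sequence's frame count is rounded individually, in which case rounding the total duration once can yield a composition that is one frame longer than the sequences it contains.

```python
import math


def js_round(x: float) -> int:
    """Round half up, mimicking JavaScript's Math.round for non-negative values."""
    return math.floor(x + 0.5)


def sequence_frames(durations_s: list[float], fps: int) -> list[int]:
    """Frame count of each sequence, rounded per sequence."""
    return [js_round(d * fps) for d in durations_s]


def composition_frames(durations_s: list[float], fps: int) -> int:
    """Composition length derived from the same rounded per-sequence frames."""
    return sum(sequence_frames(durations_s, fps))


def naive_total_frames(durations_s: list[float], fps: int) -> int:
    """Composition length from rounding the summed duration once (can drift)."""
    return js_round(sum(durations_s) * fps)


scenes = [1.24, 1.24, 1.24]  # hypothetical scene lengths in seconds
print(composition_frames(scenes, 30))  # 111
print(naive_total_frames(scenes, 30))  # 112
```

With these sample values the naive total is one frame longer than the sequences cover, so the last frame of the composition would show no sequence. Summing the rounded per-sequence counts keeps the final frame on actual content.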

Verification

  • python3 -m py_compile tools/audio/doubao_tts.py
  • git diff --check c2f83f1^ c2f83f1

Add advanced Doubao Speech parameters for production narration control.
Also expose caption sizing options and tighten Remotion duration/card rendering.
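
As a rough illustration of how optional controls and a raw escape hatch might be combined, here is a minimal sketch. It is not the implementation in `tools/audio/doubao_tts.py`: the function name `build_tts_controls`, the flat dictionary layout, and the example values are all assumptions. Only the field names `model`, `emotion`, `emotion_scale`, and `loudness_rate` come from the summary above; where they sit in the real Doubao request is not stated in this PR.

```python
from typing import Any, Optional


def build_tts_controls(
    *,
    model: Optional[str] = None,
    emotion: Optional[str] = None,
    emotion_scale: Optional[float] = None,
    loudness_rate: Optional[float] = None,
    raw_overrides: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Collect only the controls the caller set, then apply raw overrides.

    Hypothetical helper: the flat layout is an assumption, not the
    documented Doubao request schema.
    """
    named = {
        "model": model,
        "emotion": emotion,
        "emotion_scale": emotion_scale,
        "loudness_rate": loudness_rate,
    }
    # Omit unset controls so the service falls back to its own defaults.
    controls = {key: value for key, value in named.items() if value is not None}
    # Raw overrides go last so newly documented fields can be passed through
    # without a code change; they also win over the named controls.
    controls.update(raw_overrides or {})
    return controls


# Example values below are placeholders, not recommended settings.
print(build_tts_controls(emotion="neutral", emotion_scale=2))
# {'emotion': 'neutral', 'emotion_scale': 2}
```

Applying the raw overrides last is one possible design choice: it lets a caller experiment with a new field before a named parameter exists for it, at the cost of silently replacing a named control that uses the same key.
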
RayJiang4S requested a review from calesthio as a code owner on May 11, 2026, 04:09