feat(tts): add segment-level narration audition lab #72

Open
RayJiang4S wants to merge 10 commits into calesthio:main from RayJiang4S:codex/tts-segment-lab

@RayJiang4S

Summary

  • add tts_segment_lab, a segment-level narration audition tool that runs before final TTS asset generation
  • support dry runs, provider-routed generation through tts_selector, reference audio comparison, and selection.json output
  • document the workflow in a new core skill and wire it into the explainer script/asset director guidance

Why

OpenMontage already has tts_selector and sample-preview guidance, but there was no structured way to compare multiple voice/parameter/provider candidates for sensitive narration lines or keep the approved choice available for downstream asset generation. This tool makes that process explicit and repeatable.

Testing

  • /Users/ray/Repositories/persional/video-ai-lab/tools/OpenMontage/.venv/bin/python -m pytest tests/tools/test_tts_segment_lab.py tests/contracts/test_phase3_contracts.py -q

Add a TTS Segment Lab tool for auditioning narration sections before final asset generation. The tool supports dry runs, provider-routed sample generation through tts_selector, reference audio comparison, and selection manifests for downstream reuse.
RayJiang4S requested a review from calesthio as a code owner on May 12, 2026, 06:06
Support repeated TTS audition review cycles until all segments are approved. Defer final selection output while follow-up generation is still pending, and auto-load local .env files for provider credentials without exposing secret values.
@RayJiang4S

Author

Added a screenshot of the new compare.html review UI.

The latest update also makes the review loop iterative rather than assuming a fixed second round: annotate now reports review_complete and next_operation, apply_review can be repeated until every segment has an approved take, and selection.json is only written after the review is complete.
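
As a rough illustration of that gating (a sketch only, not the tool's actual code): the `Segment` class, its `approved_take` field, the `pending_segments` key, and the `summarize_review` / `write_selection_if_complete` helpers are hypothetical names chosen for this example. Only `review_complete`, `next_operation`, `apply_review`, and `selection.json` come from this PR, and the values the real tool reports for `next_operation` may differ.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Segment:
    # Hypothetical shape: one narration segment and the take a reviewer approved, if any.
    segment_id: str
    approved_take: Optional[str] = None


def summarize_review(segments):
    """Report whether every segment has an approved take and what to run next."""
    pending = [s.segment_id for s in segments if s.approved_take is None]
    complete = not pending
    return {
        "review_complete": complete,
        # Assumed value: the real tool may name the follow-up operation differently.
        "next_operation": None if complete else "apply_review",
        "pending_segments": pending,
    }


def write_selection_if_complete(segments, out_dir):
    """Write selection.json only once the review is complete; otherwise write nothing."""
    status = summarize_review(segments)
    if not status["review_complete"]:
        return None
    path = Path(out_dir) / "selection.json"
    payload = {s.segment_id: s.approved_take for s in segments}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path
```

The point of the design is that a half-reviewed run never leaves a `selection.json` behind for downstream asset generation to pick up by accident.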

It also safely auto-loads local .env files for provider credentials and records only the loaded file paths, never secret values.
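
A minimal sketch of that idea, assuming a plain `KEY=VALUE` file format and a policy where variables already set in the environment take precedence. The `load_env_files` helper and the `env_files_loaded` key are hypothetical and not taken from the PR.

```python
import os
from pathlib import Path


def load_env_files(paths, environ=None):
    """Load KEY=VALUE pairs from existing .env files and report only the file paths."""
    environ = os.environ if environ is None else environ
    loaded = []
    for raw in paths:
        path = Path(raw)
        if not path.is_file():
            continue  # Missing files are skipped silently.
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            if key.startswith("export "):
                key = key[len("export "):].strip()
            value = value.strip().strip("'\"")
            # Assumption: variables already present in the environment win.
            environ.setdefault(key, value)
        loaded.append(str(path))
    # The report carries paths only; keys and values never leave this function.
    return {"env_files_loaded": loaded}
```

Because the returned report contains only paths, it can be written into run metadata or logs without leaking provider credentials.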

[Screenshot: TTS Segment Lab review UI]

Surface failed candidate errors directly in comparison pages and add a merge_retry operation so successful retry runs can refresh the original results and compare page instead of leaving users on a stale audition page.
@RayJiang4S

Author

Update from the v4 production retrospective:

  • Surface failed generated candidates directly in compare.html with provider error text and attempt count, so missing audio is not mistaken for a usable provider sample.
  • Add merge_retry operation to merge successful retry/follow-up results back into the original results.json, review.md, and compare.html.
  • Document the retry merge workflow in skills/core/tts-segment-lab.md.
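
The merge step described above could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: the candidate dict keys (`candidate_id`, `status`, `audio_path`, `error`, `attempts`) and the standalone `merge_retry` function signature are hypothetical, and the real operation also refreshes `review.md` and `compare.html`, which is not shown here.

```python
def merge_retry(original, retry):
    """Return a copy of `original` candidates with successful retry results swapped in.

    Both arguments are lists of dicts with hypothetical keys:
    candidate_id, status ("ok" or "failed"), audio_path, error, attempts.
    """
    successes = {c["candidate_id"]: c for c in retry if c.get("status") == "ok"}
    merged = []
    for cand in original:
        replacement = successes.get(cand["candidate_id"])
        if cand.get("status") == "failed" and replacement is not None:
            updated = dict(replacement)
            # Keep a running attempt count across the original run and the retry.
            updated["attempts"] = cand.get("attempts", 1) + replacement.get("attempts", 1)
            merged.append(updated)
        else:
            # Successful originals and still-failing candidates are kept unchanged.
            merged.append(dict(cand))
    return merged
```

Only failed originals are replaced, so a retry run can never overwrite a take that a reviewer may already have listened to.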

Validation:

  • .venv/bin/python -m pytest tests/tools/test_tts_segment_lab.py -q -> 21 passed
  • .venv/bin/python -m pytest tests/tools/test_tts_segment_lab.py tests/contracts/test_phase3_contracts.py -q -> 84 passed
  • .venv/bin/python -m py_compile tools/audio/tts_segment_lab.py tests/tools/test_tts_segment_lab.py

Keep TTS segment lab spoken text isolated from delivery guidance and block prompt-like retry notes before generation.