feat(tts): add Azure AI Speech provider by RayJiang4S · Pull Request #86 · calesthio/OpenMontage

RayJiang4S · 2026-05-22T05:20:41Z

Summary

Adds an Azure AI Speech text-to-speech provider for OpenMontage.

Highlights:

REST synthesis via Azure AI Speech with AZURE_SPEECH_KEY and AZURE_SPEECH_REGION
operation=list_voices for voice catalog discovery
SSML generation with voice, style, role, rate, pitch, volume, and sentence silence controls
Custom endpoint and custom voice deployment support
Optional SDK mode for word-boundary timing metadata when azure-cognitiveservices-speech is installed
TTS selector compatibility through the existing capability/provider registry
Documentation and provider contract coverage
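
For readers unfamiliar with the SSML controls and REST path listed above, here is a minimal sketch of how such a request could be assembled. It is not the code in tools/audio/azure_tts.py: build_ssml, build_request, and all parameter names are hypothetical. The element names, header names, endpoint host, and output format string follow Azure's published Speech REST and SSML documentation.

```python
import os
import urllib.request
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def build_ssml(text, voice, lang="en-US", style=None, role=None,
               rate=None, pitch=None, volume=None, sentence_silence_ms=None):
    """Return an SSML string; controls left as None produce no markup."""
    body = escape(text)  # escape &, <, > so user text cannot break the XML

    # Prosody controls wrap the text only when at least one is set.
    prosody = [("rate", rate), ("pitch", pitch), ("volume", volume)]
    attrs = " ".join(f"{k}={quoteattr(v)}" for k, v in prosody if v)
    if attrs:
        body = f"<prosody {attrs}>{body}</prosody>"

    # Style and role are both attributes of mstts:express-as.
    express = [("style", style), ("role", role)]
    attrs = " ".join(f"{k}={quoteattr(v)}" for k, v in express if v)
    if attrs:
        body = f"<mstts:express-as {attrs}>{body}</mstts:express-as>"

    # Sentence-boundary silence is declared once, at the start of <voice>.
    if sentence_silence_ms is not None:
        body = (f'<mstts:silence type="Sentenceboundary" '
                f'value="{int(sentence_silence_ms)}ms"/>{body}')

    return (f'<speak version="1.0" xmlns="{SSML_NS}" xmlns:mstts="{MSTTS_NS}" '
            f'xml:lang={quoteattr(lang)}>'
            f'<voice name={quoteattr(voice)}>{body}</voice></speak>')


def build_request(ssml, key=None, region=None, endpoint=None,
                  output_format="audio-24khz-48kbitrate-mono-mp3"):
    """Prepare (but do not send) the synthesis POST request."""
    key = key or os.environ["AZURE_SPEECH_KEY"]
    if endpoint is None:
        # Assumption: a custom endpoint, when given, replaces the regional URL.
        region = region or os.environ["AZURE_SPEECH_REGION"]
        endpoint = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "openmontage-azure-tts-sketch",  # hypothetical agent name
    }
    return urllib.request.Request(endpoint, data=ssml.encode("utf-8"),
                                  headers=headers, method="POST")
```

Passing the prepared request to urllib.request.urlopen would send it; on success the response body is the audio in the requested output format. Whether a given voice accepts a particular style or role depends on the voice, so the voice catalog is the place to check before setting them.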

Validation

Passed:

.venv/bin/python -m pytest -q tests/tools/test_azure_tts.py tests/contracts/test_phase3_contracts.py
# 71 passed

Also validated manually with a live Azure Speech resource:

provider status: available
operation=list_voices for zh-CN: 49 voices returned
direct SDK synthesis succeeded
SDK word-boundary metadata returned 13 boundaries / 11 words for a Chinese sample
tts_selector routed preferred_provider=azure to azure_tts and generated audio
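
The per-locale voice count above can be reproduced outside the provider with a sketch like the following. It assumes the documented voices/list REST endpoint and the Locale and ShortName fields of its JSON response. The function names are hypothetical, and the sample payload in the usage note is hand-written for illustration, not real service output.

```python
import json
import urllib.request


def voices_list_url(region):
    """Regional voice catalog URL for the Speech REST API."""
    return f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"


def fetch_voices(key, region, timeout=30):
    """Download the full catalog as a list of dicts (needs a live resource)."""
    req = urllib.request.Request(
        voices_list_url(region),
        headers={"Ocp-Apim-Subscription-Key": key},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def filter_by_locale(voices, locale):
    """Keep entries whose Locale matches, ignoring case."""
    wanted = locale.lower()
    return [v for v in voices if v.get("Locale", "").lower() == wanted]


# Hand-written sample with only the two fields used here.
SAMPLE_VOICES = [
    {"ShortName": "zh-CN-XiaoxiaoNeural", "Locale": "zh-CN"},
    {"ShortName": "zh-CN-YunxiNeural", "Locale": "zh-CN"},
    {"ShortName": "en-US-JennyNeural", "Locale": "en-US"},
]
```

With a real key and region, len(filter_by_locale(fetch_voices(key, region), "zh-CN")) gives the count for that locale; the number returned can change as the service adds or retires voices.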

Full-suite note:

.venv/bin/python -m pytest -q

The full run currently stops during collection in tests/qa/test_08_end_to_end.py because its fixture does not include the now-required render_runtime property for edit_decisions. That appears unrelated to this Azure provider change.

RayJiang4S · 2026-05-22T17:17:27Z

Updated this PR with a robustness fix from the multi-provider TTS audition workflow.

What changed:

Added operation: "preflight" to report Azure credential/region status, SDK availability, word-boundary intent, and fallback behavior before batch generation.
Added require_word_boundaries; when false, Azure now falls back to REST synthesis if enable_word_boundaries requested the optional SDK but the SDK is unavailable.
REST fallback records word_boundary_fallback: "rest_without_word_boundaries", empty words/boundaries, and a warning in output metadata so review pages can explain why timing metadata is missing while still producing audio.
Documented the fallback and preflight workflow in docs/PROVIDERS.md.
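
The fallback and preflight behavior described above can be sketched roughly as follows. This is an illustration, not the code in tools/audio/azure_tts.py. The function names, the "mode" key, and the preflight field names are hypothetical; only enable_word_boundaries, require_word_boundaries, the environment variable names, and the word_boundary_fallback value come from this PR. The sketch also assumes that require_word_boundaries implies the request for word boundaries.

```python
import importlib.util
import os

FALLBACK = "rest_without_word_boundaries"


def sdk_available():
    """True when the optional azure-cognitiveservices-speech package is importable."""
    try:
        return importlib.util.find_spec("azure.cognitiveservices.speech") is not None
    except ImportError:
        # Raised when a parent package such as "azure" is missing.
        return False


def plan_synthesis(enable_word_boundaries=False, require_word_boundaries=False,
                   has_sdk=None):
    """Pick the synthesis path and any fallback metadata to record."""
    has_sdk = sdk_available() if has_sdk is None else has_sdk
    wants_boundaries = enable_word_boundaries or require_word_boundaries
    if not wants_boundaries:
        return {"mode": "rest"}
    if has_sdk:
        return {"mode": "sdk"}
    if require_word_boundaries:
        # The caller said timing data is mandatory, so fail instead of degrading.
        raise RuntimeError("word boundaries required but the Speech SDK is not installed")
    return {
        "mode": "rest",
        "word_boundary_fallback": FALLBACK,
        "words": [],
        "boundaries": [],
        "warning": "Speech SDK unavailable; audio generated without word timing.",
    }


def preflight(env=None, enable_word_boundaries=False, require_word_boundaries=False,
              has_sdk=None):
    """Report readiness without calling the service."""
    env = os.environ if env is None else env
    has_sdk = sdk_available() if has_sdk is None else has_sdk
    wants_boundaries = enable_word_boundaries or require_word_boundaries
    will_fall_back = wants_boundaries and not has_sdk and not require_word_boundaries
    return {
        "credentials_present": bool(env.get("AZURE_SPEECH_KEY")),
        "region_present": bool(env.get("AZURE_SPEECH_REGION")),
        "sdk_available": has_sdk,
        "word_boundaries_requested": wants_boundaries,
        "fallback": FALLBACK if will_fall_back else None,
    }
```

Running a check like preflight once before a batch lets a caller surface a missing key or SDK up front, rather than discovering it on the first of many synthesis calls.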

Validation:

.venv/bin/python -m pytest tests/tools/test_azure_tts.py -q passed: 10 tests.
.venv/bin/python -m pytest tests/contracts/test_phase3_contracts.py -q passed: 63 tests.
.venv/bin/python -m py_compile tools/audio/azure_tts.py tests/tools/test_azure_tts.py passed.
git diff --check passed.

feat(tts): add Azure AI Speech provider

524ff5a

Add an Azure AI Speech TTS provider with REST synthesis, voice catalog discovery, SSML controls, custom endpoint support, and optional SDK word-boundary metadata. Document setup and include provider contract coverage.

RayJiang4S requested a review from calesthio as a code owner May 22, 2026 05:20

fix(tts): fall back when Azure SDK timing is unavailable

13177e0

Add Azure preflight metadata and REST fallback when word-boundary timing is requested but the optional SDK is missing, unless callers explicitly require word boundaries.

RayJiang4S wants to merge 2 commits into calesthio:main from RayJiang4S:ray/azure-ai-speech-tts-provider
