feat: add MiniMax Cloud TTS provider#176

Open
octo-patch wants to merge 1 commit into joinly-ai:main from octo-patch:feature/add-minimax-tts

Summary

This PR adds MiniMax Cloud TTS as a first-class text-to-speech provider for joinly, complementing the existing Kokoro, ElevenLabs, and Deepgram options.

What changed

  • joinly/services/tts/minimax.py: MinimaxTTS class using the MiniMax t2a_v2 API with PCM output, so audio is consumed directly by the pipeline without additional decoding
  • README.md: Added --tts minimax to the Providers section; updated the TTS feature bullet
  • tests/services/tts/test_minimax.py: 19 tests (7 init tests, 9 mocked stream tests, 3 live integration tests)

Usage

# Set your MiniMax API key
export MINIMAX_API_KEY=your-key

# Use MiniMax TTS with default settings (speech-02-hd, English_Graceful_Lady voice)
--tts minimax

# Choose a different model or voice
--tts minimax --tts-arg model=speech-02-turbo --tts-arg voice_id=English_Persuasive_Man

Key design choices

  • Requests PCM format (audio_setting.format: pcm) so joinly receives raw 16-bit samples — no MP3 decoding needed
  • Default sample rate 32 000 Hz (native MiniMax PCM output rate)
  • Voice auto-selected per JOINLY_LANGUAGE (English, Chinese, German, French, Spanish, Japanese, Korean)
  • Follows the same aiohttp + AsyncIterator[bytes] pattern as ResembleTTS
  • Requires MINIMAX_API_KEY environment variable; raises ValueError on startup if missing
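
The design choices above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code in joinly/services/tts/minimax.py: the helper names (`require_api_key`, `build_payload`, `iter_pcm_chunks`) are hypothetical, and the placement of `voice_id` and `sample_rate` in the payload, the mono channel count, and the 20 ms chunk size are assumptions. Only the `audio_setting.format: pcm` field, the 32 000 Hz rate, the 16-bit sample width, the default model and voice, and the ValueError on a missing key come from this PR description.

```python
import os
from typing import Iterator, Mapping, Optional

SAMPLE_RATE = 32_000   # Hz, default rate stated in this PR
BYTES_PER_SAMPLE = 2   # 16-bit PCM samples
CHANNELS = 1           # assumption: mono output


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return MINIMAX_API_KEY or raise ValueError, mirroring the startup guard."""
    env = os.environ if env is None else env
    key = env.get("MINIMAX_API_KEY")
    if not key:
        raise ValueError("MINIMAX_API_KEY is not set")
    return key


def build_payload(
    text: str,
    model: str = "speech-02-hd",
    voice_id: str = "English_Graceful_Lady",
) -> dict:
    """Build a request body that asks for raw PCM (field layout partly assumed)."""
    return {
        "model": model,
        "text": text,
        "voice_setting": {"voice_id": voice_id},  # assumed key placement
        "audio_setting": {"format": "pcm", "sample_rate": SAMPLE_RATE},
    }


def iter_pcm_chunks(pcm: bytes, chunk_ms: int = 20) -> Iterator[bytes]:
    """Split raw PCM into fixed-duration chunks; the last chunk may be shorter."""
    size = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

Under these assumptions, one second of audio is 32 000 × 2 = 64 000 bytes, so a 20 ms chunk is 1 280 bytes and one second splits into 50 chunks. Because the samples arrive as raw PCM, chunking is plain byte slicing with no decoder state to carry between chunks.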

Test plan

  • 7 unit tests for __init__ (API key guard, audio_format, voice defaults, model selection)
  • 9 stream tests: happy path, chunking, HTTP error, API error, empty audio, network error, payload structure, voice_setting structure, auth header
  • 3 live integration tests (skipped when MINIMAX_API_KEY unset, all pass with real key)

tests/services/tts/test_minimax.py - 19 passed
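
For readers unfamiliar with the skip-when-unset pattern in the test plan, here is a small sketch. It is not taken from tests/services/tts/test_minimax.py: `FakeMinimaxTTS` is a hypothetical stand-in that models only the API key guard, and the standard-library `unittest` module is used here so the example runs without extra dependencies, whatever test framework the PR itself uses.

```python
import os
import unittest


class FakeMinimaxTTS:
    """Hypothetical stand-in for MinimaxTTS; only the API key guard is modelled."""

    def __init__(self, env=None):
        env = os.environ if env is None else env
        self.api_key = env.get("MINIMAX_API_KEY")
        if not self.api_key:
            raise ValueError("MINIMAX_API_KEY is required")


class InitTests(unittest.TestCase):
    def test_missing_key_raises(self):
        # An empty environment must trigger the guard.
        with self.assertRaises(ValueError):
            FakeMinimaxTTS(env={})

    def test_key_is_stored(self):
        tts = FakeMinimaxTTS(env={"MINIMAX_API_KEY": "test-key"})
        self.assertEqual(tts.api_key, "test-key")


@unittest.skipUnless(os.environ.get("MINIMAX_API_KEY"), "MINIMAX_API_KEY unset")
class LiveTests(unittest.TestCase):
    def test_constructs_with_real_key(self):
        # Runs only when a real key is exported; skipped otherwise.
        self.assertTrue(FakeMinimaxTTS().api_key)


def run_suite() -> unittest.TestResult:
    """Load both test classes and run them, returning the result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(InitTests))
    suite.addTests(loader.loadTestsFromTestCase(LiveTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Passing the environment in as a mapping keeps the init tests independent of whatever is exported in the shell, while the live class is gated on the real variable so the suite still passes on machines without a key.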

Add MinimaxTTS class (joinly/services/tts/minimax.py) using the MiniMax
t2a_v2 API with PCM output. The provider integrates naturally with
joinly's plug-in architecture: set MINIMAX_API_KEY and pass --tts minimax
to use it.

- Supports speech-02-hd (default) and speech-02-turbo models
- Auto-selects voice per session language; all voices configurable
- 19 unit + integration tests covering init, stream, error paths