feat(dashscope): Alibaba Bailian DashScope module — CosyVoice TTS + voice cloning + Wanx#17
Merged
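The CF Pages deployment API (error code 8000111) rejects commit messages that contain newlines. By default wrangler sends the full subject + body of git log -1, so any commit with a body fails to deploy. The "not valid UTF-8" error text is misleading: what is actually rejected is the newline character. Added a full write-up as Gotcha 3 plus two fixes (pass --commit-message "$(git log -1 --pretty=%s)" manually, or wrap it in a project-level scripts/deploy.sh). The error debugging cheat sheet also gains 8000111 + 8000007. Found while deploying the classics-learning project: single-line Chinese commit messages deployed fine, and the failure only surfaced because this commit carried a body.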
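The second fix above (a wrapper script) could also be sketched in Python. This is a minimal illustration, not the project's actual `scripts/deploy.sh`: the helper names and the `dist` output directory are hypothetical, and it assumes `wrangler pages deploy` accepts `--commit-message` as used in the commit message above.

```python
import subprocess


def first_line(message: str) -> str:
    """Return the first non-blank line of a commit message, stripped."""
    for line in message.splitlines():
        if line.strip():
            return line.strip()
    return ""


def deploy_command(out_dir: str, message: str) -> list[str]:
    """Build a wrangler argv whose commit message is always a single line."""
    return ["wrangler", "pages", "deploy", out_dir,
            "--commit-message", first_line(message)]


def last_commit_subject() -> str:
    """Ask git for the subject only (equivalent to `git log -1 --pretty=%s`)."""
    done = subprocess.run(["git", "log", "-1", "--pretty=%s"],
                          check=True, capture_output=True, text=True)
    return done.stdout.strip()
```

A caller would run something like `subprocess.run(deploy_command("dist", last_commit_subject()), check=True)`. Passing the message through `first_line` as well means a multi-line string from any other source is still reduced to its subject.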
… + voice cloning + Wanx

Adds `library/dashscope/` covering the non-LLM half of Alibaba's DashScope platform. Qwen LLM is intentionally NOT in scope here: it lives in the separate `qwen` module, which targets the OpenAI-compatible chat-completions front (same physical API key, different module per SPEC §0 flat-no-inheritance rule).

Live verification (2026-05-17), tier `partial`
- Installed dashscope==1.25.18 in a tmp venv
- DASHSCOPE_API_KEY (sk-...) sourced from ~/.trove/dashscope/credentials.json
- Submitted CosyVoice v3-flash TTS task via official Python SDK (`dashscope.audio.tts_v2.SpeechSynthesizer`) with voice=longxing_v3, text="trove smoke 你好"
- WebSocket connection opened to wss://dashscope.aliyuncs.com/api-ws/v1/inference
- Bearer header accepted; run-task event acknowledged; task_id issued
- Runtime then blocked: `task-failed` event with `error_code: "Arrearage"`, message `Access denied, please make sure your account is in good standing.` (account balance ≤ 0)
- Auth + endpoint + request schema verified end to end; only the billing gate stops actual audio bytes. Same partial-tier shape as the existing kling module entry

Module shape (10 Critical Constraints up front, gotchas-first)
1. Account funding gate (Arrearage discovered during smoke)
2. CosyVoice is WebSocket-only, not REST sync. This debunks the confusing REST-style examples in older docs, which were CosyVoice v1 batch
3. Region split (China dashscope.aliyuncs.com vs intl dashscope-intl.aliyuncs.com) with misleading InvalidApiKey error
4. Auth header is `Authorization: Bearer`, NOT legacy `X-DashScope-API-Key:` (pre-2024 form rejected on v3)
5. Voice ID is model-version-locked (longxing_v3 only on v3-*; longxiaochun only on v2)
6. Character billing counts full-width punctuation; cost meter is in `usage.input_tokens` (misleading field name for a TTS product)
7. Voice cloning is free; only synthesis charges
8. Watermark setting is account-wide (console), not per-call
9. Wanx image gen uses task-poll pattern (different from CosyVoice's WebSocket pattern), same auth
10. No official Node SDK; raw WebSocket for non-Python runtimes

Body sections
- Setup (`pip install 'dashscope>=1.25'`)
- Quickstart: CosyVoice TTS via Python SDK with usage.input_tokens inspection
- Voice catalogue (representative 7 voices, generic gender/style framing, link to full 150+ list; the earlier private fork's "短剧专用 (short drama)" framing genericized)
- Voice cloning recipe (VoiceEnrollmentService → custom voice_id → reuse in any SpeechSynthesizer call)
- Wanx image gen example (ImageSynthesis.call, presigned 24h URL)
- Raw WebSocket sketch for Node/Edge/Deno (no official SDK exists)
- Cost estimation table (TTS/clone/Wanx prices, char-counting rule)
- 6-row error reference incl. the Arrearage we discovered live
- Cross-module pointer to `qwen` for LLM access via the same key
- Source of truth (9 upstream URLs + lastmod)

Library bookkeeping
- 20 → 21 modules
- 5 prod · 14 verified · 1 partial → 5 prod · 14 verified · 2 partial (dashscope joins kling at the partial tier)
- Site module grid: dashscope added under media-generation alongside seedance / seedream / kling / fal-ai

Privacy
- The earlier private fork of this module had `Moment Stream` and `ADR-0021` references in the description (maintainer's downstream project + internal architecture decision record). Both stripped before OSS commit. Description reframed around the actual user-facing scope: CosyVoice TTS + voice cloning + Wanx image gen
- Pre-commit hook PRIVATE_RE scan: clean on staged diff

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
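Constraints 3, 4 and 10 (region split, Bearer auth, raw WebSocket for non-SDK runtimes) can be illustrated with a few pure helpers. This is a rough sketch, not the module's actual WebSocket code. The China URL is the one from the smoke test; the intl URL assumes the same path on the intl host. The `header` / `payload` envelope and its field names are assumptions for illustration and should be checked against the upstream protocol docs.

```python
import json
import uuid

# Hosts as named in constraint 3.
HOSTS = {
    "cn": "dashscope.aliyuncs.com",
    "intl": "dashscope-intl.aliyuncs.com",
}


def ws_url(region: str = "cn") -> str:
    """WebSocket endpoint for a region (the intl path is an assumption)."""
    return f"wss://{HOSTS[region]}/api-ws/v1/inference"


def auth_headers(api_key: str) -> dict:
    """Bearer auth per constraint 4; the legacy X-DashScope-API-Key is not sent."""
    return {"Authorization": f"Bearer {api_key}"}


def run_task_message(model: str, voice: str) -> str:
    """Serialize a run-task event; the envelope shape here is assumed."""
    message = {
        "header": {"action": "run-task", "task_id": uuid.uuid4().hex},
        "payload": {"model": model, "parameters": {"voice": voice}},
    }
    return json.dumps(message)
```

Keeping the region in one lookup table makes a region mismatch easy to rule out when the server answers with the misleading InvalidApiKey error.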
Summary
Adds `library/dashscope/` covering the non-LLM half of Alibaba's DashScope platform — CosyVoice TTS, voice cloning, Wanx image gen. Qwen LLM continues to live in the separate `qwen` module (same physical API key, different module per SPEC §0).
Live verification — tier `partial`
Same tier as kling: auth + contract verified, runtime blocked by billing. `last_verified` records this faithfully.
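To separate "contract verified, blocked by billing" from a real failure, a caller could classify the failure event along these lines. This is a sketch only: the `task-failed` event name and the `Arrearage` and `InvalidApiKey` codes come from this PR, while the nesting under a `header` key and the returned labels are assumptions made for illustration.

```python
def classify_event(event: dict) -> str:
    """Map a server event to a coarse outcome label (labels are hypothetical)."""
    # Assumed shape: fields may sit at the top level or under "header".
    header = event.get("header", {})
    name = header.get("event") or event.get("event")
    code = header.get("error_code") or event.get("error_code")

    if name != "task-failed":
        return "ok"
    if code == "Arrearage":
        # Auth and schema were accepted; the account balance blocks synthesis.
        return "billing"
    if code == "InvalidApiKey":
        # May also indicate a China/intl region mismatch, not only a bad key.
        return "auth-or-region"
    return "failed"
```

A smoke test that gets `"billing"` back can record the `partial` tier instead of treating the run as a contract failure.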
Module highlights
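10 Critical Constraints (gotchas-first per SPEC §2.1):
1. Account funding gate (Arrearage, discovered during the smoke test)
2. CosyVoice is WebSocket-only, not REST sync
3. Region split (China vs intl hosts) with a misleading InvalidApiKey error
4. Auth header is `Authorization: Bearer`, not the legacy `X-DashScope-API-Key:`
5. Voice ID is model-version-locked
6. Character billing counts full-width punctuation; the cost meter is `usage.input_tokens`
7. Voice cloning is free; only synthesis charges
8. Watermark setting is account-wide (console), not per-call
9. Wanx image gen uses a task-poll pattern, same auth
10. No official Node SDK; raw WebSocket for non-Python runtimes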
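The voice/model version lock (constraint 5 in the commit message: longxing_v3 only on v3-*, longxiaochun only on v2) lends itself to a pre-flight check. The sketch below is illustrative: the two-entry table covers only the voices named in this PR, and the model name strings such as `cosyvoice-v3-flash` are assumptions about the naming scheme.

```python
import re
from typing import Optional

# Only the two voices named in this PR; a real table would be much longer.
VOICE_FAMILY = {
    "longxing_v3": "v3",
    "longxiaochun": "v2",
}


def model_family(model: str) -> Optional[str]:
    """Extract 'v2' / 'v3' from a model name like 'cosyvoice-v3-flash'."""
    match = re.search(r"-v(\d+)", model)
    return f"v{match.group(1)}" if match else None


def voice_matches_model(model: str, voice: str) -> Optional[bool]:
    """True/False for known voices, None when the voice is not in the table."""
    expected = VOICE_FAMILY.get(voice)
    if expected is None:
        return None
    return model_family(model) == expected
```

Returning `None` for unknown voices keeps the check from rejecting voices the table simply does not list yet.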
Sections: Setup, Python SDK quickstart, voice catalogue (representative 7 voices, generic framing, link to full 150+), voice cloning recipe, Wanx image gen example, raw WebSocket sketch for Node/Edge, cost table, 6-row error reference, cross-module pointer to `qwen`, source-of-truth URLs.
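The Wanx example relies on a task-poll pattern rather than CosyVoice's WebSocket stream. A generic version of that loop might look like the sketch below. `fetch_status` is a hypothetical callable standing in for whatever SDK or HTTP call returns the task record, and the `task_status` field and its status strings are assumptions to verify against the upstream docs.

```python
import time
from typing import Callable

# Assumed terminal states; anything else is treated as still in progress.
TERMINAL = {"SUCCEEDED", "FAILED", "CANCELED"}


def poll_task(fetch_status: Callable[[], dict],
              interval_s: float = 2.0,
              max_attempts: int = 30,
              sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the task reaches a terminal state or we give up."""
    for attempt in range(max_attempts):
        record = fetch_status()
        if record.get("task_status") in TERMINAL:
            return record
        # Skip the pause after the final attempt; we are about to give up.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task not finished after {max_attempts} attempts")
```

Injecting `sleep` keeps the loop testable without real delays. Because the result URL is presigned for 24 hours, a caller would normally download the image soon after the loop returns.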
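Bookkeeping
- 20 → 21 modules
- 5 prod · 14 verified · 1 partial → 5 prod · 14 verified · 2 partial (dashscope joins kling at the partial tier)
- Site module grid: dashscope added under media-generation alongside seedance / seedream / kling / fal-ai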
Privacy
Earlier private-fork of this module body had `Moment Stream` + `ADR-0021` references (maintainer's downstream project + internal ADR). Both stripped. Pre-commit hook scan: clean on staged diff.
🤖 Generated with Claude Code