Feature Request
Add FunASR/SenseVoice as an alternative to Whisper for transcription.
Why
- SenseVoice (234M params): 50+ languages, 5x faster than Whisper-small, non-autoregressive (no hallucination)
- Built-in speaker diarization (cam++): No need for separate diarization pipeline
- Complete pipeline in one package: VAD + ASR + punctuation + timestamps + speaker diarization
- OpenAI-compatible API: funasr-server serves POST /v1/audio/transcriptions as a drop-in replacement
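If the endpoint mirrors OpenAI's transcription API as described, a client call could look roughly like the sketch below. It uses only the Python standard library. The base URL `http://localhost:8000` and the helper names `build_transcription_request` and `transcribe` are hypothetical, and the `file` and `model` form fields plus the `text` response key are taken from OpenAI's API, not from funasr-server documentation.

```python
import json
import os
import urllib.request
import uuid


def build_transcription_request(base_url, audio_bytes, filename="audio.wav",
                                model="iic/SenseVoiceSmall"):
    """Build a multipart/form-data POST for /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Text field: model name (assumed to be accepted by the server)
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="model"\r\n\r\n',
        model.encode() + b"\r\n",
        # File field: raw audio bytes
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes + b"\r\n",
        # Closing boundary
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(base_url, path):
    """Send an audio file and return the 'text' field of the JSON reply."""
    with open(path, "rb") as f:
        req = build_transcription_request(
            base_url, f.read(), filename=os.path.basename(path)
        )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

With a server running locally, `transcribe("http://localhost:8000", "audio.wav")` would return the transcript string, assuming the response follows OpenAI's default JSON shape.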
Quick start
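pip install funasr
# Load SenseVoice-Small; weights are downloaded on first use
from funasr import AutoModel
model = AutoModel(model="iic/SenseVoiceSmall")
# generate() returns a list with one result dict per input
result = model.generate(input="audio.wav")
print(result[0]["text"])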
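SenseVoice transcripts usually carry inline tags of the form `<|...|>` (language, emotion, event) ahead of the text. FunASR ships `rich_transcription_postprocess` in `funasr.utils.postprocess_utils` for cleaning these up. The sketch below is a dependency-free approximation for callers that only want plain text. The helper names `strip_tags` and `first_transcript` are hypothetical, the sample result is illustrative, not real model output, and the library helper may do more than this regex.

```python
import re

# Matches SenseVoice-style inline tags such as <|en|> or <|NEUTRAL|>
_TAG = re.compile(r"<\|[^|]*\|>")


def strip_tags(text):
    """Remove <|...|> tags and surrounding whitespace."""
    return _TAG.sub("", text).strip()


def first_transcript(result):
    """Return cleaned text from the first entry of a generate()-style result."""
    if not result:
        return ""
    return strip_tags(result[0].get("text", ""))


# Illustrative input shaped like a generate() result (not real output)
sample = [{"key": "audio", "text": "<|en|><|NEUTRAL|><|Speech|>hello world"}]
print(first_transcript(sample))  # prints "hello world"
```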