The current Vosk models provide decent recognition but fall noticeably behind modern alternatives in accuracy, especially for accents and noisy environments.
Request: Add support for whisper.cpp as an alternative speech recognition backend.
Why Whisper?
Significantly better accuracy than Vosk, especially with the Whisper small, medium and large models
whisper.cpp is optimised for on-device inference and runs well on mobile
Already proven on Android via FUTO Voice Input
Multi-language support built-in
This could be implemented as:
A separate backend option alongside Vosk
Or a complete migration to whisper.cpp
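For the first option, here is a rough sketch of what a pluggable backend layer might look like. It is written in Python purely for illustration and is not based on this project's code. Every name in it (`SpeechBackend`, `VoskBackend`, `WhisperCppBackend`, `make_backend`) is hypothetical, and the `transcribe` bodies are stubs that do not call Vosk or whisper.cpp.

```python
from abc import ABC, abstractmethod


class SpeechBackend(ABC):
    """Hypothetical interface that every recognition engine would implement."""

    @abstractmethod
    def transcribe(self, pcm16: bytes, sample_rate: int) -> str:
        """Turn raw 16-bit PCM audio into text."""


class VoskBackend(SpeechBackend):
    def transcribe(self, pcm16: bytes, sample_rate: int) -> str:
        # Stub: a real implementation would hand the audio to Vosk here.
        return f"[vosk stub] {len(pcm16)} bytes at {sample_rate} Hz"


class WhisperCppBackend(SpeechBackend):
    def transcribe(self, pcm16: bytes, sample_rate: int) -> str:
        # Stub: a real implementation would hand the audio to whisper.cpp here.
        return f"[whisper.cpp stub] {len(pcm16)} bytes at {sample_rate} Hz"


# Registry keyed by the value a user would pick in settings (assumed names).
BACKENDS = {
    "vosk": VoskBackend,
    "whisper.cpp": WhisperCppBackend,
}


def make_backend(name: str) -> SpeechBackend:
    """Create the backend selected by name, or fail loudly on a typo."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None


if __name__ == "__main__":
    backend = make_backend("whisper.cpp")
    print(backend.transcribe(b"\x00\x00" * 8000, 16000))
```

With a layer like this, the rest of the app would only talk to `SpeechBackend`, so Vosk could stay the default and a full migration would remain possible later without touching the callers.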
Related: #97 (Moonshine request)
Would love to hear if this is feasible or if there are blockers. Happy to help test.