A speech-to-text application that captures audio from your desktop or microphone, transcribes it, and injects the resulting text into the active text field.
- Real-time Audio Capture: Capture audio from desktop or microphone
- Speech Recognition: Convert speech to text with high accuracy
- Text Injection: Automatically inject transcribed text into any active text field
- Cross-Platform: Works on macOS, Windows, and Linux
- Configurable: Customize audio settings, hotkeys, and more
- Robust Error Handling: Comprehensive logging and fallback mechanisms
- Clone the repository:
    git clone https://github.com/yourusername/whisper-clone.git
    cd whisper-clone
- Install dependencies:
    npm install
- Build the application:
    npm run build
- Start the application:
    npm start
- Launch the application
- Use the configured hotkey (default: Alt+Space) to start listening
- Speak clearly into your microphone
- The application transcribes your speech and injects the text into the active text field
- Use the configured hotkey again to stop listening
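The start/stop behavior in the steps above can be pictured as a two-state toggle. The following is a minimal sketch, not the application's actual code: `ListeningToggle`, `press`, and the two callbacks are hypothetical names, and in an Electron app `press` would presumably be called from a global shortcut handler.

```typescript
// Hypothetical sketch: a single hotkey toggles between idle and listening.
type ListeningState = "idle" | "listening";

class ListeningToggle {
  private state: ListeningState = "idle";

  constructor(
    private readonly onStart: () => void,
    private readonly onStop: () => void,
  ) {}

  // Call this each time the configured hotkey (default: Alt+Space) fires.
  press(): ListeningState {
    if (this.state === "idle") {
      this.state = "listening";
      this.onStart();
    } else {
      this.state = "idle";
      this.onStop();
    }
    return this.state;
  }

  get current(): ListeningState {
    return this.state;
  }
}

// Usage: record which callbacks fire across two hotkey presses.
const hotkeyEvents: string[] = [];
const listeningToggle = new ListeningToggle(
  () => { hotkeyEvents.push("start"); },
  () => { hotkeyEvents.push("stop"); },
);
listeningToggle.press(); // begins listening
listeningToggle.press(); // stops listening
```

Keeping the state in one place means the same hotkey can never start two capture sessions at once.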
You can configure the application through the settings menu:
- Audio Settings: Configure noise reduction, echo cancellation, and auto gain control
- Hotkeys: Customize hotkeys for starting and stopping listening
- Notifications: Enable or disable notifications and sounds
- Text Injection: Configure delay and behavior for text injection
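To make the settings groups above concrete, here is one way they could be modeled in TypeScript. This is a sketch under assumptions: the field names, the default values (other than the Alt+Space hotkey), and the helpers `mergeSettings` and `toAudioConstraints` are all hypothetical and may not match the application's real settings schema.

```typescript
// Hypothetical settings shape mirroring the four groups in the settings menu.
interface AppSettings {
  audio: {
    noiseReduction: boolean;
    echoCancellation: boolean;
    autoGainControl: boolean;
  };
  hotkeys: { toggleListening: string };
  notifications: { enabled: boolean; sounds: boolean };
  textInjection: { delayMs: number };
}

// Assumed defaults; only the Alt+Space hotkey is stated in this README.
const defaultSettings: AppSettings = {
  audio: { noiseReduction: true, echoCancellation: true, autoGainControl: true },
  hotkeys: { toggleListening: "Alt+Space" },
  notifications: { enabled: true, sounds: true },
  textInjection: { delayMs: 50 },
};

// Each group may be overridden partially.
type SettingsOverrides = { [K in keyof AppSettings]?: Partial<AppSettings[K]> };

// Merge user overrides onto the defaults, one group at a time.
function mergeSettings(overrides: SettingsOverrides): AppSettings {
  return {
    audio: { ...defaultSettings.audio, ...overrides.audio },
    hotkeys: { ...defaultSettings.hotkeys, ...overrides.hotkeys },
    notifications: { ...defaultSettings.notifications, ...overrides.notifications },
    textInjection: { ...defaultSettings.textInjection, ...overrides.textInjection },
  };
}

// Map the audio group onto the standard MediaTrackConstraints property names
// accepted by getUserMedia (noise reduction is called noiseSuppression there).
function toAudioConstraints(audio: AppSettings["audio"]): {
  noiseSuppression: boolean;
  echoCancellation: boolean;
  autoGainControl: boolean;
} {
  return {
    noiseSuppression: audio.noiseReduction,
    echoCancellation: audio.echoCancellation,
    autoGainControl: audio.autoGainControl,
  };
}

// Usage: disable noise reduction and lengthen the injection delay.
const customSettings = mergeSettings({
  audio: { noiseReduction: false },
  textInjection: { delayMs: 120 },
});
```

Merging per group keeps untouched options at their defaults, so a settings file only needs to list what the user changed.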
The application is built with Electron and TypeScript, using a modular architecture:
- Audio Capture: Captures audio from desktop or microphone
- Audio Processing: Processes audio for optimal speech recognition
- Speech Recognition: Converts audio to text
- Text Injection: Injects text into active text fields
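The four modules above form a pipeline, which can be sketched with one interface per stage. Everything here is illustrative: the interface names, method signatures, the `Float32Array` chunk type, and the peak-normalizing processor are assumptions, not the project's actual module boundaries.

```typescript
// Hypothetical stage interfaces: names and signatures are assumptions
// made for illustration, not the project's real API.
type AudioChunk = Float32Array;

interface AudioProcessor {
  process(chunk: AudioChunk): AudioChunk;
}

interface SpeechRecognizer {
  recognize(chunk: AudioChunk): Promise<string>;
}

interface TextInjector {
  inject(text: string): Promise<void>;
}

// Wires the stages together: process -> recognize -> inject.
// The audio capture stage is assumed to call handleChunk for each chunk.
class TranscriptionPipeline {
  constructor(
    private readonly processor: AudioProcessor,
    private readonly recognizer: SpeechRecognizer,
    private readonly injector: TextInjector,
  ) {}

  async handleChunk(chunk: AudioChunk): Promise<string> {
    const cleaned = this.processor.process(chunk);
    const text = (await this.recognizer.recognize(cleaned)).trim();
    if (text.length > 0) {
      // Only inject when something was actually recognized.
      await this.injector.inject(text);
    }
    return text;
  }
}

// Example processing stage: scale the chunk so its peak amplitude is 1.
const peakNormalizer: AudioProcessor = {
  process(chunk: AudioChunk): AudioChunk {
    let peak = 0;
    chunk.forEach((sample) => {
      peak = Math.max(peak, Math.abs(sample));
    });
    return peak === 0 ? chunk : chunk.map((sample) => sample / peak);
  },
};
```

Because each stage sits behind an interface, a recognizer or injector can be swapped (for example, for a stub in tests) without touching the other stages.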
If you encounter issues:
- Check the logs in the application's user data directory
- Ensure microphone permissions are granted
- Try restarting the application
- If desktop audio capture fails, the application will automatically fall back to microphone-only mode
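The automatic fallback in the last item could look roughly like the sketch below. It assumes that the normal mode captures both desktop and microphone audio and that the microphone is required in either mode; `startCaptureWithFallback`, the mode names, and the injected `startDesktop`, `startMicrophone`, and `log` functions are hypothetical.

```typescript
// Assumed capture modes; the README only names "microphone-only".
type CaptureMode = "desktop-and-microphone" | "microphone-only";

// Hypothetical sketch: try desktop capture, fall back to microphone only.
async function startCaptureWithFallback(
  startDesktop: () => Promise<void>,
  startMicrophone: () => Promise<void>,
  log: (message: string) => void,
): Promise<CaptureMode> {
  // The microphone is assumed to be needed in both modes, so a failure
  // here is not recoverable and propagates to the caller.
  await startMicrophone();

  try {
    await startDesktop();
    return "desktop-and-microphone";
  } catch (error) {
    // Desktop capture is optional: log the reason and continue without it.
    const reason = error instanceof Error ? error.message : String(error);
    log(`Desktop audio capture failed (${reason}); using microphone only`);
    return "microphone-only";
  }
}
```

Passing the capture functions in as parameters keeps the fallback logic independent of any particular audio API, and the logged reason is what you would look for in the log files mentioned above.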
This project is licensed under the MIT License - see the LICENSE file for details.
- Electron - Cross-platform desktop app framework
- Node.js - JavaScript runtime
- TypeScript - Typed JavaScript
- Web Audio API - Audio processing