Whisper Clone - Speech-to-Text Application

Whisper Clone is a speech-to-text application that captures audio from your desktop or microphone, transcribes it, and injects the resulting text into the active text field.

Features

Real-time Audio Capture: Capture audio from desktop or microphone
Speech Recognition: Convert speech to text with high accuracy
Text Injection: Automatically inject transcribed text into any active text field
Cross-Platform: Works on macOS, Windows, and Linux
Configurable: Customize audio settings, hotkeys, and more
Robust Error Handling: Comprehensive logging and fallback mechanisms

Installation

Clone the repository:

```
git clone https://github.com/yourusername/whisper-clone.git
cd whisper-clone
```

Install dependencies:
```
npm install
```
Build the application:
```
npm run build
```
Start the application:
```
npm start
```

Usage

1. Launch the application.
2. Use the configured hotkey (default: Alt+Space) to start listening.
3. Speak clearly into your microphone.
4. The application will transcribe your speech and inject it into the active text field.
5. Use the configured hotkey again to stop listening.
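
The start/stop behaviour of the hotkey can be pictured as a small toggle. The following is a minimal sketch rather than the project's actual code: `ListeningToggle`, `onStart`, and `onStop` are hypothetical names. In an Electron main process, `toggle()` would typically be passed as the callback to `globalShortcut.register('Alt+Space', ...)`; that wiring is omitted so the block runs without Electron.

```typescript
// Hypothetical sketch of the hotkey toggle; all names are illustrative.
type Listener = () => void;

class ListeningToggle {
  private listening = false;
  private readonly onStart: Listener;
  private readonly onStop: Listener;

  constructor(onStart: Listener, onStop: Listener) {
    this.onStart = onStart;
    this.onStop = onStop;
  }

  // Called each time the hotkey fires: starts if idle, stops if listening.
  // Returns the new state so callers can update UI or notifications.
  toggle(): boolean {
    this.listening = !this.listening;
    if (this.listening) {
      this.onStart();
    } else {
      this.onStop();
    }
    return this.listening;
  }

  isListening(): boolean {
    return this.listening;
  }
}
```

Keeping the state in one place means a single hotkey can serve both actions, which matches the usage described above.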

Configuration

You can configure the application through the settings menu:

Audio Settings: Configure noise reduction, echo cancellation, and auto gain control
Hotkeys: Customize hotkeys for starting and stopping listening
Notifications: Enable or disable notifications and sounds
Text Injection: Configure delay and behavior for text injection
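
As an illustration of how these four groups might be represented, here is a minimal sketch of a settings object with defaults and user overrides. The shape, the field names, and every default except the `Alt+Space` hotkey are assumptions, not the application's real schema. `toAudioConstraints` maps the audio options onto the standard `getUserMedia` audio constraint names (`noiseSuppression`, `echoCancellation`, `autoGainControl`).

```typescript
// Hypothetical settings shape; field names and most defaults are assumed.
interface AudioSettings {
  noiseReduction: boolean;
  echoCancellation: boolean;
  autoGainControl: boolean;
}
interface NotificationSettings {
  enabled: boolean;
  sounds: boolean;
}
interface TextInjectionSettings {
  delayMs: number;
}
interface AppSettings {
  audio: AudioSettings;
  hotkey: string;
  notifications: NotificationSettings;
  textInjection: TextInjectionSettings;
}
interface SettingsOverrides {
  audio?: Partial<AudioSettings>;
  hotkey?: string;
  notifications?: Partial<NotificationSettings>;
  textInjection?: Partial<TextInjectionSettings>;
}

const DEFAULT_SETTINGS: AppSettings = {
  audio: { noiseReduction: true, echoCancellation: true, autoGainControl: true },
  hotkey: "Alt+Space", // default hotkey stated in the Usage section
  notifications: { enabled: true, sounds: true },
  textInjection: { delayMs: 0 },
};

// Overlay user overrides on the defaults, one group at a time.
function mergeSettings(overrides: SettingsOverrides): AppSettings {
  return {
    audio: { ...DEFAULT_SETTINGS.audio, ...overrides.audio },
    hotkey: overrides.hotkey ?? DEFAULT_SETTINGS.hotkey,
    notifications: { ...DEFAULT_SETTINGS.notifications, ...overrides.notifications },
    textInjection: { ...DEFAULT_SETTINGS.textInjection, ...overrides.textInjection },
  };
}

// Translate audio settings into the constraint names used by getUserMedia.
function toAudioConstraints(audio: AudioSettings) {
  return {
    noiseSuppression: audio.noiseReduction,
    echoCancellation: audio.echoCancellation,
    autoGainControl: audio.autoGainControl,
  };
}
```

For example, `mergeSettings({ audio: { noiseReduction: false } })` disables noise reduction while leaving every other option at its default.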

Architecture

The application is built with Electron and TypeScript, using a modular architecture:

Audio Capture: Captures audio from desktop or microphone
Audio Processing: Processes audio for optimal speech recognition
Speech Recognition: Converts audio to text
Text Injection: Injects text into active text fields
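
The four modules above form a linear pipeline. The sketch below shows one way the stages could be typed and chained; the interface names, `runPipeline`, and the `peakNormalizer` example are hypothetical and are not taken from the project's source. The processing step shown (peak normalization) is a stand-in for whatever processing the application actually performs.

```typescript
// Hypothetical stage interfaces for the four modules; names are illustrative.
interface AudioChunk {
  samples: Float32Array;
  sampleRate: number;
}
interface AudioSource {
  capture(): Promise<AudioChunk>;
}
interface AudioProcessor {
  process(chunk: AudioChunk): AudioChunk;
}
interface SpeechRecognizer {
  transcribe(chunk: AudioChunk): Promise<string>;
}
interface TextInjector {
  inject(text: string): Promise<void>;
}

// Example processor: scale samples so the loudest one has magnitude 1.
const peakNormalizer: AudioProcessor = {
  process(chunk: AudioChunk): AudioChunk {
    let peak = 0;
    for (let i = 0; i < chunk.samples.length; i++) {
      peak = Math.max(peak, Math.abs(chunk.samples[i]));
    }
    if (peak === 0) {
      return chunk; // silence: nothing to scale
    }
    return {
      samples: chunk.samples.map((s) => s / peak),
      sampleRate: chunk.sampleRate,
    };
  },
};

// Run one capture -> process -> recognize -> inject cycle.
async function runPipeline(
  source: AudioSource,
  processor: AudioProcessor,
  recognizer: SpeechRecognizer,
  injector: TextInjector,
): Promise<string> {
  const raw = await source.capture();
  const cleaned = processor.process(raw);
  const text = await recognizer.transcribe(cleaned);
  if (text.trim().length > 0) {
    await injector.inject(text); // skip injection for empty transcripts
  }
  return text;
}
```

Because each stage is an interface, a stage can be swapped (for example, a different recognizer) or replaced with a stub in tests without touching the others.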

Troubleshooting

If you encounter issues:

Check the logs in the application's user data directory
Ensure microphone permissions are granted
Try restarting the application
If desktop audio capture fails, the application will automatically fall back to microphone-only mode
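
The fallback in the last item can be sketched as a try/catch around the desktop capture call. This is an assumption about the general shape, not the application's code: `captureWithFallback` and its parameters are hypothetical, and the capture functions are injected so the block does not depend on Electron or browser APIs.

```typescript
// Hypothetical fallback helper; names and result shape are illustrative.
type CaptureMode = "desktop" | "microphone-only";

interface CaptureResult<T> {
  mode: CaptureMode;
  stream: T;
}

async function captureWithFallback<T>(
  captureDesktop: () => Promise<T>,
  captureMicrophone: () => Promise<T>,
  log: (message: string) => void,
): Promise<CaptureResult<T>> {
  try {
    return { mode: "desktop", stream: await captureDesktop() };
  } catch (error) {
    // Record why desktop capture failed, then retry with the microphone.
    const reason = error instanceof Error ? error.message : String(error);
    log(`Desktop capture failed (${reason}); falling back to microphone-only mode`);
    return { mode: "microphone-only", stream: await captureMicrophone() };
  }
}
```

If microphone capture also fails, the error propagates to the caller, which is usually the point at which to check the microphone permission mentioned above.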

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

Electron - Cross-platform desktop app framework
Node.js - JavaScript runtime
TypeScript - Typed JavaScript
Web Audio API - Audio processing
