Phonotate.App is a local, open-source Electron app, built with React, that simplifies creating training data for StyleTTS 2 and other voice cloning models. It provides a single workflow for recording, analyzing, and managing voice samples. Whether you use open-source backend services or OpenAI APIs, your data is stored locally and stays under your control.
Phonotate.App stores all your data in a local SQLite database. None of your information leaves your network unless you configure external services.
- Automatically generates AI-driven prompts using your choice of:
- Open WebUI (open-source AI text generation)
- OpenAI GPT models
- Analyze recordings using Whisper ASR for transcription and accuracy checking.
- Compatible with:
- Whisper ASR Webservice (Open-source Whisper backend)
- OpenAI Whisper API
- Automatically splits samples into:
- 85% for training (`train_list.txt`)
- 15% for validation (`val_list.txt`)
- Validation data can be phonemized using espeak for compatibility with models requiring phonetic input.
- Project Management:
- Track your projects with details like sample count, total length, and training progress.
- Tag projects with metadata like author, emotion/style, and author ID.
- Recording Workflow:
- Record, review, and analyze voice samples with live waveform visualization.
- Quickly skip, retry, or save recordings with a few clicks.
- Detailed Sample Management:
- View transcriptions, ground truth comparisons, and audio waveforms.
- Easily mark samples as good/bad with a thumbs-up/thumbs-down system.
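
As an illustration of the 85/15 split described in the feature list, here is a minimal TypeScript sketch. It is not the app's actual implementation: the `Sample` shape, the seeded shuffle, the choice to keep only thumbs-up samples, and the `file|text|speaker` line layout (the convention used by StyleTTS 2 training lists) are all assumptions made for the example.

```typescript
// Hypothetical sample shape; the app's real database schema is not shown in this README.
interface Sample {
  file: string;      // e.g. "sample_001.wav"
  text: string;      // ground-truth prompt text
  speakerId: number; // assumed to correspond to the project's author ID
  good: boolean;     // thumbs-up (true) or thumbs-down (false)
}

// Small seeded PRNG (mulberry32) so the shuffle is reproducible between runs.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Keep only good samples (an assumption), shuffle them, and cut at trainRatio.
function splitSamples(
  samples: Sample[],
  trainRatio = 0.85,
  seed = 1
): { train: Sample[]; val: Sample[] } {
  const shuffled = samples.filter((s) => s.good);
  const rand = mulberry32(seed);
  // Fisher-Yates shuffle
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = tmp;
  }
  const cut = Math.round(shuffled.length * trainRatio);
  return { train: shuffled.slice(0, cut), val: shuffled.slice(cut) };
}

// One line per sample: file|text|speaker (assumed layout).
function toListFile(samples: Sample[]): string {
  return samples.map((s) => `${s.file}|${s.text}|${s.speakerId}`).join("\n");
}
```

With 20 good samples, `splitSamples` returns 17 training and 3 validation samples. The strings produced by `toListFile` would then be written to `train_list.txt` and `val_list.txt`.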
Download the application
https://github.com/LoganRickert/Phonotate.App/releases
When opening the app for the first time, click on the settings gear icon in the top right and enter the appropriate URLs and tokens.
- Open WebUI for AI-generated prompts (GitHub).
- Whisper ASR Webservice for transcription (GitHub).
- Kokoro-FastAPI for text-to-speech generation (GitHub).
- OpenAI GPT API for prompt generation.
- OpenAI Whisper API for transcription.
- OpenAI TTS API for text-to-speech generation.
- To generate phonemized validation data, you'll need an espeak API backend.
- Use the phonemization_docker included in this project:
docker run --restart unless-stopped -d -p 9712:8000 espeak-phonemization
- View a list of projects with summaries:
- Number of good samples
- Total recording length
- Create, edit, or delete projects.
- Manage all voice samples for a project:
- View, edit, or delete samples.
- See audio waveforms, play back recordings, and compare transcriptions to ground truth.
- Generate Training Data:
- Automatically splits samples into training and validation sets.
- Outputs `train_list.txt` and `val_list.txt` files to the project directory.
- Record New Samples:
- Record prompts generated by your selected AI service.
- Analyze transcription accuracy and save recordings seamlessly.
- Live waveform visualization during recording.
- Options to retry, save, or skip prompts.
- Transcription feedback to evaluate recording quality.
- Play TTS of the prompt, or of a specific word, to hear how it sounds.
- Edit the prompt directly on the page to fix issues with flow.
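
One common way to score transcription accuracy is the word error rate (WER) between the prompt and the Whisper transcript. The sketch below is only an illustration of that idea; this README does not say how the app computes its score, and the function names `tokenize` and `wordErrorRate` are hypothetical. The tokenizer also assumes plain ASCII English text.

```typescript
// Lowercase, replace punctuation with spaces, and split into words.
// Assumption: ASCII-only text; other scripts would need a different tokenizer.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// WER = word-level edit distance (substitutions, insertions, deletions)
// divided by the number of words in the reference prompt.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // prev[j] holds the distance between the reference words processed so far
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

A score of 0 means the transcript matches the prompt word for word. A recording could be flagged for a retry when the score exceeds some threshold, which is a design choice left open here.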
- Language
- Currently has no effect.
- ChatGPT API URL
- The URL of the chat completions endpoint.
- Ex (Open WebUI): https://chatgpt.phonotate.app/api/chat/completions
- Ex (OpenAI): https://api.openai.com/v1/chat/completions
- ChatGPT Token
- The API token for the ChatGPT API URL. It is used only for chat completions.
- Model
- The model you want to use to generate prompts.
- Ex (Open WebUI): llama3.2:latest
- Ex (OpenAI): gpt-4o
- ASR Service URL
- For Whisper transcription, you can use either a Whisper ASR Webservice instance or OpenAI. If this URL is not set, the app falls back to the OpenAI Whisper URL.
- Ex: https://asr.phonotate.app/asr
- OpenAI Whisper URL (If Not ASR)
- Used when the ASR Service URL is not set. Provide the token for this URL under OpenAI Whisper / TTS Token.
- Ex: https://api.openai.com/v1/audio/transcriptions
- OpenAI TTS URL
- Optional. Set this URL to enable TTS playback. Provide the token for this URL under OpenAI Whisper / TTS Token.
- Ex: https://tts.phonotate.app/v1/audio/speech
- Ex: https://api.openai.com/v1/audio/speech
- OpenAI TTS Voice
- The voice you want to use with TTS. You can use a service like Kokoro-FastAPI or OpenAI.
- Ex (Kokoro-FastAPI): af_bella
- Ex (OpenAI): alloy
- OpenAI Whisper / TTS Token
- The token to use for the Whisper and TTS services. If you are using Whisper ASR Webservice and Kokoro-FastAPI, you can leave this blank.
- Phonemization Service URL
- The URL of the phonemization service.
- Ex: https://phonemization.phonotate.app/phonemize/
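
To show how these settings fit together, here is a minimal TypeScript sketch. The `Settings` and `HttpRequest` shapes and both function names are hypothetical and do not come from the app's source. The request body follows the OpenAI chat completions format, which Open WebUI's `/api/chat/completions` endpoint also accepts. The fallback mirrors the rule described under ASR Service URL.

```typescript
// Hypothetical settings object mirroring some of the fields listed above.
interface Settings {
  chatUrl: string;           // ChatGPT API URL
  chatToken: string;         // ChatGPT Token
  model: string;             // Model
  asrUrl?: string;           // ASR Service URL
  openaiWhisperUrl?: string; // OpenAI Whisper URL (If Not ASR)
  openaiToken?: string;      // OpenAI Whisper / TTS Token
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build a chat completions request asking the model for a new prompt.
function buildPromptRequest(settings: Settings, instruction: string): HttpRequest {
  return {
    url: settings.chatUrl,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${settings.chatToken}`,
    },
    body: JSON.stringify({
      model: settings.model,
      messages: [{ role: "user", content: instruction }],
    }),
  };
}

// Prefer the ASR service; fall back to the OpenAI Whisper URL when it is unset.
function resolveTranscriptionUrl(settings: Settings): { url: string; needsToken: boolean } {
  if (settings.asrUrl && settings.asrUrl.trim() !== "") {
    return { url: settings.asrUrl, needsToken: false };
  }
  if (settings.openaiWhisperUrl && settings.openaiWhisperUrl.trim() !== "") {
    return { url: settings.openaiWhisperUrl, needsToken: true };
  }
  throw new Error("No transcription service configured");
}
```

The returned request can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. With an OpenAI-compatible backend, the generated text is found at `choices[0].message.content` in the JSON response.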
- S3 Storage Support:
- Store and retrieve files from any S3-compatible cloud storage.
- Audio Quality Enhancements:
- Normalize audio and add compression to improve recording consistency. #2
- Improved UI/UX:
- A more modern design for better usability.
- Help Page:
- In-app documentation and troubleshooting.
- Fix Electron not packaging correctly
- Working with Electron has its problems. #1
- Clone the repository:
git clone https://github.com/LoganRickert/Phonotate.App
cd Phonotate.App
- Install dependencies:
npm install
- Rebuild better-sqlite3 for Electron:
- Electron uses a specific version of Node.js, so you need to rebuild native modules like `better-sqlite3` to match Electron's environment.
npm run rebuild
- Run the app in development mode:
npm start
- Prepare the app for production:
- Make sure all dependencies are installed:
npm install
- Rebuild `better-sqlite3` for production:
npm run rebuild
- Build the app:
- This will package the app into a standalone executable for your platform (e.g., `.exe` for Windows, `.dmg` for macOS):
npm run package
- Locate the build files:
- The final build will be located in the `release-build` folder. Distribute the files from this folder as needed.
- `npm start`: Launch the app in development mode.
- `npm run rebuild`: Rebuild `better-sqlite3` for Electron's environment.
These instructions should help you get started with both development and building the final app for distribution.
Phonotate.App is licensed under the Apache License 2.0. See the LICENSE file for details.
- Voice cloning app
- Electron app for AI training
- React voice cloning
- StyleTTS training
- Whisper ASR integration
- Phonemized training data
- Open-source voice training
- Local AI tools
- Voice transcription tool
Phonotate.App is the perfect companion for anyone looking to create high-quality training data for voice cloning models like StyleTTS 2. Designed with privacy and flexibility in mind, it seamlessly integrates AI and local workflows to empower your creativity!



