Local audio transcription application powered by OpenAI's Whisper model. Built with Tauri 2 and Rust for native performance and privacy.
- Local Processing: All transcription happens on your device - your data never leaves your computer
- Multiple Whisper Models: Choose from Tiny, Base, Small, Medium, or Large-v3-Turbo based on your needs
- Audio/Video Support: Transcribe MP3, WAV, M4A, FLAC, OGG, MP4, MOV, MKV files
- YouTube Integration: Fetch captions or transcribe with Whisper using yt-dlp
- Audio Speedup: Optional 1.25x-2.0x acceleration for faster transcription (requires ffmpeg)
- Progressive Display: See transcription results in real-time as chunks complete
- Multi-language: Auto-detect or specify the audio language (English, Spanish, French, German, Portuguese, Japanese, Chinese, and more)
- Timestamps: Optional `[HH:MM:SS]` timestamp markers in transcriptions
- History: Automatically saves transcriptions with metadata
- Lightweight: ~15MB app size with minimal resource usage
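As an illustration of the `[HH:MM:SS]` marker format mentioned above, here is a minimal sketch of how a segment's start time could be rendered. The function names `format_timestamp` and `stamp_segment` are hypothetical and are not taken from the AudioInk source; the actual implementation may differ.

```rust
/// Formats an offset in seconds as a `[HH:MM:SS]` marker.
/// Fractional seconds are truncated; negative or non-finite input is treated as zero.
fn format_timestamp(seconds: f64) -> String {
    let total = if seconds.is_finite() && seconds > 0.0 {
        seconds as u64
    } else {
        0
    };
    let (h, m, s) = (total / 3600, (total % 3600) / 60, total % 60);
    format!("[{:02}:{:02}:{:02}]", h, m, s)
}

/// Prefixes a transcribed segment with its start-time marker (hypothetical helper).
fn stamp_segment(start_seconds: f64, text: &str) -> String {
    format!("{} {}", format_timestamp(start_seconds), text.trim())
}
```

For example, `stamp_segment(75.0, "hello")` yields `[00:01:15] hello`.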
- macOS 10.15+ (Apple Silicon or Intel)
- ~150MB - 1.5GB disk space (depending on chosen Whisper model)
- Ubuntu 20.04+, Fedora 36+, or equivalent
- ~150MB - 1.5GB disk space (depending on chosen Whisper model)
- Build dependencies: `build-essential`, `cmake`, `libwebkit2gtk-4.1-dev`, `libssl-dev`, `libayatana-appindicator3-dev`, `librsvg2-dev`
- Windows 10/11
- ~150MB - 1.5GB disk space (depending on chosen Whisper model)
| Tool | macOS | Linux | Windows |
|---|---|---|---|
| yt-dlp (YouTube transcription) | `brew install yt-dlp` | `sudo apt install yt-dlp` | `winget install yt-dlp` |
| ffmpeg (audio speedup) | `brew install ffmpeg` | `sudo apt install ffmpeg` | `winget install ffmpeg` |
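
To show why ffmpeg is needed for audio speedup, the sketch below builds an ffmpeg argument list that re-times audio with the `atempo` filter. This is an assumption about one reasonable way to do it, not a description of how AudioInk actually invokes ffmpeg; the function name `speedup_args` and the file names are hypothetical.

```rust
/// Builds ffmpeg arguments that speed up the audio of `input` by `factor`
/// using the `atempo` audio filter, writing the result to `output`.
/// Returns `None` when the factor is outside the 1.25x-2.0x range
/// (assumed here to mirror the range offered in the app's settings).
fn speedup_args(input: &str, output: &str, factor: f32) -> Option<Vec<String>> {
    if !(1.25..=2.0).contains(&factor) {
        return None;
    }
    Some(vec![
        "-y".to_string(), // overwrite the output file if it exists
        "-i".to_string(),
        input.to_string(),
        "-filter:a".to_string(),
        format!("atempo={}", factor),
        "-vn".to_string(), // drop any video stream
        output.to_string(),
    ])
}
```

The resulting vector could be passed to `std::process::Command::new("ffmpeg").args(...)`; the call fails at spawn time if ffmpeg is not installed, which is why the tool is listed as optional.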
- Install prerequisites:

  macOS:

      # Install Rust
      curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
      # Install Node.js (v18+)
      brew install node

  Linux (Ubuntu/Debian):

      # Install Rust
      curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
      # Install system dependencies
      sudo apt update
      sudo apt install -y build-essential cmake libwebkit2gtk-4.1-dev libssl-dev libayatana-appindicator3-dev librsvg2-dev
      # Install Node.js (v18+)
      curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
      sudo apt install -y nodejs

  Windows:

      # Install Rust from https://rustup.rs
      # Install Node.js from https://nodejs.org
      # Install Visual Studio Build Tools with C++ workload
- Clone and build:

      git clone https://github.com/fabianhtml/audioink-rs.git
      cd audioink-rs
      npm install
      npm run tauri build

- The app will be available in `src-tauri/target/release/bundle/`:
  - macOS: `.dmg` and `.app`
  - Linux: `.deb`, `.rpm`, and `.AppImage`
  - Windows: `.msi` and `.exe`
To run the app in development mode:

    npm install
    npm run tauri dev

- Download a Model: Open Settings and download your preferred Whisper model
  - Tiny (75 MB): Fastest, lower accuracy
  - Base (142 MB): Good balance for most uses
  - Small (466 MB): Better accuracy
  - Medium (1.5 GB): High accuracy
  - Turbo (809 MB): Best quality, optimized for speed
- Transcribe Audio:
  - File Tab: Click to select an audio/video file
  - YouTube Tab: Paste a YouTube URL to fetch captions or transcribe with Whisper
- Options:
  - Select language or use auto-detect
  - Enable timestamps in Settings for time-marked output
  - Enable audio speedup (1.25x-2.0x) for faster processing (note: not compatible with timestamps)
- Export: Copy to clipboard or save as text file
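
The option rules above (auto-detect or explicit language, a 1.25x-2.0x speedup range, and speedup being incompatible with timestamps) can be expressed as a small validation step. The following is only a sketch under those stated rules; the `TranscribeOptions` struct, the `OptionsError` enum, and `validate` are hypothetical names and do not come from the AudioInk codebase.

```rust
/// Reasons a combination of options would be rejected (hypothetical).
#[derive(Debug, PartialEq)]
enum OptionsError {
    SpeedupOutOfRange,
    SpeedupWithTimestamps,
}

/// User-selected transcription options (hypothetical shape).
struct TranscribeOptions {
    /// `None` means auto-detect the audio language.
    language: Option<String>,
    /// Whether to emit time-marked output.
    timestamps: bool,
    /// `None` means process the audio at its original speed.
    speedup: Option<f32>,
}

/// Checks the two rules stated in the usage notes:
/// speedup must lie in 1.25..=2.0 and cannot be combined with timestamps.
fn validate(opts: &TranscribeOptions) -> Result<(), OptionsError> {
    if let Some(factor) = opts.speedup {
        if !(1.25..=2.0).contains(&factor) {
            return Err(OptionsError::SpeedupOutOfRange);
        }
        if opts.timestamps {
            return Err(OptionsError::SpeedupWithTimestamps);
        }
    }
    Ok(())
}
```

Rejecting the combination up front, rather than silently dropping one setting, keeps the behavior predictable; whether the app itself rejects or ignores the conflicting option is not stated here.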
audioink-rs/
├── src/ # Frontend (Vanilla JS)
│ ├── index.html
│ ├── main.js
│ └── styles.css
└── src-tauri/ # Backend (Rust)
    └── src/
        ├── commands/ # Tauri commands (API)
        ├── core/ # Whisper engine, audio processing
        ├── models/ # Data structures
        ├── persistence/ # History management
        └── utils/ # Error handling, helpers
- Frontend: Vanilla HTML/CSS/JavaScript
- Backend: Rust + Tauri 2
- Transcription: whisper-rs (Whisper.cpp bindings)
- Audio Decoding: Symphonia
License: MIT
- OpenAI Whisper - Speech recognition model
- whisper.cpp - C++ port of Whisper
- Tauri - Desktop app framework