Fully local, open-source dictation for macOS Apple Silicon.
Hold a key, talk, and watch your words land — cleaned up, punctuated, and formatted — in any text field on your Mac.
Your audio, transcripts, and corrections never leave your computer.
Built by Ziplyne
Quick start · vs. Wispr Flow · Models · Features · Troubleshooting
Witzper is a macOS app that turns your voice into polished, ready-to-send text. You hold a key (default: Fn), speak normally, release the key — and your words appear in whatever you're writing: iMessage, Slack, Gmail, Notion, VS Code, anywhere. Punctuation, capitalization, and tone are handled automatically by a local language model that knows which app you're writing in and how you personally write.
It's built to match or beat Wispr Flow on latency and transcript quality — without sending a single byte of your audio to the cloud.
| Witzper | Wispr Flow | |
|---|---|---|
| Where your voice goes | Stays on your Mac. Always. | Cloud (their servers). |
| Cost | Free. MIT licensed. | Subscription. |
| Works offline | ✅ Yes | ❌ No |
| Model quality | Swappable local cleanup models up to Qwen3 30B-A3B | Proprietary cloud model |
| Personalization | Local dictionary, snippets, edit watcher, and manual LoRA export on your edits | Global model, tweaks via dictionary only |
| Swappable models | ✅ Pick any of 7 cleanup LLMs + 4 ASR models | ❌ Locked |
| Voice snippets | ✅ Local trigger → expansion snippets | Plain snippets only |
| Telemetry | None. Ever. | Opt-out analytics |
| Source code | Open | Closed |
The cost: you need a recent Apple Silicon Mac with enough unified memory for the model you choose. The default setup targets 16 GB+ Macs. See system requirements.
Three ways to get Witzper running. Pick one.
One line. This is how most people should install Witzper.
brew install --cask ZipLyne-Agency/witzper/witzperHomebrew strips the com.apple.quarantine xattr automatically, so macOS Gatekeeper won't block first launch even though Witzper isn't notarized. After install, open Witzper from /Applications and grant Microphone, Accessibility, and Input Monitoring under System Settings → Privacy & Security.
Upgrade later with:
brew upgrade --cask witzper(Or use the in-app Check for Updates… menu item — see 🔄 Updating Witzper.)
Grab the latest Witzper-X.Y.Z.dmg (drag-to-install disk image) or Witzper-X.Y.Z.zip. After unzipping or mounting, drag Witzper.app into /Applications.
xattr -dr com.apple.quarantine /Applications/Witzper.appThen double-click Witzper.app. This is not needed if you install via Homebrew (option 1 above) — brew does it for you.
Useful if you want to hack on Witzper or verify the build yourself. Requires Homebrew, Python 3.13, Xcode command-line tools, and uv. Full instructions in Quick start (build from source) below.
For contributors and developers. If you just want to use Witzper, see 📦 Install above —
brew install --cask witzperis the one-liner. The steps below are for building from source.
Open the Terminal app (⌘-space → type "Terminal" → Enter). Paste each line and press Enter:
# 1. Install Homebrew (the Mac package manager). Skip if you already have it.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# 2. Install the three tools Witzper needs.
brew install python@3.13 ffmpeg git
# 3. Install Xcode command-line tools (needed to build the menu-bar helper).
xcode-select --installgit clone https://github.com/ZipLyne-Agency/Witzper.git ~/Witzper
cd ~/Witzper
./scripts/setup.shThis creates a Python virtual environment, installs every Python dependency, and builds the Swift menu-bar helper.
./scripts/download_models.shBy default this pulls the Parakeet speech-recognition model and the Qwen3-8B 4-bit cleanup model. Heavier cleanup models can be selected later from the dashboard if your Mac has enough memory.
On first launch Witzper will ask for three macOS permissions. All three are required:
- Microphone — to record speech.
- Accessibility — to read focused-field context and insert the final transcript.
- Input Monitoring — to capture the global push-to-talk hotkey. Dashboard manual recording does not need this, but global hotkeys do.
For each prompt, macOS will open System Settings → Privacy & Security. Flip the toggle next to Witzper, then relaunch.
./scripts/run.shThe first time you run this, it'll ask you to pick a push-to-talk hotkey (default Fn). Then you'll see the Witzper waveform icon in your menu bar. Hold Fn, say "hey are you free for lunch tomorrow", release. It should type back Hey are you free for lunch tomorrow? into whatever field you have focused. You can also open the dashboard and click Record / Stop to dictate without using a hotkey.
That's it. Stop the daemon with ⌃C in the terminal.
💡 Tip: Want Witzper to show up as a proper Mac app (so macOS treats it like any other app for permissions)? Run
./scripts/build_app.shand dragbuild/Witzper.appto/Applications.
Witzper runs real language models in unified memory. How much RAM you have determines which models you can use.
| Your Mac's RAM | Recommended cleanup model | What you get |
|---|---|---|
| 16 GB | Qwen3 4B (4-bit) or Llama 3.2 3B | Fast, decent quality, 80 ms cleanup |
| 32 GB | Qwen3 14B (8-bit) | Near-flagship quality, 180 ms cleanup |
| 64 GB+ | Qwen3 30B-A3B (8-bit) | Witzper's high-quality mode, 250 ms cleanup |
- Mac with Apple Silicon (M1, M2, M3, M4, M5 — not Intel)
- macOS 14.0 Sonoma or newer
- Python 3.11+ (3.13 recommended) for source builds. The release app bundles its own Python runtime.
- ffmpeg for source builds and local development utilities.
- Xcode command-line tools for source builds.
- ~10 GB free disk for the default model set (more if you choose heavier models)
- Internet for the first run only (model download). After that: fully offline.
| Component | RAM |
|---|---|
| Qwen3-8B cleanup LLM (4-bit) | ~5 GB |
| Parakeet TDT 0.6B v3 (ASR) | ~1 GB |
| Silero VAD | ~50 MB |
| Witzper menu-bar app + dashboard | ~80 MB |
| Python daemon overhead | ~3 GB |
| Total hot path | ~9 GB |
Zero telemetry. Zero cloud calls. The only network activity is the first-run model download from Hugging Face Hub. [telemetry] enabled = false is the only supported value.
Witzper has three model roles: an ASR (speech-to-text), a Cleanup LLM, and an optional Command Mode LLM. Every model is swappable — either by editing ~/.config/Witzper/config.toml or by using the built-in dashboard picker.
Runs on every utterance. Takes the raw transcript and fixes grammar, punctuation, and applies your per-app Flow Style. Latency matters here.
| Model | Label | RAM | Latency | Quality | Best for |
|---|---|---|---|---|---|
juanquivilla/sotto-cleanup-lfm25-350m-mlx-4bit |
Sotto Cleanup 350M | 0.2 GB | ~30 ms | ★★★★ | Any Mac — purpose-built for transcript cleanup |
mlx-community/Llama-3.2-1B-Instruct-4bit |
Llama 3.2 1B | 0.7 GB | ~40 ms | ★★★ | Oldest M1 Macs |
mlx-community/Llama-3.2-3B-Instruct-4bit |
Llama 3.2 3B | 2 GB | ~70 ms | ★★★ | 8–16 GB Macs |
mlx-community/Qwen3-4B-Instruct-2507-4bit |
Qwen3 4B | 2.5 GB | ~80 ms | ★★★★ | Recommended for 16 GB Macs |
mlx-community/Qwen3-8B-4bit |
Qwen3 8B (default) | 5 GB | ~120 ms | ★★★★ | 16–24 GB Macs |
mlx-community/Qwen3-14B-Instruct-2507-8bit |
Qwen3 14B | 15 GB | ~180 ms | ★★★★★ | Recommended for 32 GB Macs |
mlx-community/Qwen3-30B-A3B-Instruct-2507-8bit |
Qwen3 30B-A3B | 32 GB | ~250 ms | ★★★★★ | Recommended for 64 GB+ Macs — 30B MoE with only 3B active params |
| Model | Label | RAM | Latency | Languages | Notes |
|---|---|---|---|---|---|
mlx-community/parakeet-tdt-0.6b-v3 |
Parakeet TDT 0.6B v3 (default) | 1 GB | ~80 ms | 25 | 10× faster than Whisper Large v3 with lower WER |
mlx-community/whisper-large-v3-turbo |
Whisper Large v3 Turbo | 3 GB | ~200 ms | 100+ | Use for languages Parakeet doesn't cover |
mlx-community/whisper-large-v3-mlx |
Whisper Large v3 (full) | 6 GB | ~500 ms | 100+ | Highest quality, slowest |
mlx-community/whisper-medium-mlx |
Whisper Medium | 1.5 GB | ~250 ms | English-focused | Middle-ground Whisper |
mlx-community/Qwen3-ASR |
Qwen3-ASR (accuracy mode — experimental) | ~4 GB | 150–300 ms | Multi | Accepts a text context prompt for Wispr-style context-conditioned recognition. Requires the community MLX port — see Experimental features. |
ASR mode selection ([asr] mode in config):
speed— always Parakeet (no context prompt).accuracy— always Qwen3-ASR with full context injection.auto— per-app, driven byconfigs/app_rules.toml(e.g. accuracy in Mail, speed in iMessage).
A separate hotkey (default right_cmd+right_option) triggers Command Mode for transformations like "rewrite this as an email", "translate to Spanish", or "restructure as bullets".
| Model | Label | RAM | Notes |
|---|---|---|---|
mlx-community/Qwen3-14B-Instruct-2507-4bit |
Qwen3 14B (light) | 8 GB | For 32 GB Macs |
mlx-community/Qwen3-8B-4bit |
Qwen3 8B (shared — default) | 0 GB extra | Reuses the cleanup model already in memory |
Note: Command Mode is wired to the separate command hotkey, but remains experimental while the transform UX is tuned.
- VAD: Silero (default, unauthenticated) or
pyannote/segmentation-3.1(requires HF license accept). - Few-shot retrieval uses local SQLite corrections with RapidFuzz token matching. No embedding model or transformer runtime is loaded for retrieval.
- 🎙 Push-to-talk dictation — hold Fn, a typed character, a function key, Right ⌥ / Right ⌘ / Right ⇧ / Caps Lock, or a modifier chord; speak; release.
- 🖱 Manual dashboard recording — click Record / Stop in the dashboard to record and transcribe without using a hotkey.
- 🧠 Local LLM cleanup — a swappable local MLX model fixes grammar, punctuation, and disfluencies on every utterance.
- 📡 Streaming partial transcripts in the HUD — see your words appear live as you speak, with the exact text the model currently hears.
- ⚡ Pre-flight ASR — the speech model processes audio while you're still talking, so by the time you release the hotkey the raw transcript is already done. End-to-end latency roughly halved.
- 🎯 Context-aware ASR (accuracy mode) — the model is given your focused app, window title, surrounding text, and personal vocabulary so it nails proper nouns.
- 🔀 Auto mode switching — Witzper picks speed (Parakeet) or accuracy (Qwen3-ASR) per-app via
configs/app_rules.toml. - 🛡 Hallucination guardrail — if the cleanup LLM produces output that's too long or too different from the raw transcript, Witzper falls back to the raw text.
- 📋 Smart insertion — clipboard-paste in normal apps, synthesized keystrokes for terminals and password fields. Clipboard contents are restored after insertion.
- 📖 Personal dictionary — names, acronyms, jargon. Boosts ASR accuracy and powers deterministic
wrong → rightreplacements. - 🪄 Voice snippets — say "my address", get
123 Main St. Case-insensitive whole-word matching. - 🎨 Flow Styles per app category — Casual / Formal / Very Casual / Excited, mapped independently to Personal Messages / Work Messages / Email / Other.
- 👀 Edit watcher — Witzper watches the focused field for 10 s after inserting text. Any edit you make becomes a training pair for your personal LoRA.
- 🔁 Few-shot retrieval — RapidFuzz finds the top-5 most similar past corrections from local SQLite and feeds them to the cleanup LLM as in-context examples. Quality improves from day one with zero training.
- 🌙 Manual cleanup LoRA export/training —
flow train cleanupcan fine-tune a rank-16 cleanup adapter once you have enough correction pairs. Automatic scheduling/hot-swap is not enabled in this release. - 🎤 ASR LoRA manifest export (scaffolded) —
flow train asrwrites(audio → transcript)pairs for the Qwen3-ASR MLX trainer once that trainer is available. - 📊 Automatic dictionary learning — single-token edits within edit distance 2 append to your boost dictionary automatically.
Open via the menu-bar icon or flow doctor. Tabs:
- Live — real-time transcript stream with per-stage latency (VAD / ASR / LLM / total).
- Dictionary — add, remove, browse boost words and replacement rules.
- Snippets — manage voice-snippet expansions.
- Settings — swap models (cleanup / ASR / command) from the catalog, type the character or key you want as your hotkey, choose a microphone, set Flow Styles.
- 🔴 Floating HUD pill — pulsing red dot + status + live partial transcript, always on top, works across Spaces.
- 🔕 Menu-bar icon — waveform glyph that highlights while recording.
- 🔊 Audio feedback — start/stop tones (configurable).
- ✅ Built-in diagnostics —
flow doctorand the menu's "Show diagnostics…" dialog check permissions, models, and sockets.
- No telemetry, ever. The
[telemetry] enabled = falseknob is hardcoded to false. - All state local:
~/.local/share/Witzper/(SQLite + audio),~/.config/Witzper/config.toml(settings),~/.cache/huggingface/hub/(model weights). - IPC sockets (
/tmp/Witzper.sock,/tmp/flow-context.sock,/tmp/flow-stream.sock) are Unix-domain, mode 0600.
After ./scripts/setup.sh, the flow command lives inside .venv:
source .venv/bin/activate # or let run.sh do this for you| Command | What it does |
|---|---|
flow run [-v] |
Start the dictation daemon. -v prints per-stage latency for every utterance. |
flow setup |
Interactive hotkey picker (first-run wizard). |
flow doctor |
Check models, permissions, Swift helper, audio devices. |
flow dict --add <word> |
Add a vocabulary boost word. |
flow dict --replace 'wrong=right' |
Add a deterministic replacement rule. |
flow dict --remove <term> |
Remove a boost word or replacement. |
flow dict --list |
Show the full dictionary. |
flow snippet --add '<trigger>' --text '<expansion>' |
Add a voice snippet. |
flow snippet --remove '<trigger>' |
Remove a snippet. |
flow snippet --list |
Show all snippets. |
flow style |
Show current Flow Styles per app category. |
flow style <category> <name> |
Set a style. Categories: personal_messages, work_messages, email, other. Styles: formal, casual, very_casual, excited. |
flow train cleanup |
Manually run a LoRA fine-tune of the cleanup LLM. Needs ≥20 correction pairs. |
flow train asr |
Export an ASR-training manifest (experimental). |
Defaults live in configs/default.toml. To override anything, create ~/.config/Witzper/config.toml with only the keys you want to change.
[hotkey]
key = "fn" # legacy fallback; new installs use [hotkeys.*]
toggle_mode = false
[hotkeys.dictate]
key = "fn" # typed char ("a"), standalone ("space", "f5"), modifier, or chord
mode = "hold"
[hotkeys.command]
key = "right_cmd+right_option"
mode = "hold"
[audio]
sample_rate = 16000
channels = 1
device = "default" # or an exact mic name from "Show diagnostics…"
max_seconds = 120
[vad]
backend = "silero" # "silero" | "pyannote"
endpoint_silence_ms = 700
[asr]
mode = "auto" # "speed" | "accuracy" | "auto"
streaming = true # live partial transcripts in HUD (IDEAS #1)
streaming_interval_ms = 350
streaming_min_audio_ms = 350
streaming_reuse_ratio = 0.95 # reuse partial instead of re-transcribing on key-up (IDEAS #2)
[asr.speed]
model = "mlx-community/parakeet-tdt-0.6b-v3"
backend = "parakeet-mlx"
[asr.accuracy]
model = "mlx-community/Qwen3-ASR"
backend = "qwen3-asr-mlx"
context_prompt_max_tokens = 1024
[cleanup]
model = "mlx-community/Qwen3-8B-4bit"
max_tokens = 96
temperature = 0.0
few_shot_n = 5
max_length_ratio = 1.8
max_edit_distance_ratio = 0.5
[command]
enabled = true
model = "mlx-community/Qwen3-8B-4bit"
max_tokens = 2048
hotkey = "" # legacy; command hotkey lives in [hotkeys.command]
[insertion]
default_strategy = "paste" # "paste" | "type"
restore_clipboard_after_ms = 200
[personalization]
auto_add_to_dictionary = true
edit_watch_window_seconds = 10
cleanup_lora_enabled = false
cleanup_lora_rank = 16
cleanup_lora_schedule_cron = "0 3 * * *"
asr_lora_enabled = false
asr_lora_rank = 8
asr_lora_schedule_cron = "0 4 */14 * *"
dspy_enabled = false
[styles]
personal_messages = "casual"
work_messages = "casual"
email = "casual"
other = "casual"
[snippets]
case_insensitive = true
strip_trailing_punct_on_solo_trigger = true
[telemetry]
enabled = false # do not changeHotkey (Swift CGEventTap)
│
▼
Audio capture (16 kHz mono)
│
▼
Streaming partial loop ◄── feeds HUD live ──► HUD (Swift, floating pill)
│
▼
VAD (Silero) — endpoint trim
│
▼
ASR (Parakeet / Qwen3-ASR / Whisper)
▲ ▲
│ └── AX context (app, window, surrounding text)
└── personal dictionary boost terms
│
▼
Few-shot retriever (RapidFuzz + SQLite)
│
▼
Cleanup LLM (Qwen3-8B default, swappable via MLX)
+ Flow Style instruction per app category
+ top-5 few-shots
+ alt-hypothesis rerank
│
▼
Hallucination guardrail (fall back to raw if exceeded)
│
▼
Dictionary replace → Snippet expansion
│
▼
Inserter (clipboard paste or keystroke)
│
▼
Edit watcher (captures corrections → manual LoRA export/training)
See docs/architecture.md for the full walkthrough.
| Mode | Components | End-to-end p50 |
|---|---|---|
| Speed | Parakeet + Qwen3-8B | 250–500 ms |
| Accuracy | Qwen3-ASR + Qwen3-8B | 400–800 ms |
| Command Mode | Qwen3-8B (shared) | 1–4 s |
Per-stage budget:
| Stage | Target |
|---|---|
| VAD endpoint trim | ≤ 30 ms |
| Parakeet ASR | 40–80 ms |
| Qwen3-ASR | 150–300 ms |
| Cleanup LLM (Qwen3-8B) | 100–250 ms |
| Insertion | ≤ 10 ms |
With streaming + pre-flight ASR (IDEAS #1 + #2), the perceived latency is closer to (cleanup LLM) + (insertion) because the ASR work overlaps with speech.
These features live in the codebase but are not fully wired in the current release. Clearly marked so nobody gets surprised:
- Qwen3-ASR accuracy mode — the wrapper (
flow/models/qwen3_asr.py) is ready, but the communityqwen3-asr-mlxpackage is still stabilizing. Until it's on PyPI, Witzper falls back to Parakeet and ignoresasr.mode = "accuracy". Track progress in IDEAS.md. - Command Mode — wired to the separate command hotkey, but still marked experimental because quality and UX are still being tuned.
- ASR LoRA training —
flow train asrexports a manifest for the Qwen3-ASR MLX trainer; actual training integration is pending that same port. - DSPy prompt optimization — dependency is installed under
[personalize]extras, runner is not yet invoked from cron. - Signed DMG installer — release packaging builds a ZIP/DMG, but the app is currently ad-hoc signed rather than Developer ID notarized.
See IDEAS.md for the full backlog with priorities.
You're missing Accessibility and/or Input Monitoring permission. Click the Witzper menu-bar icon → Open Accessibility Settings…, flip Witzper on, then Open Input Monitoring Settings… and do the same. You must quit and relaunch Witzper after granting permissions — macOS caches the old state. If you only want to test recording, open the dashboard and use Record / Stop; that path still needs Microphone and Accessibility but not Input Monitoring.
Manual recording needs Microphone access to capture audio and Accessibility access to insert the transcript. It intentionally does not require Input Monitoring because it does not listen for a global keypress.
You didn't activate the venv. Run source .venv/bin/activate from the repo root, then try again. Or just use ./scripts/run.sh which activates it for you.
A stale daemon. Kill it with kill 1234 (use the pid shown) or pkill -f 'python -u -m flow run'.
Your cleanup model is too big for your RAM. Open ~/.config/Witzper/config.toml and swap to a smaller model from the catalog, e.g.:
[cleanup]
model = "mlx-community/Qwen3-4B-Instruct-2507-4bit"Then re-run ./scripts/download_models.sh (it'll skip what you already have).
For pyannote VAD you need an HF account and to accept the license on the model page. Witzper silently falls back to Silero if pyannote auth fails, so this is optional. For the Qwen models no auth is needed — they're on mlx-community.
Run (cd swift-helper && swift build -c release). You'll need Xcode command-line tools (xcode-select --install).
flow doctorOr click the menu-bar icon → Show diagnostics…
Witzper/
├── flow/ # Python inference daemon
│ ├── __main__.py # `flow` CLI
│ ├── config.py # TOML + pydantic
│ ├── core/ # orchestrator, audio, VAD, hotkey, doctor, setup wizard
│ ├── models/ # ASR (Parakeet, Whisper, Qwen3-ASR) + cleanup + command LLMs
│ ├── context/ # app context, dictionary, few-shot retriever, styles
│ ├── insert/ # clipboard + keystroke inserter
│ ├── personalize/ # correction store, edit-watch, snippets, LoRA trainer
│ └── ui/ # HUD pill + event stream
├── swift-helper/ # Menu-bar app, HUD, dashboard
│ └── Sources/FlowHelper/
│ ├── main.swift # CGEventTap, menu bar, Unix sockets, stream listener
│ ├── HUD.swift # floating pill w/ live partial transcript
│ ├── Dashboard.swift # SwiftUI Live / Dict / Snippets / Settings
│ ├── ModelCatalog.swift # the swappable model list
│ ├── ModelPickerView.swift, SettingsView.swift, DictionaryView.swift, SnippetsView.swift
│ ├── Inserter.swift, SQLiteStore.swift, Sounds.swift
├── configs/
│ ├── default.toml # all defaults
│ ├── app_categories.toml # app → Flow Style category rules
│ └── app_rules.toml # app → ASR mode rules (auto)
├── scripts/
│ ├── setup.sh # venv + deps + Swift build
│ ├── download_models.sh # HF prefetch
│ ├── run.sh # launch helper + daemon
│ ├── build_app.sh # package Witzper.app
│ ├── build_icon.py # .icns generator
│ ├── test_pipeline.py # end-to-end smoke test
│ └── train_nightly.sh # optional manual cron wrapper for LoRA training
├── docs/architecture.md
├── assets/ # app icon + iconset
├── IDEAS.md # feature backlog
├── TODO.md # build-out order
├── README.md
└── LICENSE # MIT
Witzper has a built-in updater. You never need to visit GitHub to stay current.
- Automatic: every 24 hours when Witzper.app launches, it silently checks github.com/ZipLyne-Agency/Witzper/releases/latest. If a newer version is available, a prompt appears asking to install it. One click, ~10 seconds, the app relaunches on the new version.
- Manual: menu bar → Check for Updates… — forces a check right now, even if you just ran one. If you're already on the latest, it tells you so.
- Under the hood: the updater fetches a
latest.jsonmanifest produced by GitHub Actions on each tagged release, downloads the.zipasset, verifies the SHA-256, replaces/Applications/Witzper.app(the old bundle is moved to the Trash, not deleted), and relaunches. Source:swift-helper/Sources/FlowHelper/Updater.swift. - Opt out: the silent check respects your privacy and never sends analytics. If you want to disable it entirely, remove the
Updater.checkSilently()call frommain.swiftand rebuild.
Releases are tag-triggered. To ship a new version:
scripts/release.sh patchThat script validates the current state, bumps VERSION and pyproject.toml, commits, creates an annotated version tag, and pushes everything. The push triggers .github/workflows/release.yml, which on a macos-14 runner builds Witzper.app in release mode, zips it with ditto, builds a DMG, computes SHA-256 checksums, and creates a GitHub Release containing:
Witzper-X.Y.Z.zip— the downloadable app bundle.Witzper-X.Y.Z.dmg— the drag-to-install disk image.latest.json— the manifest the in-app Updater reads.- Auto-generated release notes from the commits since the last tag.
Users who already have Witzper installed see the update within 24 hours (or instantly via Check for Updates). There's no App Store, no notarization, no Sparkle appcast to maintain. Just tag and push.
Witzper is MIT-licensed and contributions are welcome. Good starter tasks are tagged in IDEAS.md — especially the ones flagged "High-impact, low effort". Open a PR or issue on GitHub.
Run the smoke test before submitting:
python scripts/test_pipeline.pyAfter installing and granting macOS permissions, run the live E2E harness:
python scripts/live_e2e.pyIt preflights Accessibility, Input Monitoring, and Microphone before recording,
then exercises real microphone capture, ASR, cleanup, Swift-helper insertion, and
focused-field readback verification.
By default it asks you to say hey are you free for lunch tomorrow and fails if
the cleaned transcript is not similar enough. It opens a scratch .txt file in
TextEdit as the insertion target and drives the packaged daemon through the
Swift helper, so recording happens in the same app process users run. Pass
--target focused to use the current field instead. Use --mode direct for the
older harness-process capture path, and --expect "your phrase" or
--config path/to/config.toml for alternate test runs. If preflight blocks,
--open-permissions opens the missing System Settings panes.
MIT © 2026 Isaac Horowitz. See LICENSE.
Built by Ziplyne.
Built with MLX, parakeet-mlx, mlx-lm, sentence-transformers, Silero VAD, and pyannote.audio. Witzper would not exist without any of them.
Model weights are downloaded on demand from each upstream author. Please review and comply with each model's license before redistribution:
| Model | Upstream | License |
|---|---|---|
| Qwen3 family (4B / 8B / 14B / 30B-A3B) | Qwen team | Apache 2.0 / Qwen Research License |
| Parakeet TDT 0.6B v3 | NVIDIA NeMo | CC BY 4.0 |
| Whisper Large v3 / Turbo | OpenAI | MIT |
| Silero VAD | snakers4 | MIT |
| pyannote segmentation 3.1 | pyannote | MIT, gated — requires HF auth + license accept |
| MLX quantizations | mlx-community | Inherit upstream |
