Isolate Parakeet in a subprocess so Metal crashes don't kill Flask (issue #23) #32
Merged
…ssue #23)

The v3.4.1 mitigation (60s chunks + mx.clear_cache) reduces but cannot eliminate the SIGABRT path: mlx::core::gpu::check_error raises a C++ exception inside Metal's addCompletedHandler, which has no catch block, so it unwinds straight to std::terminate → abort(). Python can't trap that. The whole interpreter dies, Flask dies with it, and the user sees a generic "Load failed" page from the browser.

Move the chunked Parakeet decode into parakeet_worker.py and spawn it via subprocess.Popen([sys.executable, ...]). When MLX aborts the child, the parent observes a nonzero returncode, raises RuntimeError, and the outer transcribe_file falls through to the existing WhisperX / Whisper fallback path. Flask stays up, and the user gets a real transcript via a slower engine instead of a dead server.

Trade-off: each Parakeet call now reloads the ~600 MB model (~5–15 s of overhead per file) since there's no parent-side model cache anymore. That's acceptable for the one-file-at-a-time workflow the issue reporter is using; a long-lived worker with a request pipe is a follow-up if batch throughput becomes a concern.

The 60s chunks and mx.clear_cache stay (now inside the worker). Smaller chunks make the worker itself less likely to crash and need restarting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
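The isolation pattern described in this commit message can be sketched in a few lines. This is a minimal illustration, not the code from the PR: `run_worker`, `transcribe_with_fallback`, and the two inline child scripts are hypothetical names invented here, and the crashing child uses `os.abort()` as a stand-in for the Metal SIGABRT.

```python
import json
import subprocess
import sys


def run_worker(child_code: str, timeout: float = 60.0) -> dict:
    """Run child_code in a fresh interpreter and parse its JSON stdout."""
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()
        raise RuntimeError("worker timed out")
    if proc.returncode != 0:
        # On POSIX a negative returncode means the child was killed by a
        # signal; an abort() in the child shows up here as a nonzero code.
        raise RuntimeError(
            f"worker exited with {proc.returncode}: {err.strip()[-200:]}"
        )
    return json.loads(out)


def transcribe_with_fallback(engines):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, fn in engines:
        try:
            result = fn()
            result["engine"] = name
            return result
        except RuntimeError as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


# Stand-ins for the real workers (assumptions for illustration only).
CRASHING_CHILD = "import os; os.abort()"
HEALTHY_CHILD = (
    "import json; print(json.dumps({'segments': [], 'language': 'en'}))"
)

result = transcribe_with_fallback([
    ("parakeet", lambda: run_worker(CRASHING_CHILD)),
    ("whisper", lambda: run_worker(HEALTHY_CHILD)),
])
print(result["engine"])
```

The point of the sketch is that the abort happens in a separate process, so the parent only ever sees an ordinary Python exception that the fallback loop can handle.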
Deploying with
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | doza-assist | e6ec6fa | May 25 2026, 01:13 PM |
Summary
- Move the chunked Parakeet decode into a `parakeet_worker.py` script and spawn it via `subprocess.Popen([sys.executable, ...])` from `_transcribe_parakeet`.
- The parent raises `RuntimeError` on any nonzero exit (SIGABRT included), and the existing fall-through to WhisperX/Whisper handles it.

Why
The v3.4.1 mitigation (60s chunks + `mx.synchronize` + `mx.clear_cache`) made the Metal crash rarer but couldn't eliminate it. `mlx::core::gpu::check_error` raises a C++ exception inside Metal's `addCompletedHandler`, which has no catch block, so it unwinds to `std::terminate` → `abort()`. Python can't trap that, so the whole interpreter dies, taking Flask with it. The browser then renders the generic "Load failed" page reported in #23.

With process isolation the SIGABRT kills only the worker. Flask stays up, the outer `transcribe_file` catches the worker error, and the existing fallback chain transcribes the file with WhisperX/Whisper instead.

Trade-off
Each Parakeet call now reloads the ~600 MB model (~5–15 s of overhead per file) since the parent-side model cache is gone. That is acceptable for the one-file-at-a-time workflow the reporter is using; a long-lived worker with a request pipe is a follow-up if batch throughput matters later.
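For reference, the long-lived worker mentioned as a follow-up could look roughly like the sketch below. Nothing here exists in this PR: `LongLivedWorker`, `WORKER_SRC`, and the JSON-lines protocol are hypothetical, the "model load" is a placeholder, and a `crash` flag in the request simulates the Metal abort. The idea is that the child stays alive across requests (so the model loads once) and the parent respawns it only after it dies.

```python
import json
import subprocess
import sys

# Hypothetical worker body: one-time setup, then one JSON reply per request line.
WORKER_SRC = r"""
import json, os, sys
# (an expensive one-time model load would happen here)
for line in sys.stdin:
    req = json.loads(line)
    if req.get("crash"):
        os.abort()  # simulate the SIGABRT from Metal
    reply = {"path": req["path"], "pid": os.getpid()}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class LongLivedWorker:
    def __init__(self):
        self.proc = None

    def _ensure_started(self):
        # Spawn on first use, and again whenever the previous child has exited.
        if self.proc is None or self.proc.poll() is not None:
            self.proc = subprocess.Popen(
                [sys.executable, "-c", WORKER_SRC],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
            )

    def request(self, payload: dict) -> dict:
        self._ensure_started()
        try:
            self.proc.stdin.write(json.dumps(payload) + "\n")
            self.proc.stdin.flush()
            line = self.proc.stdout.readline()
        except (BrokenPipeError, OSError):
            line = ""
        if not line:
            # EOF on the pipe: the child is gone. Reap it and report the failure.
            code = self.proc.wait()
            self.proc = None
            raise RuntimeError(f"worker died with exit code {code}")
        return json.loads(line)

    def close(self):
        if self.proc is not None and self.proc.poll() is None:
            self.proc.stdin.close()  # EOF ends the child's read loop
            self.proc.wait()
            self.proc.stdout.close()
        self.proc = None


worker = LongLivedWorker()
first = worker.request({"path": "a.wav"})
second = worker.request({"path": "b.wav"})  # served by the same child process
try:
    worker.request({"path": "c.wav", "crash": True})
    crashed = False
except RuntimeError:
    crashed = True
third = worker.request({"path": "d.wav"})  # a fresh child is spawned on demand
worker.close()
```

Compared with the per-call subprocess in this PR, this design would trade a little protocol and lifecycle code for paying the model load only once per worker lifetime.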
Test plan
- `_transcribe_parakeet` round-trip via the parent returns the same dict shape `app.py` expects (`segments`, `language`, `duration`, `engine`, plus per-segment `start_formatted` / `end_formatted` / `words`).
- … `RuntimeError` carrying the child's exception message; outer fallback chain takes over.
- … `app.py` still imports cleanly.
- … `num_ctx` kwarg in chat/editorial_dna tests).

🤖 Generated with Claude Code