fix(whisper): bound version guard to < 4.8 + guard explicit CUDA devices by anandray · Pull Request #1052 · rocketride-org/rocketride-server

anandray · 2026-06-01T22:13:25Z

Summary

Follow-up to #1043, addressing two non-blocking nits from kwit75's review:

1. Bound the version guard to (4,7) <= ct2 < (4,8)

The previous ct2 >= (4, 7) check would have kept 4.8/4.9/5.x forced to CPU even after CTranslate2 ships a fix. Adding < (4, 8) as an upper bound means the guard auto-lifts once 4.8+ is installed.
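
A minimal sketch of the bounded check, assuming the version string is parsed into an integer (major, minor) tuple as described in the review further down. The helper names `parse_major_minor` and `in_affected_range` are hypothetical and do not come from whisper.py.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, ...]:
    """Return the first two version components as ints, e.g. '4.7.1' -> (4, 7)."""
    return tuple(int(part) for part in version.split(".")[:2])


def in_affected_range(version: str) -> bool:
    """True only for the 4.7.x family; 4.8 and later fall outside the guard."""
    return (4, 7) <= parse_major_minor(version) < (4, 8)
```

Because the components are compared as integers, a version such as 4.10.0 is treated as newer than 4.8 and is not caught by the guard, which a plain string comparison would get wrong.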

2. Guard explicit device='cuda'/device='cuda:N' in local mode

_check_gpu_compatible() was only called on the auto-detect path (device=None). Callers passing an explicit CUDA device bypassed the probe entirely and could still hit the SIGABRT. The fix adds the same probe check for any non-CPU explicit device:

elif device != 'cpu' and not WhisperLoader._check_gpu_compatible():
    logger.warning('ctranslate2 CUDA probe failed for explicit device=%r — will use CPU', device)
    device = 'cpu'

Server mode was already fully protected (probe runs before allocate_gpu, which is the only GPU selection path in that mode).
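
The local-mode branch above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in whisper.py: `resolve_local_device` is a hypothetical name, the probe is passed in as a callable so the example runs without ctranslate2, and the auto-detect branch is simplified to "use cuda if the probe passes".

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def resolve_local_device(device: Optional[str], gpu_ok: Callable[[], bool]) -> str:
    """Pick the device for local mode, demoting to CPU when the probe fails."""
    if device is None:
        # Auto-detect path (simplified assumption): GPU only if the probe passes.
        device = 'cuda' if gpu_ok() else 'cpu'
    elif device != 'cpu' and not gpu_ok():
        # Explicit 'cuda' or 'cuda:N': same probe, same CPU fallback.
        logger.warning('CUDA probe failed for explicit device=%r; will use CPU', device)
        device = 'cpu'
    return device
```

With `device='cpu'` the `and` short-circuits, so the probe callable is never invoked.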

Test plan

./builder model_server:test — 35 passed, 11 deselected on 8× H200

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Enhanced GPU compatibility checks for audio processing with automatic fallback to CPU when GPU requirements aren't met, ensuring stable operation across different hardware configurations.

… devices

Addresses two post-merge nits from kwit75 on #1043:

1. Narrow version guard to (4,7) <= ct2 < (4,8) so the restriction auto-lifts when ctranslate2 4.8+ ships the cuBLAS 12.8.4 fix, rather than trapping 4.8/4.9/5.x indefinitely.

2. Run _check_gpu_compatible() for explicit device='cuda'/'cuda:N' in local mode, not just for auto-detect (device=None). The SIGABRT can occur regardless of how the CUDA device was selected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-01T22:13:36Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7bcc2c9f-5de3-47f2-92f1-3269e28624e6

📥 Commits

Reviewing files that changed from the base of the PR and between 08558ce and ee9b364.

📒 Files selected for processing (1)

packages/ai/src/ai/common/models/audio/whisper.py

📝 Walkthrough

WhisperLoader's GPU compatibility probe is tightened to enforce a specific version range (ctranslate2 4.7.x, i.e. (4,7) <= ct2 < (4,8), with CUDA 12.8) rather than a broad ≥4.7 check, and the accompanying comment is updated to match. When the probe fails for an explicitly requested non-CPU device, the loader now logs a warning and falls back to CPU instead of attempting GPU initialization.

Changes

GPU Compatibility Handling in WhisperLoader

Layer / File(s)	Summary
GPU compatibility version guard `packages/ai/src/ai/common/models/audio/whisper.py`	Comment describing the ctranslate2 CUDA 12.8 compatibility probe is updated. The version check condition changes from `ctranslate2 >= 4.7` to bounded range `(4,7) <= ct2 < (4,8)` to match upstream fix timing expectations.
Device fallback when GPU incompatible `packages/ai/src/ai/common/models/audio/whisper.py`	Local-mode model loading now detects when a non-CPU device is explicitly requested but GPU compatibility fails, logs a warning, and forces the device to CPU rather than proceeding with incompatible GPU initialization.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

rocketride-org/rocketride-server#1036: Also modifies WhisperLoader GPU-compatibility logic and device-selection fallback behavior for CTranslate2 CUDA compatibility.
rocketride-org/rocketride-server#1043: Further tightens _check_gpu_compatible() ctranslate2 version guard logic with the same GPU-to-CPU fallback approach.

Suggested labels

module:ai

Suggested reviewers

jmaionchi
stepmikhaylov
Rod-Christensen

Poem

🐰 When GPUs dance with CUDA's glow,
We bound the versions—four-point-oh,
If compatibility fades to black,
We gracefully fall to CPUs back,
With warnings logged for those who know.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the two main changes: narrowing the CTranslate2 version guard to <4.8 and adding guards for explicit CUDA devices in local mode.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


github-actions · 2026-06-01T22:13:40Z

No description provided.

kwit75

LGTM — both nits from #1043 are correctly and completely addressed. Traced it end-to-end:

1. Version bound (4, 7) <= ct2 < (4, 8) — ct2 = tuple(int(x) for x in ctranslate2.__version__.split(".")[:2]) is always a 2-tuple, so this fires on exactly the 4.7.x family and auto-lifts at the first 4.8 release (4.8.0 → (4,8) fails < (4,8); 4.10 → (4,10) and 5.x → (5,0) correctly excluded — no lexical-string pitfall since components are int-cast). Comment now matches behavior. The parse-failure sentinel (999,999) skips the version gate but the real StorageView.from_array check still runs in the isolated subprocess, so an incompatible build still SIGABRTs there → returncode != 0 → CPU. Fail-open is safe.

2. Explicit-device fallback — the new elif device != 'cpu' and not _check_gpu_compatible() in local mode is correct on every case:

- compatible explicit cuda/cuda:N → the elif condition evaluates to True and not True, i.e. False → device unchanged → still uses GPU (no regression);
- incompatible explicit cuda/cuda:N → falls to CPU with a warning;
- device='cpu' → short-circuits on device != 'cpu', so the (subprocess) probe isn't even invoked;
- the downstream ':' in device index parse and the device is None auto branch are untouched; server mode was already guarded at the allocate_gpu branch.
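
The subprocess isolation mentioned in point 1 can be illustrated with a generic sketch: a hard abort in the child process shows up as a non-zero return code, so the parent survives. The function name `probe_in_subprocess`, the `timeout` parameter, and the snippets used are assumptions for illustration. According to the review, the actual probe exercises StorageView.from_array, and that code is not reproduced here.

```python
import subprocess
import sys


def probe_in_subprocess(snippet: str, timeout: float = 60.0) -> bool:
    """Run `snippet` in a fresh interpreter; True only if it exits with code 0."""
    try:
        result = subprocess.run(
            [sys.executable, '-c', snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Could not launch or finish the probe: treat as incompatible.
        return False
    # A child that aborts or raises ends with a non-zero return code.
    return result.returncode == 0
```

For example, `probe_in_subprocess('import os; os.abort()')` reports failure without taking down the calling process, which is the property the fallback relies on.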

Probe is cached + subprocess-isolated, so the explicit path reuses the result and the StorageView check is the backstop. The float16 → int8 CPU fallback still applies after the new demotion (separate if device == 'cpu' block runs after).
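
A rough sketch of how a cached probe and the two-step demotion might fit together, assuming the behaviour described above. `make_cached_probe` and `apply_fallbacks` are hypothetical names, and the real loader may cache the result differently.

```python
from functools import lru_cache
from typing import Callable, Tuple


def make_cached_probe(probe: Callable[[], bool]) -> Callable[[], bool]:
    """Wrap a probe so the expensive check runs at most once."""
    @lru_cache(maxsize=1)
    def cached() -> bool:
        return probe()
    return cached


def apply_fallbacks(device: str, compute_type: str,
                    gpu_ok: Callable[[], bool]) -> Tuple[str, str]:
    """Demote the device first, then the compute type in a separate step."""
    if device != 'cpu' and not gpu_ok():
        device = 'cpu'
    # Separate block: runs after any demotion above, so it also covers it.
    if device == 'cpu' and compute_type == 'float16':
        compute_type = 'int8'
    return device, compute_type
```

Keeping the compute-type check as its own block after the device check means a device demoted by the probe gets the same float16 to int8 treatment as one that was requested as CPU from the start.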

Approving. Only the Build matrix is still pending — merge once it greens.

One ordering note for downstream: saas #182 currently pins the submodule at 08558ce5 (#1043 only). After this lands, re-point #182 to the post-#1052 develop tip so SaaS picks up the nits too, then merge #182.

anandray requested review from Rod-Christensen, jmaionchi and stepmikhaylov as code owners June 1, 2026 22:13

anandray requested review from asclearuc, dsapandora and kwit75 June 1, 2026 22:13

github-actions Bot added the module:ai AI/ML modules label Jun 1, 2026

ci: retrigger CI (Windows HuggingFace network flake)

1c5a953

kwit75 approved these changes Jun 1, 2026


anandray enabled auto-merge (squash) June 1, 2026 22:47

anandray disabled auto-merge June 1, 2026 22:48

kwit75 merged commit da2a7d2 into develop Jun 1, 2026
20 checks passed

kwit75 deleted the fix/whisper-probe-nits branch June 1, 2026 23:18

kwit75 merged 2 commits into develop from fix/whisper-probe-nits
