Skip to content

fix(stt): prevent GC of actively selected plugin (#284)#366

Open
Coldaine wants to merge 2 commits intomainfrom
agent/coldvox/issue-triage-batch-1
Open

fix(stt): prevent GC of actively selected plugin (#284)#366
Coldaine wants to merge 2 commits intomainfrom
agent/coldvox/issue-triage-batch-1

Conversation

@Coldaine
Copy link
Owner

@Coldaine Coldaine commented Mar 2, 2026

Summary

Fixes a race condition in SttPluginManager where the GC timer could unload the currently active STT plugin, causing unexpected STT failures.

Changes

  • crates/app/src/stt/plugin_manager.rs: Added guard in both GC code paths to never collect the actively selected plugin
  • Cargo.lock: Bump thiserror 2.0.17 → 2.0.18

Notes

AI Agent added 2 commits March 2, 2026 15:30
Guard both GC code paths in SttPluginManager against collecting the currently
active plugin. Fixes a race where the GC timer could unload the plugin being
actively used, causing unexpected STT failures.

- Check current_plugin before filtering inactive entries in run_gc()
- Apply same guard in the background gc_task polling loop
- Rename 'now' to 'time_threshold' for clarity
- Bump Cargo.lock: thiserror 2.0.17 -> 2.0.18
Copilot AI review requested due to automatic review settings March 2, 2026 21:31
@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Prevents the STT plugin GC from unloading the actively selected plugin in SttPluginManager, addressing a reported race condition that can cause unexpected STT failures, and includes minor repo hygiene updates.

Changes:

  • Add guards in both GC paths to exclude the currently selected STT plugin from GC candidate selection.
  • Bump thiserror in Cargo.lock from 2.0.17 to 2.0.18.
  • Ignore generated repo metadata JSON files in .gitignore.

Reviewed changes

Copilot reviewed 1 out of 3 changed files in this pull request and generated 5 comments.

File Description
crates/app/src/stt/plugin_manager.rs Adds “never GC selected plugin” guards to GC candidate collection paths.
Cargo.lock Updates locked thiserror version.
.gitignore Ignores generated metadata JSON files to avoid accidental commits.

Comment on lines 290 to +292
let activity = last_activity.read().await;
let current = current_plugin.read().await;
let current_id = current.as_ref().map(|p| p.info().id.clone());
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential deadlock introduced by taking last_activity.read().await and then current_plugin.read().await while other hot paths (e.g., process_audio, switch_plugin, initialize) take current_plugin first and then acquire a last_activity write lock. This inverted lock order can deadlock under load. Consider reading current_id outside the last_activity lock scope (or otherwise avoid holding both locks at once) and, if needed, capture Instant::now() after acquiring the last_activity read lock to avoid duration_since panics from concurrent updates.

Copilot uses AI. Check for mistakes.
Comment on lines +297 to 301
// NEVER GC the actively selected plugin (#284)
if Some(plugin_id.clone()) == current_id {
return None;
}
if now.duration_since(*last_used).as_secs() > ttl_secs as u64 {
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the new “never GC current plugin” filter, inactive_plugins is computed from a snapshot, but the current plugin can change after this list is built. Because the unload loop later unloads whichever plugin is current if its id matches plugin_id, a plugin selected after the snapshot could still be unloaded, reintroducing the #284 failure mode. Add a second guard at unload time (skip if plugin_id is currently selected, or re-check last_activity/TTL right before unloading).

Copilot uses AI. Check for mistakes.
Comment on lines +472 to +479
let time_threshold = Instant::now();
let ttl_secs = gc_policy.model_ttl_secs as u64;

// First, collect the IDs of inactive plugins
let inactive_plugins: Vec<String> = {
let activity = self.last_activity.read().await;
let current = self.current_plugin.read().await;
let current_id = current.as_ref().map(|p| p.info().id.clone());
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential deadlock introduced by taking last_activity.read().await and then current_plugin.read().await while other hot paths take current_plugin first and then acquire a last_activity write lock. This inverted lock order can deadlock. Consider reading current_id outside the last_activity lock scope (or otherwise avoid holding both locks at once) and capture Instant::now() after acquiring the last_activity read lock to avoid duration_since panics from concurrent updates.

Suggested change
let time_threshold = Instant::now();
let ttl_secs = gc_policy.model_ttl_secs as u64;
// First, collect the IDs of inactive plugins
let inactive_plugins: Vec<String> = {
let activity = self.last_activity.read().await;
let current = self.current_plugin.read().await;
let current_id = current.as_ref().map(|p| p.info().id.clone());
// Read the current plugin ID without holding the last_activity lock
let current_id = {
let current = self.current_plugin.read().await;
current.as_ref().map(|p| p.info().id.clone())
};
// First, collect the IDs of inactive plugins
let inactive_plugins: Vec<String> = {
let activity = self.last_activity.read().await;
let time_threshold = Instant::now();
let ttl_secs = gc_policy.model_ttl_secs as u64;

Copilot uses AI. Check for mistakes.
Comment on lines +484 to +488
// NEVER GC the actively selected plugin (#284)
if Some(plugin_id.clone()) == current_id {
return None;
}
if time_threshold.duration_since(*last_used).as_secs() > ttl_secs {
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same race as the GC task path: inactive_plugins is based on a snapshot, but the selected plugin can change after the list is computed. Since the unload loop later unloads the current plugin if its id matches plugin_id, a plugin that becomes selected after the snapshot can still be unloaded. Add an unload-time guard (skip if currently selected and/or re-check TTL from last_activity).

Copilot uses AI. Check for mistakes.
Comment on lines +297 to 301
// NEVER GC the actively selected plugin (#284)
if Some(plugin_id.clone()) == current_id {
return None;
}
if now.duration_since(*last_used).as_secs() > ttl_secs as u64 {
Copy link

Copilot AI Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new safety behavior (never GC the selected plugin), but there’s no targeted test exercising the case where the current plugin’s last_activity is older than the TTL and GC runs. Adding a unit test that backdates last_activity for the current plugin and asserts GC does not unload it would prevent regressions of #284.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Critical] Plugin manager GC can unload active plugins

2 participants