Problem
notification_pump and MCP request handlers share a single Arc<Mutex<Translator>>. Under a slow or busy LSP server this produces a head-of-line deadlock:
- get_diagnostics acquires the lock and calls pull_diagnostics, a textDocument/diagnostic request with a 30 s timeout.
- While the pull is in-flight, the LSP server publishes publishDiagnostics on stdout. notification_pump tries to acquire the same lock and stalls.
- The bounded notification channel fills. The bridge's stdout reader back-pressures and stops reading.
- The textDocument/diagnostic response the lock-holder is waiting for can no longer arrive. The system stalls until the 30 s timeout fires.
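The sequence above can be reproduced in miniature with standard-library threads. This is a minimal sketch, not the mcpls code: a unit Mutex stands in for Mutex<Translator>, std::sync::mpsc::sync_channel stands in for the bounded notification channel, and the function name pull_with_shared_lock, its capacity and pushes parameters, and the shortened timeout are all assumptions made for illustration.

```rust
use std::sync::mpsc::{channel, sync_channel};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Simulates one diagnostics pull while the server pushes `pushes` notifications.
/// Returns true if the response reached the handler before `timeout`.
/// All names here are hypothetical stand-ins, not the real mcpls API.
fn pull_with_shared_lock(capacity: usize, pushes: usize, timeout: Duration) -> bool {
    let translator = Arc::new(Mutex::new(())); // stand-in for Mutex<Translator>
    let (note_tx, note_rx) = sync_channel::<usize>(capacity); // bounded notification channel
    let (resp_tx, resp_rx) = channel::<&'static str>();

    // Request handler: takes the lock first and keeps it for the whole round-trip.
    let guard = translator.lock().unwrap();

    // Stdout reader: pushed notifications come first, the response after them.
    let reader = thread::spawn(move || {
        for i in 0..pushes {
            note_tx.send(i).unwrap(); // blocks once the channel is full
        }
        let _ = resp_tx.send("diagnostic response"); // handler may have given up already
    });

    // Pump: needs the same lock to store each notification.
    let pump_lock = Arc::clone(&translator);
    let pump = thread::spawn(move || {
        let mut stored: usize = 0;
        for _note in note_rx {
            let _guard = pump_lock.lock().unwrap(); // stalls while the handler holds the lock
            stored += 1;
        }
        stored
    });

    // The handler waits for its response while still holding the lock.
    let got_response = resp_rx.recv_timeout(timeout).is_ok();
    drop(guard); // the timeout fired or the response arrived: release the lock

    reader.join().unwrap();
    let stored = pump.join().unwrap();
    assert_eq!(stored, pushes); // everything drains once the lock is free
    got_response
}
```

Calling pull_with_shared_lock(2, 8, Duration::from_millis(200)) should return false, because the reader blocks on the full channel before it can deliver the response. When pushes does not exceed capacity, the reader never blocks and the call returns true, which matches the observation that the stall only appears once the channel fills.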
The drop(guard) before rx.recv() in lib.rs:123 shows the intent to release the lock between messages, but the pump re-acquires it immediately on the next iteration — so under sustained push traffic it never releases long enough.
Introduced implicitly by #103, which wired publishDiagnostics into the pump for the first time and made the contention window reachable in normal operation.
Affected code
crates/mcpls-core/src/lib.rs — notification_pump (line 99) and the Mutex<Translator> it shares with serve
crates/mcpls-core/src/bridge/translator.rs — notification_cache_mut accessor used by the pump
Fix
Extract NotificationCache into its own Arc<Mutex<NotificationCache>>, independent of Translator. The pump then holds only the cache lock (a fast HashMap::insert), never competing with request handlers that hold the translator lock across LSP round-trips.
// lib.rs (before)
async fn notification_pump(
    ...,
    translator: Arc<Mutex<Translator>>,
) {
    while let Some(note) = rx.recv().await {
        let mut guard = translator.lock().await; // contends with request handlers
        let cache = guard.notification_cache_mut();
        // ...
    }
}

// lib.rs (after)
async fn notification_pump(
    ...,
    cache: Arc<Mutex<NotificationCache>>,
) {
    while let Some(note) = rx.recv().await {
        let mut guard = cache.lock().await; // independent lock, never held across LSP I/O
        // ...
    }
}
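The effect of the split can be checked with a small standard-library simulation. This is a sketch under stated assumptions rather than the proposed patch: threads and std::sync::Mutex stand in for the async tasks and async mutex, the NotificationCache fields are a guessed shape (one diagnostics string per URI), and pull_with_split_locks with its parameters is a hypothetical name.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, sync_channel};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical cache shape: the latest diagnostics text per document URI.
#[derive(Default)]
struct NotificationCache {
    diagnostics: HashMap<String, String>,
}

/// Simulates one diagnostics pull with the cache behind its own lock.
/// Returns (response arrived before `timeout`, number of cached entries).
fn pull_with_split_locks(capacity: usize, pushes: usize, timeout: Duration) -> (bool, usize) {
    let translator = Arc::new(Mutex::new(())); // stand-in for Mutex<Translator>
    let cache = Arc::new(Mutex::new(NotificationCache::default()));
    let (note_tx, note_rx) = sync_channel::<(String, String)>(capacity);
    let (resp_tx, resp_rx) = channel::<&'static str>();

    // Request handler: still holds the translator lock across the round-trip.
    let guard = translator.lock().unwrap();

    // Stdout reader: pushed notifications first, then the response.
    let reader = thread::spawn(move || {
        for i in 0..pushes {
            let note = (format!("file:///doc{i}.rs"), format!("diag {i}"));
            note_tx.send(note).unwrap();
        }
        let _ = resp_tx.send("diagnostic response");
    });

    // Pump: touches only the cache lock, held just for the insert.
    let pump_cache = Arc::clone(&cache);
    let pump = thread::spawn(move || {
        for (uri, diag) in note_rx {
            pump_cache.lock().unwrap().diagnostics.insert(uri, diag);
        }
    });

    let got_response = resp_rx.recv_timeout(timeout).is_ok();
    drop(guard);

    reader.join().unwrap();
    pump.join().unwrap();
    let cached = cache.lock().unwrap().diagnostics.len();
    (got_response, cached)
}
```

With capacity 2 and 8 pushes, a configuration that would overflow the channel if the pump were blocked, the pump keeps draining while the translator lock is held, so the response should arrive well inside the timeout and all 8 entries should land in the cache.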
Blocked By
#102
Severity
High — reproducible under any LSP server that pushes diagnostics during a get_diagnostics call (rust-analyzer, tsgo, pyright). Manifests as a 30 s stall on every diagnostics request once the notification channel fills.