Stuck sub-agents remain in `Running` status indefinitely, and each one holds a slot against the `max_agents` limit (default 10, hard max 20). No global heartbeat or timeout auto-cancels hung sub-agents, and the `cleanup()` method only removes terminal agents older than 1 hour, so it never reclaims a hung agent that is still `Running`. This causes #2603: users can't start new sessions because the slots are full of zombie sub-agents. Fix: add a configurable sub-agent heartbeat timeout (default 5 minutes) that auto-cancels non-responsive agents.
Source: investigation of #2603 slot exhaustion from hung sub-agents.
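
A minimal sketch of the proposed heartbeat timeout, written in Python for illustration only. It is not the project's actual code: `AgentRegistry`, `SubAgent`, `heartbeat()`, `reap_stale()`, and the `heartbeat_timeout_s` setting are hypothetical names, and only the `max_agents` default of 10 and the 5-minute timeout come from the description above. The idea is that each sub-agent records a timestamp when it reports progress, and any `Running` agent whose last heartbeat is older than the timeout is cancelled before slot availability is checked.

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    CANCELLED = "cancelled"


@dataclass
class SubAgent:
    # Hypothetical record; the real sub-agent type may look different.
    agent_id: str
    status: Status = Status.RUNNING
    last_heartbeat: float = field(default_factory=time.monotonic)


class AgentRegistry:
    """Hypothetical registry that frees slots held by non-responsive agents."""

    def __init__(self, max_agents=10, heartbeat_timeout_s=300.0):
        self.max_agents = max_agents                    # default 10, per the issue
        self.heartbeat_timeout_s = heartbeat_timeout_s  # proposed default: 5 minutes
        self.agents = {}

    def running_count(self):
        return sum(1 for a in self.agents.values() if a.status is Status.RUNNING)

    def heartbeat(self, agent_id, now=None):
        # Called by a sub-agent whenever it makes progress.
        now = time.monotonic() if now is None else now
        self.agents[agent_id].last_heartbeat = now

    def reap_stale(self, now=None):
        # Cancel every Running agent whose last heartbeat exceeds the timeout.
        now = time.monotonic() if now is None else now
        reaped = []
        for agent in self.agents.values():
            stale = now - agent.last_heartbeat > self.heartbeat_timeout_s
            if agent.status is Status.RUNNING and stale:
                agent.status = Status.CANCELLED
                reaped.append(agent.agent_id)
        return reaped

    def spawn(self, agent_id, now=None):
        # Reap first so zombie agents cannot block a new session.
        now = time.monotonic() if now is None else now
        self.reap_stale(now)
        if self.running_count() >= self.max_agents:
            raise RuntimeError("max_agents limit reached")
        agent = SubAgent(agent_id, last_heartbeat=now)
        self.agents[agent_id] = agent
        return agent
```

Reaping inside `spawn()` keeps the sketch free of background threads; a real fix might instead run the check on a periodic timer, and would also need to signal the cancelled agent's task rather than only flipping its status.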