Poke#12
Merged
A manager can pause mid-task ('done X, next I'll do Y') and end its turn
without acting, leaving the objective ACTIVE with nothing driving it. The
only manager nudges were reactive (worker-finished, PR-merged) and were
dropped if the manager session was already terminal.
Add a supervisor tick to the scheduler loop. For each active objective it
re-pokes the live manager ONLY when no worker is making progress -- a
non-terminal worker (even one on a long quiet build) counts as activity, so
a progressing objective is left alone. A staleness check skips a manager
that's actively streaming output, and a per-objective cooldown keeps it from
re-poking every tick. The poke carries an inline state snapshot (worker
statuses + PRs) since the manager has no read tools of its own.
Live-manager only; respawning a terminal manager is left for a follow-up.
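A minimal sketch of the poke gate described above, in Go. The type, field, and constant names here (`ObjectiveState`, `shouldPoke`, `staleAfter`, `pokeCooldown`) and the threshold values are assumptions made for illustration; the PR text does not show the real ones.

```go
package main

import "time"

// WorkerStatus is a hypothetical status enum for this sketch.
type WorkerStatus string

const (
	WorkerRunning   WorkerStatus = "running"
	WorkerCompleted WorkerStatus = "completed"
	WorkerFailed    WorkerStatus = "failed"
)

// terminal reports whether the worker has stopped for good.
func (s WorkerStatus) terminal() bool {
	return s == WorkerCompleted || s == WorkerFailed
}

// Assumed thresholds; the actual values are not stated in the PR.
const (
	staleAfter   = 2 * time.Minute  // manager quiet this long counts as stalled
	pokeCooldown = 10 * time.Minute // minimum gap between pokes per objective
)

// ObjectiveState is what one supervisor tick sees for an ACTIVE objective.
type ObjectiveState struct {
	ManagerLive     bool          // manager session is non-terminal
	ManagerQuietFor time.Duration // time since the manager last streamed output
	EverPoked       bool          // false until the first poke is sent
	SinceLastPoke   time.Duration // only meaningful when EverPoked is true
	Workers         []WorkerStatus
}

// shouldPoke reports whether the tick should re-poke the manager.
func shouldPoke(o ObjectiveState) bool {
	if !o.ManagerLive {
		return false // live-manager only; no respawn in this commit
	}
	for _, w := range o.Workers {
		if !w.terminal() {
			return false // a non-terminal worker counts as progress
		}
	}
	if o.ManagerQuietFor < staleAfter {
		return false // manager is still streaming output
	}
	if o.EverPoked && o.SinceLastPoke < pokeCooldown {
		return false // cooldown has not elapsed
	}
	return true
}
```

Modelling the timing inputs as elapsed durations rather than timestamps keeps the check a pure function of its argument, which makes it straightforward to test without a clock.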
The idle supervisor only re-poked a LIVE manager; if the manager session went terminal (completed/quiesced/failed) while the objective was still active, nothing could react to worker completions or PR events and the objective stalled forever.
Extend the supervisor: when an active objective has no live manager and no worker is making progress, respawn one. The fresh manager gets the original prompt plus an inline state snapshot (worker statuses + PRs) framed as a resume, so it continues rather than restarting completed work or duplicating workers.
Respawns are cooldown-gated like pokes and capped per objective (maxManagerSessions); past the cap the supervisor stops respawning and raises a user-facing question instead of burning tokens on a manager that keeps dying.
Decision logic (superviseDecisions) is split from the side effects so poke / respawn / escalate are unit-tested directly.
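The split between decision logic and side effects can be sketched as a pure function that maps a snapshot to an action. This is not the PR's `superviseDecisions`; `Snapshot`, `Action`, and `decide` are hypothetical names, and the ordering of the cap check before the cooldown check is an assumption.

```go
package main

// Action is what the supervisor should do for one objective on this tick.
type Action int

const (
	ActionNone Action = iota
	ActionPoke
	ActionRespawn
	ActionEscalate
)

// Snapshot holds the inputs to the decision (all field names assumed).
type Snapshot struct {
	ManagerLive      bool // a non-terminal manager session exists
	ManagerStreaming bool // staleness check: manager is producing output
	WorkerProgress   bool // at least one non-terminal worker
	CooldownActive   bool // a poke or respawn happened too recently
	ManagerSessions  int  // managers spawned so far for this objective
}

// decide has no side effects, so each branch can be unit-tested directly.
func decide(s Snapshot, maxManagerSessions int) Action {
	if s.WorkerProgress {
		return ActionNone // a progressing objective is left alone
	}
	if s.ManagerLive {
		if s.ManagerStreaming || s.CooldownActive {
			return ActionNone
		}
		return ActionPoke
	}
	// No live manager and no worker progress.
	if s.ManagerSessions >= maxManagerSessions {
		return ActionEscalate // stop respawning, ask the user instead
	}
	if s.CooldownActive {
		return ActionNone
	}
	return ActionRespawn
}
```

A caller would switch on the returned action and perform the poke, respawn, or user-facing question there. This sketch returns `ActionEscalate` on every tick past the cap, so the caller would need to avoid raising the same question repeatedly.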
Per-session checkouts (<WorkRoot>/<sessionID>) were created during prepare and never removed, so long-lived targets accumulated a full repo checkout per worker forever -- nothing in cleanupRun touched disk, and the archived/dirty workspace statuses were defined but never set.
Add a periodic GC pass (every 5m, plus a startup sweep) that rm -rf's a workspace's checkout on its target and marks it archived. It is deliberately conservative to avoid the sharing hazard (dependents inherit a checkout, the manager publishes a PR from a worker's checkout, follow-ups reuse a PR-branch checkout): a workspace is reclaimed only once its OBJECTIVE is terminal AND no non-terminal session still references it. The shared bare mirror cache is kept.
Passes run under a TryLock so they don't overlap and are idempotent (archived workspaces are skipped); a transient remote rm failure is retried next pass, while a vanished target archives to stop retrying. GC runs async so a remote rm never stalls scheduling.
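The reclaim rule and the non-overlapping pass can be sketched as follows. `Workspace`, `Session`, `reclaimable`, and `runGCPass` are hypothetical names for illustration; the only real API relied on is `sync.Mutex.TryLock` (Go 1.18+). The actual removal of the checkout and the retry handling are left out.

```go
package main

import "sync"

// Workspace is a per-session checkout record (fields assumed).
type Workspace struct {
	ID                string
	ObjectiveTerminal bool // the owning objective has finished
	Archived          bool // checkout already removed by an earlier pass
}

// Session references the workspace it runs in (fields assumed).
type Session struct {
	WorkspaceID string
	Terminal    bool
}

// reclaimable returns the IDs of workspaces that are safe to delete:
// objective terminal, not yet archived, and not referenced by any
// non-terminal session (which covers inherited and reused checkouts).
func reclaimable(workspaces []Workspace, sessions []Session) []string {
	inUse := make(map[string]bool)
	for _, s := range sessions {
		if !s.Terminal {
			inUse[s.WorkspaceID] = true
		}
	}
	var out []string
	for _, w := range workspaces {
		if w.Archived || !w.ObjectiveTerminal || inUse[w.ID] {
			continue
		}
		out = append(out, w.ID)
	}
	return out
}

var gcMu sync.Mutex

// runGCPass runs pass unless another pass is already in flight.
// It returns false when the pass was skipped.
func runGCPass(pass func()) bool {
	if !gcMu.TryLock() {
		return false // previous pass still running; try again next interval
	}
	defer gcMu.Unlock()
	pass()
	return true
}
```

Skipping archived workspaces is what makes a repeated pass a no-op, and `TryLock` drops an overlapping pass rather than queueing it, so a slow remote removal cannot pile up work behind it.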