Poke #12

Merged
nathanwhit merged 3 commits into main from poke
Jun 13, 2026

@nathanwhit

Owner

No description provided.

A manager can pause mid-task ('done X, next I'll do Y') and end its turn
without acting, leaving the objective ACTIVE with nothing driving it. The
only manager nudges were reactive (worker-finished, PR-merged) and were
dropped if the manager session was already terminal.

Add a supervisor tick to the scheduler loop. For each active objective it
re-pokes the live manager ONLY when no worker is making progress -- a
non-terminal worker (even one on a long quiet build) counts as activity, so
a progressing objective is left alone. A staleness check skips a manager
that's actively streaming output, and a per-objective cooldown keeps it from
re-poking every tick. The poke carries an inline state snapshot (worker
statuses + PRs) since the manager has no read tools of its own.

Live-manager only; respawning a terminal manager is left for a follow-up.

The idle supervisor only re-poked a LIVE manager; if the manager session
went terminal (completed/quiesced/failed) while the objective was still
active, nothing could react to worker completions or PR events and the
objective stalled forever.

Extend the supervisor: when an active objective has no live manager and no
worker is making progress, respawn one. The fresh manager gets the original
prompt plus an inline state snapshot (worker statuses + PRs) framed as a
resume, so it continues rather than restarting completed work or duplicating
workers. Respawns are cooldown-gated like pokes and capped per objective
(maxManagerSessions); past the cap the supervisor stops respawning and raises
a user-facing question instead of burning tokens on a manager that keeps
dying.

Decision logic (superviseDecisions) is split from the side effects so poke /
respawn / escalate are unit-tested directly.

Per-session checkouts (<WorkRoot>/<sessionID>) were created during prepare
and never removed, so long-lived targets accumulated a full repo checkout per
worker forever -- nothing in cleanupRun touched disk, and the archived/dirty
workspace statuses were defined but never set.

Add a periodic GC pass (every 5m, plus a startup sweep) that rm -rf's a
workspace's checkout on its target and marks it archived. It is deliberately
conservative to avoid the sharing hazard (dependents inherit a checkout, the
manager publishes a PR from a worker's checkout, follow-ups reuse a PR-branch
checkout): a workspace is reclaimed only once its OBJECTIVE is terminal AND no
non-terminal session still references it. The shared bare mirror cache is
kept. Passes run under a TryLock so they don't overlap and are idempotent
(archived workspaces are skipped); a transient remote rm failure is retried
next pass, while a vanished target archives to stop retrying. GC runs async so
a remote rm never stalls scheduling.

@nathanwhit nathanwhit enabled auto-merge (squash) June 13, 2026 05:19
@nathanwhit nathanwhit merged commit 58db586 into main Jun 13, 2026
1 check passed
@nathanwhit nathanwhit deleted the poke branch June 13, 2026 05:21

1 participant