v0.0.7.0 feat: decision inbox v1 — trigger-based prompt + claude --resume fix#131
Merged
Fullstop000 merged 3 commits intomainfrom May 1, 2026
Merged
v0.0.7.0 feat: decision inbox v1 — trigger-based prompt + claude --resume fix#131Fullstop000 merged 3 commits intomainfrom
Fullstop000 merged 3 commits intomainfrom
Conversation
Reland after the May-2026 dogfood revert. Two distinct bugs surfaced
during that postmortem; both are fixed here, then verified live.
## What this ships
- Storage: `decisions` table (CAS-protected resolve), Store methods,
4 unit tests.
- Lifecycle: `AgentLifecycle::resume_with_prompt` + `run_channel_id`.
Routes to `handle.prompt(...)` for live agents; falls back to
`start_agent(init_directive=envelope)` for asleep ones. Reverts
the row to open if delivery fails so the human's pick isn't lost.
- Bridge: `chorus_create_decision` MCP tool with structural validator
at the boundary. Backend forwards to `/internal/agent/{id}/decisions`.
- Handlers: 3 routes — internal create with channel-inference contract
(400 if no active-run channel), public list with status filter,
public resolve doing CAS + envelope build + resume_with_prompt
+ revert-on-failure.
- Prompt rewrite (the core change vs v0): trigger-based mandatory
framing instead of permissive "when you need". Critical-rules
splits "conversation channel" (send_message) from "verdict channel"
(chorus_create_decision) instead of conflicting "your only output
channel" + buried exception. Drops the "things you can act on
unilaterally" loophole entirely.
- UI: `DecisionsInbox` component with click-to-pick, recommended-
option highlight, optional human note, collapsible context, 5s
polling. Sidebar inbox icon toggles the view.
- Tests: 339 lib + 80 e2e (4 new round-trip cases) + 89 vitest. All
pass. clippy --all-targets -D warnings clean. cargo fmt clean.
## Pre-existing bug fixed: claude --resume on missing session file
Diagnosis from the dogfood postmortem: chorus persists `session_id`
in `agent_sessions` and passes it to `claude --resume` on every
restart. Claude hard-errors with `error_during_execution` and zero
events when the session file is missing locally — which surfaces
in chorus as an immediate `reason=Natural` turn end. Every prior
"agent did nothing" failure was actually this, not a prompt issue.
Fix: `claude.rs` verifies the session file exists at
`~/.claude/projects/<encoded-cwd>/<session_id>.jsonl` before passing
`--resume`. Falls back to a fresh session with a `warn!` on miss.
Regression test: `missing_session_file_drops_resume_flag`.
## Live cross-driver verification (real models, real runs)
| driver | model | result |
|----------|-------------------------------|---------------------------------|
| claude | sonnet | ✅ emits + full round-trip |
| kimi | kimi-code/kimi-for-coding | ✅ emits + full round-trip |
| | | (received envelope, edited |
| | | members.rs per picked body) |
| codex | gpt-5.4-mini (fresh agent) | ✅ emits with all conventions |
| gemini | gemini-2.5-flash | ✅ emits autonomously |
| opencode | deepseek/deepseek-chat | ✅ emits with H2 sections |
All five drivers picked up the prompt section, recognized the
PR-review trigger, and emitted properly-shaped payloads (H2 sections,
[verified · source] / [inferred] evidence prefixes, recommended_key).
## Known follow-up bugs (not feature defects, file separately)
1. Codex / opencode have the same shape of resume bug as claude:
chorus passes a stale thread/session id to runtimes that no longer
hold it. Codex/opencode silently exit Natural with no work; claude
errors loudly. Fresh agents work in all three. Stage-2 fix: extend
the file-exists guard to per-driver liveness checks.
2. Gemini's MCP HTTP session expires on long-running turns (>20 min),
surfacing as `Unauthorized: Session not found` when chorus_create_
decision finally fires. rmcp's StreamableHttpService session TTL
needs raising or session re-init on expiry.
Lineage: chorus-design-reviews/explorations/2026-04-30-pr-review-vertical-slice/design.md (commit 3a38b22).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the brand prefix and switch the verb from "create" to "dispatch" so the tool name doesn't tie the abstraction to Chorus and reads as the action the agent is taking (dispatching a decision request to the human, who then picks). Live-verified: a fresh claude agent received a PR-review request, recognized the trigger from the renamed prompt section, and emitted `mcp__chat__dispatch_decision` with a properly-shaped 3-option payload (headline, question, options M/H/R, recommended_key=M). No backwards-compat shim — pre-merge rename, agents only learn the new name from the system prompt at startup. Tests: 339 lib + 80 e2e all pass. cargo fmt + clippy clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VERSION 0.0.6.0 → 0.0.7.0 for the decision-inbox v1 reland. Also adds two principles to AGENTS.md: YAGNI and "no cheating for the goal" (hard constraint). And queues 3 follow-ups in TODOS.md from the v0.0.7.0 dogfood: codex/opencode resume liveness guard, rmcp HTTP session TTL, and the validator/UI/revert-path coverage gaps. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reland of decision-inbox after the dogfood revert. Two distinct bugs surfaced during the postmortem; both are fixed here.
dispatch_decision(renamed from the brand-prefixed name) when an incoming request asks for a verdict — PR review, A-vs-B implementation, config flag. Human picks in a sidebar inbox; the agent's session resumes with the picked option's body inlined and acts on it. Server-side stores rows with CAS-protected resolve and reverts on delivery failure so picks aren't silently lost. Bridge MCP tool with structural validator at the boundary.send_message) from the verdict channel (dispatch_decision) instead of conflicting "your only output channel" + buried exception. Drops the "things you can act on unilaterally" loophole.--resumeguard. Pre-existing bug: chorus passed stalesession_idtoclaude --resume, claude hard-errored witherror_during_executionand zero downstream events, surfacing in chorus as immediatereason=Naturalturn end. Driver now verifies~/.claude/projects/<encoded-cwd>/<session_id>.jsonlexists before passing--resume; falls back to a fresh session with awarn!on miss. Regression test included.Test Coverage
Pre-Landing Review
Codex (gpt-5.5, reasoning_effort=medium) ran as the outside voice. No blocking findings. Verdict: Merge (option M of M/H/R). The agent independently:
cargo test decision,cargo test missing_session_file_drops_resume_flag,cargo test claude_session_file_encodes_dots_and_slashes— all passed.claude_session_fileencoding matches actual files in~/.claude/projects/.UPDATE ... WHERE status = 'open'and the revert path inhandle_resolve_decision.Cross-driver verification (live, real models)
All five drivers verified emitting
dispatch_decisionautonomously on fresh agents:Plan Completion
All r7 design items shipped:
decisionstable + CAS resolve,resume_with_promptlifecycle method, channel-inference contract,dispatch_decisionMCP tool + bridge validator, real handlers (create/list/resolve), envelope builder, UI inbox with sidebar toggle, claude--resumeguard, prompt rewrite, e2e tests. Plus the post-revert tool rename (chorus_create_decision→dispatch_decision) per pre-merge feedback.Known follow-ups (queued in TODOS.md, not gating ship)
Naturalon stale thread/session id rather than erroring loudly.StreamableHttpServicesession TTL — long agent turns (>20 min) cause MCP session expiry; tool calls returnUnauthorized: Session not foundeven with valid payloads.resume_with_promptActive/Asleep paths, resume-failure → revert e2e, UI component tests.Test plan
cargo test— 528 tests pass (339 lib + 80 e2e + 22 store + 61 store_tests + 26 others)cargo clippy --all-targets -- -D warningscleancargo fmt --checkcleancd ui && npx tsc --noEmitclean, 85/85 vitest tests passdispatch_decisionautonomously with valid payloadsmembers.rsper the picked option's body)Lineage: chorus-design-reviews — 2026-04-30 vertical-slice design (commit 3a38b22).
🤖 Generated with Claude Code