feat: stub agent driver for QA acceleration by Fullstop000 · Pull Request #33 · Fullstop000/Chorus

Fullstop000 · 2026-04-02T17:00:48Z

Summary

- Add a lightweight stub agent binary (`crates/stub-agent/`) that echoes messages back through the MCP bridge with deterministic responses
- New `AgentRuntime::Stub` driver registered server-side but hidden from the UI's create-agent modal
- `CHORUS_E2E_LLM=stub` mode enables ~30 QA cases to run in sub-second time without real LLM backends
- Token extraction: messages like `reply with "hello"` get echoed back; fallback to `stub-reply-{n}`
- QA harness: `ensureStubTrio()`, `agentNames()` helpers, `stub-trio` preset, MSG-002 wired as proof-of-concept
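
The token extraction rule in the summary can be illustrated with a small sketch. This is not the code from `crates/stub-agent/src/main.rs` (the file table suggests the crate depends on `regex`); it is a standard-library approximation. The function names `extract_token` and `reply_for` are hypothetical, and two details are assumptions: that the token is whatever sits between the first pair of double quotes after `reply with`, and that the fallback counter advances only when the fallback is used.

```rust
/// Hypothetical helper: pull the echo token out of a message body.
/// Assumption: the token is the text between the first pair of double
/// quotes that follows the phrase `reply with`.
fn extract_token(body: &str) -> Option<&str> {
    let rest = &body[body.find("reply with")? + "reply with".len()..];
    let start = rest.find('"')? + 1;
    let len = rest[start..].find('"')?;
    Some(&rest[start..start + len])
}

/// Hypothetical helper: echo the token if present, otherwise fall back to
/// `stub-reply-{n}`. Assumption: the counter only advances on fallback.
fn reply_for(body: &str, counter: &mut u32) -> String {
    match extract_token(body) {
        Some(token) => token.to_string(),
        None => {
            *counter += 1;
            format!("stub-reply-{}", counter)
        }
    }
}

fn main() {
    let mut n = 0;
    // A message with an echo token is answered with that token.
    println!("{}", reply_for("please reply with \"hello\"", &mut n));
    // A message without the pattern gets the numbered fallback.
    println!("{}", reply_for("no pattern here", &mut n));
}
```

Because the reply depends only on the message text and a counter, a Playwright spec can assert on exact strings instead of waiting for free-form LLM output.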

Test plan

- `cargo build`: both `chorus` and `chorus-stub-agent` compile
- Integration smoke test: create stub agent via API, send DM with echo token, verify reply
- Fallback token: messages without patterns get a `stub-reply-{n}` response
- `/api/runtimes` hides stub from the UI
- `CHORUS_E2E_LLM=stub npx playwright test MSG-002.spec.ts`: end-to-end with Playwright

🤖 Generated with Claude Code

Spec for a lightweight stub agent binary that implements the MCP bridge protocol with deterministic echo responses, enabling ~30 QA cases to run in sub-second time without real LLM backends. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Call out resumable_session_id exhaustive match needing Stub arm
- Specify UI hiding: add to all_runtime_drivers, filter in handler
- Document that POST /api/agents allows stub without gating (acceptable)
- Note Cargo workspace introduction for crates/stub-agent
- Use distinct stub-a/b/c names to avoid collision with bot-a/b/c

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

9-task plan covering workspace setup, enum variant, driver impl, runtime filtering, MCP client binary, integration test, QA harness updates, and proof-of-concept spec wiring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replaces placeholder with full stub agent that:

- Parses --mcp-config to find bridge command, spawns it as child process
- Connects as MCP client via rmcp over stdio pipes
- Loops: wait_for_message -> extract token -> send_message
- Emits JSON status lines to stdout for the agent manager
- Drains stdin in background to prevent buffer fill-up

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
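
The stdin-draining step mentioned in the commit message above can be sketched with the standard library alone. The stub agent itself is built on tokio according to the file summary, so this is only an illustration of the idea, not the PR's code. The function name `drain_in_background` is hypothetical.

```rust
use std::io::{self, Read};
use std::thread::{self, JoinHandle};

/// Hypothetical helper: read a stream to EOF on a background thread and
/// discard everything, so a parent process writing to it never blocks on a
/// full pipe buffer. Returns the number of bytes discarded.
fn drain_in_background<R>(mut reader: R) -> JoinHandle<io::Result<u64>>
where
    R: Read + Send + 'static,
{
    thread::spawn(move || io::copy(&mut reader, &mut io::sink()))
}

fn main() {
    // A real agent would pass io::stdin() and never join the handle.
    // An in-memory cursor is used here so the example terminates.
    let input = io::Cursor::new(b"ignored input".to_vec());
    let handle = drain_in_background(input);
    let bytes = handle
        .join()
        .expect("drain thread panicked")
        .expect("read failed");
    println!("discarded {bytes} bytes");
}
```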

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR introduces a deterministic “stub” agent runtime to accelerate QA by running a lightweight local agent process that replies via the MCP bridge without using real LLM backends.

Changes:

Add a new Rust workspace member (chorus-stub-agent) plus a server-side StubDriver and AgentRuntime::Stub.
Hide the stub runtime from runtime-status listing while keeping it available for API-created agents.
Update Playwright QA helpers + MSG-002 to support CHORUS_E2E_LLM=stub, and document a stub-trio preset.
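
A rough sketch of how a runtime can stay registered while being hidden from listings is given below. It is an assumption-laden illustration, not the contents of `src/store/agents.rs` or `src/agent/runtime_status.rs`: the `Llm` variant is a placeholder for the real runtimes, which this page does not list, and `is_listed`, `all_runtimes`, and `listed_runtimes` are hypothetical names.

```rust
/// Placeholder enum. `Llm` stands in for the real runtime variants, which
/// are not shown on this page; only `Stub` is taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentRuntime {
    Llm,
    Stub,
}

impl AgentRuntime {
    /// Parse the stored string form; unknown names are rejected.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "llm" => Some(Self::Llm),
            "stub" => Some(Self::Stub),
            _ => None,
        }
    }

    /// Inverse of `parse`.
    fn as_str(self) -> &'static str {
        match self {
            Self::Llm => "llm",
            Self::Stub => "stub",
        }
    }

    /// Hypothetical predicate: should this runtime appear in UI listings?
    fn is_listed(self) -> bool {
        !matches!(self, Self::Stub)
    }
}

/// Every registered runtime, including the stub.
fn all_runtimes() -> Vec<AgentRuntime> {
    vec![AgentRuntime::Llm, AgentRuntime::Stub]
}

/// The subset a status listing would expose.
fn listed_runtimes() -> Vec<AgentRuntime> {
    all_runtimes().into_iter().filter(|r| r.is_listed()).collect()
}

fn main() {
    for runtime in listed_runtimes() {
        println!("{}", runtime.as_str());
    }
    // The stub still parses, so an API request naming it can succeed.
    println!("{:?}", AgentRuntime::parse("stub"));
}
```

Filtering at the listing step rather than at registration keeps the stub usable through the agent-creation API, which matches the behavior described in the commit notes.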

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `Cargo.toml` | Convert repo to a Cargo workspace to include the stub-agent crate. |
| `Cargo.lock` | Add dependencies required by the new stub-agent crate (and transitive adds). |
| `crates/stub-agent/Cargo.toml` | Define the `chorus-stub-agent` binary crate and its deps (rmcp, tokio, regex, uuid). |
| `crates/stub-agent/src/main.rs` | Implement MCP client loop that waits for messages and sends deterministic echo/fallback replies. |
| `src/store/agents.rs` | Add `AgentRuntime::Stub` and wire `parse()`/`as_str()`. |
| `src/agent/manager.rs` | Register the stub driver and ensure stub has no resumable session behavior. |
| `src/agent/runtime_status.rs` | Filter stub from `list_statuses()` output (keeps it hidden from UI runtime lists). |
| `src/agent/drivers/mod.rs` | Register the new `stub` driver module and include it in `all_runtime_drivers()`. |
| `src/agent/drivers/stub.rs` | Implement `StubDriver` spawn + stdout event parsing + model/status reporting. |
| `qa/cases/playwright/helpers/api.ts` | Add `ensureStubTrio()` and `agentNames()` helpers for stub mode. |
| `qa/cases/playwright/MSG-002.spec.ts` | Wire MSG-002 to use stub trio + mode-aware agent naming in stub mode. |
| `qa/QA_PRESETS.md` | Document the new `stub-trio` preset and `CHORUS_E2E_LLM=stub` usage. |
| `docs/superpowers/specs/2026-04-03-stub-agent-driver-design.md` | Add design spec for the stub agent driver/binary and QA integration. |
| `docs/superpowers/plans/2026-04-03-stub-agent-driver.md` | Add detailed implementation plan for the stub agent driver/binary and QA wiring. |


Copilot · 2026-04-02T17:05:42Z

+            let line = line.trim();
+            if line.is_empty() || line.contains("No new messages.") {
+                continue;
+            }


wait_for_message responses include a trailing "Reply instructions" section (see bridge formatting), and this loop currently processes every line. Lines like "Reply instructions:" won’t match the [target=...] header, so the stub will still attempt to generate a token and send an extra message (often to the fallback target). Filter the loop to only handle actual message header lines (e.g., lines starting with [target=), and ignore the instruction/footer lines entirely.

Suggested change

-            }
+            }
+            // Only handle actual message header lines, e.g. "[target=...]"
+            if !line.starts_with("[target=") {
+                continue;
+            }

Copilot · 2026-04-02T17:05:42Z

+                continue;
+            }
+
+            let target = parse_target(line).unwrap_or_else(|| "#general".to_string());


Defaulting to #general when parse_target() fails is unsafe here because any unexpected/auxiliary line (including "Reply instructions" or future formatting changes) will cause the stub to send messages to #general. Prefer skipping lines where the target cannot be parsed, or fail the turn with an error event so the behavior is deterministic and doesn’t leak messages into unrelated channels.

Suggested change

-            let target = parse_target(line).unwrap_or_else(|| "#general".to_string());
+            let Some(target) = parse_target(line) else {
+                emit_error(&format!("Could not parse target from line: {line}"));
+                continue;
+            };
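
To make the skip-on-failure behavior concrete, here is a self-contained sketch. The body of `parse_target` is not shown on this page, so the version below is a guess: it assumes a header of the form `[target=<value> ...]` or `[target=<value>]`, where the value ends at the first whitespace or closing bracket. `targets_in` is a hypothetical helper added for illustration.

```rust
/// Guessed implementation. Assumption: message headers look like
/// `[target=<value> ...]` and the value contains no whitespace or `]`.
fn parse_target(line: &str) -> Option<String> {
    let rest = line.trim().strip_prefix("[target=")?;
    let end = rest.find(|c: char| c.is_whitespace() || c == ']')?;
    let target = &rest[..end];
    if target.is_empty() {
        None
    } else {
        Some(target.to_string())
    }
}

/// Hypothetical helper: collect targets from a multi-line response,
/// silently skipping footer or instruction lines that have no header.
fn targets_in(response: &str) -> Vec<String> {
    response.lines().filter_map(parse_target).collect()
}

fn main() {
    // The response text is invented for the example.
    let response = "[target=#all] reply with \"hello\"\nReply instructions:\nNo new messages.";
    for target in targets_in(response) {
        println!("would reply to {target}");
    }
}
```

Returning `Option` and skipping on `None` means an unexpected line produces no message at all, rather than one sent to a default channel.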

Copilot · 2026-04-02T17:05:42Z

+        .unwrap_or(200);
+
+    // Process initial prompt
+    emit_text(&format!("Processing prompt: {}", args.prompt));


This logs the entire --prompt content to the manager activity log. The prompt can be very large and may contain sensitive or noisy context (agent descriptions, unread summaries, etc.), which hurts log signal and could leak data. Consider emitting a short fixed message (or a truncated prompt preview) instead.

Suggested change

-    emit_text(&format!("Processing prompt: {}", args.prompt));
+    emit_text("Processing prompt");
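
The review comment also mentions a truncated prompt preview as an alternative to the fixed message. A minimal sketch of that option follows; `prompt_preview` is a hypothetical name and the 40-character limit in the example is arbitrary. Truncation counts `char`s rather than bytes so that a multi-byte character is never split.

```rust
/// Hypothetical helper: return at most `max_chars` characters of the
/// prompt, with "..." appended when something was cut off.
fn prompt_preview(prompt: &str, max_chars: usize) -> String {
    let mut preview: String = prompt.chars().take(max_chars).collect();
    if prompt.chars().count() > max_chars {
        preview.push_str("...");
    }
    preview
}

fn main() {
    let prompt = "You are stub-a. Unread summary: a long block of context that should not reach the log";
    println!("Processing prompt: {}", prompt_preview(prompt, 40));
}
```

A preview still exposes the first few words of the prompt, so the fixed message from the suggested change remains the safer choice when prompts may carry sensitive context.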

Copilot · 2026-04-02T17:05:43Z

+        .into_values()
+        .next()
+        .context("No MCP server entries in config")?;
+    Ok((entry.command, entry.args))


load_mcp_config() picks the first value from a HashMap (into_values().next()), which is non-deterministic if the config ever contains multiple MCP servers. Since the driver writes a chat entry, it’s safer to explicitly select mcpServers["chat"] (or otherwise choose deterministically) so the stub always launches the expected bridge command.

Suggested change

-        .into_values()
-        .next()
-        .context("No MCP server entries in config")?;
-    Ok((entry.command, entry.args))
+        .get("chat")
+        .context("No MCP server entry named 'chat' in config")?;
+    Ok((entry.command.clone(), entry.args.clone()))
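
The difference between the two lookups can be shown without the real config types. The sketch below uses only the standard library: `McpServerEntry` and `select_bridge` are hypothetical stand-ins, and a plain `String` error replaces the `.context(...)` call seen in the diff. `HashMap` iteration order is unspecified, so taking the first value is only safe while the map holds exactly one entry; a keyed lookup has no such dependency.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one entry under `mcpServers` in the config.
#[derive(Debug, Clone, PartialEq)]
struct McpServerEntry {
    command: String,
    args: Vec<String>,
}

/// Hypothetical helper: select the bridge by name instead of by position.
fn select_bridge(
    servers: &HashMap<String, McpServerEntry>,
) -> Result<(String, Vec<String>), String> {
    let entry = servers
        .get("chat")
        .ok_or_else(|| "No MCP server entry named 'chat' in config".to_string())?;
    Ok((entry.command.clone(), entry.args.clone()))
}

fn main() {
    let mut servers = HashMap::new();
    // Command names and arguments are invented for the example.
    servers.insert(
        "other".to_string(),
        McpServerEntry { command: "other-tool".to_string(), args: vec![] },
    );
    servers.insert(
        "chat".to_string(),
        McpServerEntry { command: "bridge".to_string(), args: vec!["--stdio".to_string()] },
    );
    // The result does not depend on insertion or iteration order.
    println!("{:?}", select_bridge(&servers));
}
```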

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… #general fallback

- Select mcpServers["chat"] deterministically instead of HashMap::next()
- Ignore non-message lines; require parseable target (no #general default)
- Emit short processing status instead of full --prompt

Made-with: Cursor

- default-members so cargo build produces chorus-stub-agent
- Silence dead_code on stub Args.prompt (CLI still requires --prompt)
- ensureStubTrio: retry POST /api/agents; clickComboboxOption for Radix
- ERR-001: mock POST /api/attachments; assert toast copy
- Stub-aware: CHN-001/003, TMT-001/002/003/005, MSG-001/004/005/011/012, REC-002
- createAgentViaUi: exact model option; createTeamQaEngViaUi uses agentNames()
- ACT-002: disambiguate activity rows; AGT-004: longer senderDeleted poll

Made-with: Cursor

@sender

- stub-agent: parse_content allows spaces in @sender (OS usernames)
- playwright: 600s default timeout when CHORUS_E2E_LLM=stub (fixtures + slow polls)
- openMembersPanel: no-op if panel already open (fixes CHN-003 after #all)
- AGT-004: ensureStubTrio + stub runtime; wait for #all agent reply before delete; skip Reasoning UI edits when stub (form has no combobox)
- MSG-004: longer stub poll window; CHN-002/003, REC-002, TMT-006 timeouts / Team Settings dialog

Made-with: Cursor

…eady wait

- MSG-001: require OK-a/b/c on messages from stub-a/b/c (human lines also contain tokens)
- REC-002: poll #all history using agent rows only; wait for thread anchor in UI
- waitForAppReady: 90s when CHORUS_E2E_LLM=stub

Made-with: Cursor

…fore thread

- MSG-001: wait until ≥3 agent senders and OK-a/b/c appear in agent bodies (not human lines); case-insensitive senderType; longer stub timeout
- gotoApp: retry once on stub when sidebar shell is slow
- REC-002: reload #all after history sees marks so thread anchor is in the DOM

Made-with: Cursor

Made-with: Cursor

- QA_PRESETS: correct TMT stub skip list; note chorus-stub-agent build, timeouts, CHORUS_WORKERS
- README: document CHORUS_E2E_LLM=stub, CHORUS_WORKERS, default per-worker server vs CHORUS_BASE_URL

Made-with: Cursor

Fullstop000 and others added 11 commits April 3, 2026 00:10

build: convert to Cargo workspace, add stub-agent crate scaffold

845f4d4

feat(agent): add AgentRuntime::Stub enum variant

5825e15

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(agent): add StubDriver implementation

41fd953

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(agent): hide stub runtime from /runtimes API

bfdb3ab

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(stub-agent): add STUB_DELAY_MS configurable response delay

3e6b80f

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(qa): add stub-trio preset and ensureStubTrio helper

751c3d5

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(qa): wire MSG-002 to support CHORUS_E2E_LLM=stub mode

ab2c060

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 2, 2026 17:00

Copilot started reviewing on behalf of Fullstop000 April 2, 2026 17:01

Copilot AI reviewed Apr 2, 2026


Fullstop000 and others added 8 commits April 3, 2026 01:13

feat(qa): wire all specs for CHORUS_E2E_LLM=stub mode

55afb16

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style: cargo fmt (stub-agent, StubDriver)

c0143cc

Made-with: Cursor

docs(qa): sync stub E2E notes with Playwright behavior

50a00ba


Fullstop000 wants to merge 19 commits into main from claude/stub-agent-driver

2 participants
