A bunch of streaming tool fixes #171

Draft
winterqt wants to merge 4 commits into boldsoftware:main from winterqt:streaming-tool-fixes

winterqt commented Apr 2, 2026

This is a lot. Sorry in advance.

winterqt and others added 4 commits April 2, 2026 21:36
Co-authored-by: Shelley <shelley@exe.dev>
The bash tool streams a bounded tail snapshot of stdout/stderr while a
command is running. Two line-counting bugs caused incorrect UI state:

1. BashTool split on "\n" and counted the trailing empty element as a
   real line, so the collapsed preview sliced the wrong last-five lines
   and the "Show all N lines" count was off by one.

2. More fundamentally, deriving the line count from the tail snapshot
   made it jitter as the byte window moved — counts could drop, rise,
   or start mid-line depending on the current tail position.

Fix (1) by trimming the terminal empty split element before computing
preview/counts. Fix (2) by tracking a stable total logical line count
in the bash progress writer (new `line_count` field on ToolProgress)
while keeping the tail-bounded output snapshot for display. Thread
the field through llm.ToolProgress → ChatInterface → BashTool.
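
To make the two fixes concrete, here is a minimal TypeScript sketch. It is not the code from this PR: `splitLines`, `TailWriter`, and `maxChars` are hypothetical names, and the sketch counts characters where the real writer bounds bytes. It only illustrates trimming the trailing empty split element and keeping a running total that is independent of the tail window.

```typescript
// Split output into logical lines, dropping the empty element that
// split("\n") produces when the text ends with a newline.
function splitLines(output: string): string[] {
  if (output === "") return [];
  const lines = output.split("\n");
  if (lines[lines.length - 1] === "") lines.pop();
  return lines;
}

// Hypothetical progress writer: keeps only the last `maxChars` characters
// for display, but counts every line ever written, so the reported total
// never shrinks when the tail window moves.
class TailWriter {
  private readonly maxChars: number; // assumed to be a positive number
  private tail = "";
  private completeLines = 0;
  private midLine = false;

  constructor(maxChars: number) {
    this.maxChars = maxChars;
  }

  write(chunk: string): void {
    if (chunk.length === 0) return;
    for (let i = 0; i < chunk.length; i++) {
      if (chunk[i] === "\n") this.completeLines++;
    }
    // An unterminated final line still counts as one logical line.
    this.midLine = chunk[chunk.length - 1] !== "\n";
    this.tail = (this.tail + chunk).slice(-this.maxChars);
  }

  // Total logical lines written so far, independent of the tail window.
  lineCount(): number {
    return this.completeLines + (this.midLine ? 1 : 0);
  }

  // Bounded snapshot used for display only.
  snapshot(): string {
    return this.tail;
  }
}
```

With an 8-character window, writing `"a\nb\nc\n"` followed by ten `d` characters leaves a one-line snapshot, while `lineCount()` still reports 4. Counting lines from the snapshot alone would have reported 1.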

Also harden the streaming bash e2e tests:
- Scope assertions to the specific bash invocation under test using
  unique command snippets instead of broad running/completed selectors
- Make FIFO cleanup idempotent to prevent 30s timeout stalls when the
  reader is already gone

Backend unit test for the new line count, plus e2e tests for streaming
preview, preview count stability, and tool progress streaming.

Co-authored-by: Shelley <shelley@exe.dev>
Co-authored-by: Shelley <shelley@exe.dev>
When a conversation with a multi-tool batch (e.g. two parallel bash
calls) was reloaded mid-execution, the UI lost track of which tools
were still running:

1. The loop executed tool calls sequentially but only recorded the
   final aggregated tool-result message. On reload, all tools in the
   batch appeared completed even if later ones hadn't started yet.

2. After the first tool completed, the partial result message could be
   missed by SSE subscribers due to a race: the message was published
   between the subscriber's DB query and the subscribe call, landing
   in neither.

Fix the backend (loop.go) to record partial tool-result messages after
each individual tool completes when more tools remain in the batch.
Add dedupeMessages() in the server to collapse superseded partial
results so the final conversation state stays clean.
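
One way to picture the collapse step is the TypeScript sketch below. It is an illustration under stated assumptions, not the server's `dedupeMessages()`: the `Message` shape, the `batchId` field, and the rule "keep only the latest tool-result message per batch" are all hypothetical.

```typescript
// Hypothetical message shape; the real schema is not shown in this PR text.
interface Message {
  seq: number;              // assumed monotonically increasing sequence id
  batchId: string | null;   // assumed: non-null only on tool-result messages
  completedTools: string[]; // tools whose results this message carries
}

// Collapse superseded partial results: for each batch, keep only the last
// tool-result message, which is assumed to contain all earlier results.
function collapsePartialResults(messages: Message[]): Message[] {
  // Remember the index of the last tool-result message seen for each batch.
  const lastIndex = new Map<string, number>();
  messages.forEach((m, i) => {
    if (m.batchId !== null) lastIndex.set(m.batchId, i);
  });
  // Non-tool messages pass through; tool results survive only if latest.
  return messages.filter(
    (m, i) => m.batchId === null || lastIndex.get(m.batchId) === i,
  );
}
```

Under these assumptions, a partial result recorded after the first tool finishes is dropped once the aggregated result for the same batch arrives, and message order is otherwise preserved.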

Fix the SSE catch-up/subscribe race by adding a catch-up query after
subscribing. To prevent duplicate delivery of messages found by both
the catch-up and the subscriber, refactor subpub.Subscribe to return a
*Subscription handle with an AdvanceIndex method that bumps the
subscriber's internal sequence index past the catch-up high-water mark.
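
The subscribe-then-catch-up ordering can be sketched as follows. This is a simplified TypeScript model, not the Go `subpub` package: `PubSub`, `Subscription`, `poll`, and the in-memory `store` standing in for the database are hypothetical, and delivery is modelled as a pull from a queue.

```typescript
interface Item {
  seq: number;
  text: string;
}

class Subscription {
  private index: number;
  private queue: Item[] = [];

  constructor(startIndex: number) {
    this.index = startIndex;
  }

  enqueue(item: Item): void {
    this.queue.push(item);
  }

  // Bump the index past a high-water mark; never moves backwards.
  advanceIndex(seq: number): void {
    if (seq > this.index) this.index = seq;
  }

  // Return queued items newer than the index and advance past them.
  poll(): Item[] {
    const fresh = this.queue.filter((it) => it.seq > this.index);
    this.queue = [];
    for (const it of fresh) this.advanceIndex(it.seq);
    return fresh;
  }
}

class PubSub {
  private subs: Subscription[] = [];

  subscribe(startIndex: number): Subscription {
    const sub = new Subscription(startIndex);
    this.subs.push(sub);
    return sub;
  }

  publish(item: Item): void {
    this.subs.forEach((s) => s.enqueue(item));
  }
}

function replayWithCatchUp(): number[] {
  const store: Item[] = []; // stands in for the database
  const bus = new PubSub();
  const delivered: number[] = [];
  const record = (item: Item): void => {
    store.push(item);
    bus.publish(item);
  };

  record({ seq: 1, text: "user" });
  record({ seq: 2, text: "tool call" });
  const lastSeen = 2; // the initial query saw messages 1 and 2

  record({ seq: 3, text: "partial result" }); // published before subscribe
  const sub = bus.subscribe(lastSeen);
  record({ seq: 4, text: "progress" }); // published after subscribe

  // Catch-up query after subscribing: finds both 3 and 4.
  let highWater = lastSeen;
  for (const it of store) {
    if (it.seq > lastSeen) {
      delivered.push(it.seq);
      if (it.seq > highWater) highWater = it.seq;
    }
  }
  sub.advanceIndex(highWater); // skip what the catch-up already sent

  record({ seq: 5, text: "final result" });
  sub.poll().forEach((it) => delivered.push(it.seq));
  return delivered;
}
```

In this model message 3 is recovered by the catch-up query, and message 4, which both the catch-up and the subscription saw, is delivered once because `advanceIndex` moves the subscription past it before polling.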

Add a "running..." indicator to BashTool headers. Remove the now-
unnecessary keepRunning/partialOutput workaround from the frontend
since partial results arrive reliably via SSE.

New tests:
- subpub_test.go: verify AdvanceIndex skips already-seen publishes
- loop_test.go: verify partial tool result recording
- tool_progress_stream_test.go: server-level progress, dedup, and
  SSE catch-up deduplication tests
- tool-reload-running.spec.ts: e2e tests for single-tool reload state,
  multi-tool batch reload state, and live multi-tool progress

Co-authored-by: Shelley <shelley@exe.dev>
cla-bot added the cla-signed label Apr 2, 2026

winterqt commented Apr 3, 2026

Looking at this a bit closer:

> The loop executed tool calls sequentially but only recorded the final aggregated tool-result message. On reload, all tools in the batch appeared completed even if later ones hadn't started yet.

The first sentence is right, but the second definitely isn't: it was the opposite, with all tools in the batch shown as not completed. Reading it back, I'm convinced that last commit may just be wrong. Going to take a closer look and do some more tests.

winterqt marked this pull request as draft April 3, 2026 05:57