fix(createSandbox): report token usage on reused sandboxes #683
Open
masone wants to merge 2 commits into
The reuse factory built by createSandbox passed only hostWorktreePath, sandboxRepoPath and applyToHost into the orchestrator's withSandbox context, omitting bindMountHandle. The orchestrator gates session capture on bindMountHandle, so capture (and the token-usage parsing that depends on the captured JSONL) was silently skipped on every sandbox.run() — reused sandboxes reported zero token usage while one-shot run(), whose primary factory forwards the handle, reported it correctly. Forward the handle (narrowed to the bind-mount provider case) so capture runs on reused sandboxes too, matching run().
Add two tests for the bindMountHandle forwarding fix:
- a reused bind-mount sandbox captures the session JSONL and reports token usage (sessionFilePath + usage populated). This fails on the pre-fix factory (sessionFilePath undefined).
- a non-bind-mount (isolated) provider does NOT capture: a sessionId is still extracted, but the handle is not forwarded, so capture is skipped.
The mock bind-mount handle's copyFileOut synthesizes the session JSONL, avoiding the need for a writable in-sandbox projects dir; HOME is redirected to a temp dir so capture never touches the real ~/.claude.
The bug
When you reuse a sandbox across multiple runs, token usage is missing from
every result. The one-shot `run()` reports token usage fine. Only the reused
`createSandbox` + `sandbox.run()` path comes back empty.
Why
After each run, the orchestrator copies the agent's session file out of the
sandbox to the host and reads token usage from it. That step only happens when
it has a handle to the sandbox's filesystem (`bindMountHandle`). `run()` passes
that handle through. `createSandbox` builds its own internal factory to reuse
the container across runs, and that factory was passing everything except the
handle. So the copy-out step was silently skipped every time, and with no
session file there was nothing to read usage from.
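
The gating described above can be sketched roughly as follows. This is a simplified, synchronous model and not the project's actual orchestrator code: `captureSession`, `sumUsage`, the `RunContext` shape, and the one-object-per-line JSONL format with `input_tokens` / `output_tokens` fields are all assumptions made for illustration.

```typescript
// Hypothetical stand-ins for the real types; names and shapes are assumed.
interface BindMountHandle {
  // Copies a file out of the sandbox and returns its contents.
  copyFileOut(sandboxPath: string): string;
}

interface RunContext {
  hostWorktreePath: string;
  sandboxRepoPath: string;
  bindMountHandle?: BindMountHandle;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

interface CaptureResult {
  sessionFilePath?: string;
  usage?: Usage;
}

// Sums token counts over an assumed JSONL format: one JSON object per line,
// each optionally carrying a `usage` record.
function sumUsage(jsonl: string): Usage {
  const total: Usage = { inputTokens: 0, outputTokens: 0 };
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const entry = JSON.parse(line) as {
      usage?: { input_tokens?: number; output_tokens?: number };
    };
    total.inputTokens += entry.usage?.input_tokens ?? 0;
    total.outputTokens += entry.usage?.output_tokens ?? 0;
  }
  return total;
}

function captureSession(ctx: RunContext, sessionPath: string): CaptureResult {
  // The gate: without a handle there is no way to copy the file out,
  // so capture is skipped and nothing is reported.
  if (!ctx.bindMountHandle) return {};
  const jsonl = ctx.bindMountHandle.copyFileOut(sessionPath);
  return { sessionFilePath: sessionPath, usage: sumUsage(jsonl) };
}

// A fake handle that returns two canned session entries.
const fakeHandle: BindMountHandle = {
  copyFileOut: () =>
    '{"usage":{"input_tokens":10,"output_tokens":5}}\n' +
    '{"usage":{"input_tokens":3,"output_tokens":2}}',
};

const paths = { hostWorktreePath: "/host/worktree", sandboxRepoPath: "/sandbox/repo" };

// One-shot path: the handle is forwarded, so usage is reported.
const oneShot = captureSession({ ...paths, bindMountHandle: fakeHandle }, "session.jsonl");

// Pre-fix reuse path: everything except the handle is passed, so capture is skipped.
const reusedPreFix = captureSession(paths, "session.jsonl");
```

In this model the pre-fix reuse path fails quietly: no error is raised, the result simply has no `sessionFilePath` or `usage`.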
The fix
Forward the bind-mount handle through the reuse factory, the same way `run()`
already does. The handle is narrowed to the bind-mount provider case (so an
isolated/no-sandbox handle is never mistaken for a bind-mount one). With the
handle present, the existing capture code runs and `sessionFilePath` / token
usage are populated on reused sandboxes too.
Changes
- `src/createSandbox.ts`: carry the bind-mount handle into the handle context and forward it into the orchestrator's per-run context, narrowed to bind-mount providers.
- `src/createSandbox.test.ts`: two regression tests (see below).
- `.changeset/`: patch bump.
Tests
Added two tests at the `createSandbox` level:
- A reused bind-mount sandbox captures the session: `sessionFilePath` and `usage` are populated. Verified to fail on the pre-fix factory (`expected undefined to be defined`).
- A non-bind-mount (isolated) provider does not capture: a `sessionId` is still extracted, but the handle isn't forwarded, so capture is skipped. This pins the narrowing.
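
The shape of these two checks can be approximated with a small stand-alone harness. It is not the project's test code: `runOnReusedSandbox` is a hypothetical stand-in for a post-fix `sandbox.run()`, the session file naming is invented, and the mock's single-object payload is an assumed format. The point it illustrates is the mocking strategy, where `copyFileOut` synthesizes the session content instead of reading a real file.

```typescript
// Hypothetical mini-harness; not the project's real test code or API.
type Handle =
  | { kind: "bind-mount"; copyFileOut(sandboxPath: string): string }
  | { kind: "isolated" };

interface RunResult {
  sessionId: string;
  sessionFilePath?: string;
  usage?: { outputTokens: number };
}

// Stand-in for a post-fix run on a reused sandbox: the session id always
// comes from the agent output, but capture only happens for bind-mount handles.
function runOnReusedSandbox(handle: Handle, agentOutput: string): RunResult {
  const sessionId = agentOutput.trim();
  if (handle.kind !== "bind-mount") return { sessionId };
  const sessionFilePath = `${sessionId}.jsonl`; // invented naming scheme
  const entry = JSON.parse(handle.copyFileOut(sessionFilePath)) as {
    usage: { output_tokens: number };
  };
  return { sessionId, sessionFilePath, usage: { outputTokens: entry.usage.output_tokens } };
}

// Mock handle: synthesizes the session content and records requested paths,
// so no writable in-sandbox directory is needed.
const copiedPaths: string[] = [];
const mockBindMount: Handle = {
  kind: "bind-mount",
  copyFileOut(sandboxPath: string): string {
    copiedPaths.push(sandboxPath);
    return JSON.stringify({ usage: { output_tokens: 42 } });
  },
};

// Case 1: reused bind-mount sandbox captures the session and reports usage.
const captured = runOnReusedSandbox(mockBindMount, "abc123");

// Case 2: isolated provider still yields a session id but skips capture.
const skipped = runOnReusedSandbox({ kind: "isolated" }, "def456");
```

Recording the requested paths makes the negative case checkable: the isolated run should leave `copiedPaths` untouched.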