Skip to content

Fix worktree management and copy lifecycle on Windows hosts#641

Open
OdinIversen wants to merge 5 commits into
mattpocock:mainfrom
OdinIversen:fix/windows-worktree-path-normalization
Open

Fix worktree management and copy lifecycle on Windows hosts#641
OdinIversen wants to merge 5 commits into
mattpocock:mainfrom
OdinIversen:fix/windows-worktree-path-normalization

Conversation

@OdinIversen
Copy link
Copy Markdown

Fixes three bugs that surface when running sandcastle on Windows hosts. All three are silent on Linux/macOS because they depend on Windows-specific path-separator semantics or Node-vs-shell cp differences.

Bug 1 — WorktreeManager path comparisons mix forward-slash and backslash on Windows. git worktree list --porcelain emits forward-slash paths on every platform (e.g. C:/dev/repo/.sandcastle/worktrees/foo), but path.join emits backslash paths on Windows (C:\dev\repo\.sandcastle\worktrees\foo). The create and pruneStale paths compare these raw via ===, .startsWith(), and Set.has(). Effects on Windows:

  • pruneStale wipes active worktrees out from under running sandboxes because they look orphaned to the active-set lookup.
  • create cannot reuse a managed worktree on collision (clean, dirty, or mid-rebase). The prefix check collision.path.startsWith(worktreesDir) returns false, so it falls through to the external-collision error and refuses to proceed.
  • create's by-path fallback for mid-rebase detached HEAD never matches.

Fix: normalize both sides to forward slashes via a small exported toPosix helper before comparing. When reusing an existing worktree, the returned path is normalized back to native separators so callers see the same shape as the create-new branch.

Bug 2 — CopyToWorktree fallback nests source inside partial destination. When the first cp attempt fails after partially populating the destination, the fallback cp -R would nest the source INSIDE the partial destination (e.g. node_modules/node_modules/...), turning a recoverable error into a corrupted path tree. Fix: clear the destination before retrying so the fallback always operates on a fresh target. The follow-up commit then replaces shell cp with node:fs cp for cross-platform consistency and proper symlink handling.

Bug 3 — createSandbox / createWorktree leak a worktree on failure. A failure in copyToWorktree or the onWorktreeReady host hook used to leave the just-created git worktree on disk and in git worktree list. The next run would hit "branch already checked out" and fail before reaching the failure-cleanup site. Fix: remove the worktree on failure before propagating the error.

Test coverage: added WorktreeManager.windowsPaths.test.ts with platform-independent unit tests that document the bug condition and verify the fix using literal cross-source path inputs. These run identically on Windows and Linux CI. The existing integration tests in WorktreeManager.test.ts (pruneStale > does not remove active worktrees, create > reuses preserved worktree when branch is mid-rebase, etc.) already exercise the bug call sites; they silently pass on Linux because path.join produces forward slashes there, but fail on Windows without the fix. Verified locally on Windows: with the fix, all 45 WorktreeManager tests pass; with toPosix neutered to identity, 6 of 8 relevant integration tests fail and 5 of 9 new unit tests fail with the exact assertions that document the bug.

Includes a changeset (.changeset/fix-windows-worktree-and-copy.md) marked patch.

Odin Iversen and others added 5 commits May 13, 2026 16:04
- Clear destination before fallback copy in `CopyToWorktree` to prevent nesting of source inside partial destination.
- Remove just-created worktree on failure in `createSandbox` and `createWorktree` to avoid "branch already checked out" errors.
- Normalize path separators for collision detection and worktree reuse checks in `WorktreeManager` to ensure compatibility across platforms.
…Windows compatibility; improve error handling and symlink support
- CopiedFile.mode: string → ResolveMode for type safety across executor boundary
- Remove duplicate drift-guard test (identical to "baseline missing and content differs")
- Fix test names: remove stale "(stub throws)", rename misleading "succeeds when first cp fails"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The toPosix helper was defined twice as a local closure inside `create`
and `pruneStale`. Hoist it to module scope and export it so the
slash-normalization can be unit-tested directly.

Adds WorktreeManager.windowsPaths.test.ts with platform-independent
tests that:

- Verify toPosix's conversion behavior on Windows-, POSIX-, and
  mixed-separator inputs.
- Document the bug condition with literal cross-source inputs
  (forward-slash output from `git worktree list --porcelain` vs.
  backslash output from `path.join` on Windows), and assert that the
  raw startsWith / Set.has comparisons fail.
- Verify that the fix (toPosix on both sides) makes the comparisons
  succeed.

The existing integration tests in WorktreeManager.test.ts already
exercise the call sites; they fail on Windows without toPosix (6 of 8
relevant tests, verified locally by neutering toPosix to identity),
but pass on Linux either way because path.join produces forward
slashes there. The new unit tests give Linux CI sensitivity to the
fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 13, 2026

Someone is attempting to deploy a commit to the Matt Pocock's projects Team on Vercel.

A member of the Team first needs to authorize it.

@LloydVincent
Copy link
Copy Markdown

Getting a CopyToWorktreeError when I try to npm run sandcastle on windows and this issue sounds related. Hope this can be verified and merged soon.

@fs-bmedkouri
Copy link
Copy Markdown

Independent repro on Windows 11 + Node v24.9 + Docker 28.4 + @ai-hero/sandcastle@0.5.10, hitting exactly the symptoms this PR fixes.

Scenario: npx sandcastle init --agent opencode --template parallel-planner-with-review on a small scratch repo, with the default copyToWorktree = ["node_modules"] from the scaffolded main.ts.

What I saw, mapped to the three bugs this PR addresses:

  1. CopyToWorktree spawn cp ENOENT — every implementer sandbox died before the agent ran, with CopyToWorktreeError: Failed to copy node_modules to worktree: spawn cp ENOENT. Confirmed dist/CopyToWorktree.js calls execFile("cp", ...) with no Windows codepath.
  2. Worktree leak on failure — after each cp failure, the git worktree stayed on disk and in git worktree list. The next iteration hit WorktreeError: Branch '...' is already checked out in worktree at .... I had to git worktree remove --force + git branch -D manually between runs.
  3. Iterations alternated between the two failures as the orchestrator looped — exactly the cascade your PR description predicts.

Workaround I am currently running with: copyToWorktree: [] and rely on the onSandboxReady: npm install hook. Works, but slow (full install per sandbox, per iteration).

Happy to test a build of this branch on Windows if useful. The unit-test approach in WorktreeManager.windowsPaths.test.ts (literal cross-source path strings, platform-independent) looks like exactly the right shape for this category of bug — would defuse the "passes on Linux CI, breaks on Windows users" trap that I would guess #410, #486, #560, and #550 all share a root cause with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants