Skip to content

Harness implementation research: link daemon, agent pill, VM-as-runtime#3417

Merged
tlgimenes merged 21 commits into
mainfrom
tlgimenes/harness-impl-research
May 21, 2026
Merged

Harness implementation research: link daemon, agent pill, VM-as-runtime#3417
tlgimenes merged 21 commits into
mainfrom
tlgimenes/harness-impl-research

Conversation

@tlgimenes
Copy link
Copy Markdown
Contributor

@tlgimenes tlgimenes commented May 21, 2026

Summary

End-to-end implementation of harness execution via a local link daemon, with the UI shifted from Runner+Harness pills to a single Agent pill driven by the link's real capabilities.

  • Link daemon — registration / heartbeat / shutdown over X-Link-Secret, capability probe for claude-code / codex / sandbox, current-link surfaced to the UI via LINK_CURRENT_GET and useCurrentLink.
  • Agent pill — replaces Runner+Harness with a single eligibility-aware pill, per-agent model selector, laptop-CLI variant (Claude Code / Codex), connect-your-laptop empty-state tile.
  • VM as runtime identityvm-proxy now resolves the sandbox runner per-user via ctx.linkRegistry (with env-singleton fallback for docker / agent-sandbox).
  • vmMap data fix — migrations 080/081 silently no-op'd on a dropped table; migration 082 performs the intended v1→v2 rekey and runnerKindsandboxProviderKind rename on connections.metadata, with a tolerant normalizeVmMap on the read path.
  • Auth hygiene — link daemon uses X-Link-Secret instead of Authorization: Bearer … so heartbeats don't trip Better Auth's INVALID_API_KEY logger; legacy well-known deprecation middleware pinned to its two specific paths.

Test plan

  • bun test apps/mesh/src/links (heartbeat / shutdown via X-Link-Secret)
  • bun test apps/mesh/migrations/082-fix-vm-map-rekey.test.ts
  • bun test apps/mesh/src/storage/virtual (normalized vmMap on read)
  • Manual: register a link from the CLI, confirm no INVALID_API_KEY log per heartbeat
  • Manual: open chat with no GitHub repo, verify Agent pill falls back to Decopilot
  • Manual: trigger /api/vm-proxy/... as a remote-user deployment, confirm 503 surfaces when no link daemon

🤖 Generated with Claude Code


Summary by cubic

Runs harnesses on the user’s laptop via a new deco link daemon with HMAC‑secured remote dispatch, replaces Runner+Harness pills with a capability‑aware Agent pill, and switches VM routing to a SandboxProviderKind with per‑thread pins and a 3‑level vmMap. Migrations are renumbered to 083–088 to avoid collisions and ensure the vmMap data fix lands.

  • New Features

    • deco link daemon (apps/mesh/src/link-daemon): registers at /api/links, heartbeats via X-Link-Secret, serves an HMAC control plane with nonce replay protection, opens a Warp tunnel (or --no-tunnel localhost), probes capabilities (claude-code, codex, decopilot-sandbox), and exposes status via GET /api/links/me; bun run smoke:link checks link health.
    • Remote dispatch: resolves a target from the 3‑level vmMap plus per‑thread pins (threads.sandbox_provider_kind/threads.harness_id), streams to the desktop daemon over its tunnel with HMAC signing, supports cancellation, and bridges SSE back to chat; returns 409 with clear reasons when the link is offline or missing a capability.
    • Links infra: TTL’d LinkRegistry (NATS JetStream / In‑mem) backs /api/links (register/heartbeat/shutdown/status); wire protocol, HMAC, and fixtures live under apps/mesh/src/links/protocol; dev can auto‑spawn the daemon and mint a local session when MESH_ALLOW_LOCALHOST_LINKS=1; CLI adds --local-sandbox-provider and --no-tunnel.
    • Agent pill: unified, capability‑aware pill with per‑agent tier mapping and a “connect your desktop” empty state; browser‑safe model lists split into claude-code-models.ts and codex-models.ts; local‑CLI provider adapters and the cli-activate path removed; @anthropic-ai/claude-agent-sdk moved to optionalDependencies.
  • Migration

    • Apply 083–088: add then drop threads.run_locally; drop host sandbox rows; rename sandbox_runner_state.runner_kindsandbox_provider_kind; add per‑thread pins (threads.sandbox_provider_kind, threads.harness_id); re‑key connections.metadata.vmMap from 2‑level to 3‑level and finish the missed runnerKindsandboxProviderKind rewrite with explicit casts and COALESCE; purge ai_provider_keys sentinels for claude-code/codex.

Written for commit 3ab12d0. Summary will update on new commits. Review in cubic

@github-actions
Copy link
Copy Markdown
Contributor

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

Reaction Action
👍 Run quick benchmark (10 & 128 tools)

Benchmark will run on the next push after you react.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

Release Options

Suggested: Patch (2.339.3) — default (no conventional commit prefix detected)

React with an emoji to override the release type:

Reaction Type Next Version
👍 Prerelease 2.339.3-alpha.1
🎉 Patch 2.339.3
❤️ Minor 2.340.0
🚀 Major 3.0.0

Current version: 2.339.2

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 238 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/migrations/080-rename-runner-kind.ts">

<violation number="1" location="apps/mesh/migrations/080-rename-runner-kind.ts:37">
P1: Guard the vmMap aggregate with `COALESCE(..., '{}'::jsonb)`; otherwise rows with an empty `vmMap` can end up with `metadata = NULL` during migration.</violation>
</file>

<file name="apps/link/src/tunnel.ts">

<violation number="1" location="apps/link/src/tunnel.ts:51">
P1: Do not fall back to a hardcoded tunnel API key; require explicit configuration (input or env) instead.</violation>
</file>

<file name="apps/link/src/control-plane.ts">

<violation number="1" location="apps/link/src/control-plane.ts:77">
P2: Validate `handle` as a non-empty string (not just truthy) before calling `ensureSandbox`.</violation>
</file>

Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.

Re-trigger cubic

Comment thread apps/mesh/src/harnesses/decopilot/built-in-tools/index.ts
apiKey:
input.apiKey ??
process.env.DECO_TUNNEL_SERVER_TOKEN ??
LEGACY_TUNNEL_TOKEN,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Do not fall back to a hardcoded tunnel API key; require explicit configuration (input or env) instead.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/link/src/tunnel.ts, line 51:

<comment>Do not fall back to a hardcoded tunnel API key; require explicit configuration (input or env) instead.</comment>

<file context>
@@ -0,0 +1,85 @@
+    apiKey:
+      input.apiKey ??
+      process.env.DECO_TUNNEL_SERVER_TOKEN ??
+      LEGACY_TUNNEL_TOKEN,
+  });
+  await Promise.race([
</file context>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre-existing shared key, not per-user. The file's header documents this explicitly ("The Warp server still uses a legacy shared key... Once Warp accepts OAuth tokens we can swap the source in one place"). The fallback was introduced in #3283 to fix a silent registration hang — this PR just extracts it into apps/mesh/src/link-daemon/tunnel.ts without changing the auth model. Removing it would break every laptop CLI client that hasn't manually set DECO_TUNNEL_SERVER_TOKEN. Tracked for replacement when Warp accepts OAuth bearers.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I've saved this as a new learning to improve future reviews.

v.metadata,
'{vmMap}',
(
SELECT jsonb_object_agg(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Guard the vmMap aggregate with COALESCE(..., '{}'::jsonb); otherwise rows with an empty vmMap can end up with metadata = NULL during migration.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/mesh/migrations/080-rename-runner-kind.ts, line 37:

<comment>Guard the vmMap aggregate with `COALESCE(..., '{}'::jsonb)`; otherwise rows with an empty `vmMap` can end up with `metadata = NULL` during migration.</comment>

<file context>
@@ -0,0 +1,107 @@
+        v.metadata,
+        '{vmMap}',
+        (
+          SELECT jsonb_object_agg(
+            user_key,
+            (
</file context>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c8a0e5435. Wrapped both jsonb_object_agg calls (outer over vmMap and inner over user_map) in COALESCE(..., '{}'::jsonb) so rows whose vmMap is {} — or whose nested user_map is {} — keep an empty object instead of getting metadata = NULL via jsonb_set. Also tightened the WHERE to require jsonb_typeof(metadata->'vmMap') = 'object' so non-object vmMap values are skipped. Same guard applied to the down migration.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I've saved this as a new learning to improve future reviews.

headers: { "content-type": "application/json" },
});
}
if (!parsed.handle) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Validate handle as a non-empty string (not just truthy) before calling ensureSandbox.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/link/src/control-plane.ts, line 77:

<comment>Validate `handle` as a non-empty string (not just truthy) before calling `ensureSandbox`.</comment>

<file context>
@@ -0,0 +1,104 @@
+          headers: { "content-type": "application/json" },
+        });
+      }
+      if (!parsed.handle) {
+        return new Response(JSON.stringify({ error: "missing_handle" }), {
+          status: 400,
</file context>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Changed the check to typeof handle === "string" && handle.length > 0 so non-string truthy bodies (e.g. arrays, objects) are also rejected with 400.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I've saved this as a new learning to improve future reviews.

@tlgimenes tlgimenes force-pushed the tlgimenes/harness-impl-research branch from 5633d48 to d252d00 Compare May 21, 2026 16:53
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 111 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/migrations/080-rename-runner-kind.ts">

<violation number="1" location="apps/mesh/migrations/080-rename-runner-kind.ts:37">
P1: Guard the vmMap aggregate with `COALESCE(..., '{}'::jsonb)`; otherwise rows with an empty `vmMap` can end up with `metadata = NULL` during migration.</violation>
</file>

<file name="apps/link/src/tunnel.ts">

<violation number="1" location="apps/link/src/tunnel.ts:51">
P1: Do not fall back to a hardcoded tunnel API key; require explicit configuration (input or env) instead.</violation>
</file>

<file name="apps/link/src/control-plane.ts">

<violation number="1" location="apps/link/src/control-plane.ts:77">
P2: Validate `handle` as a non-empty string (not just truthy) before calling `ensureSandbox`.</violation>
</file>

Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic

Comment thread apps/mesh/src/cli.ts Outdated
@tlgimenes tlgimenes force-pushed the tlgimenes/harness-impl-research branch 2 times, most recently from eb22f9e to 71ac220 Compare May 21, 2026 20:14
tlgimenes and others added 21 commits May 21, 2026 18:53
…d runtime

Squashes 19 commits from tlgimenes/harness-impl-research.

Laptop link daemon: `deco link` boots a local Bun.serve, opens a Warp
tunnel to deco.host, registers with the cluster's /api/links to receive
a linkSecret, then exposes an HMAC-signed control plane for sandbox
lifecycle + reverse proxy. Capability probe surfaces claude-code, codex,
and decopilot-sandbox to the cluster. SIGINT/SIGTERM deregister cleanly.

Sandbox provider abstraction: RunnerKind → SandboxProviderKind across
type/storage column/helpers. New remote-user provider plugs the user's
link daemon in via ctx.linkRegistry; docker + agent-sandbox keep an
env-singleton fallback. vmMap re-keyed to (user, branch, providerKind)
with tolerant readers for legacy 2-level + runnerKind shapes; migration
086-fix-vm-map-rekey performs the rename that earlier migrations no-op'd
on a dropped table.

Chat UI: single Agent pill replaces Runner+Harness, eligibility-aware.
Clonable agents (Start Website + GitHub-imported) skip the no-AI-provider
empty state when the laptop link is online with a CLI harness, and the
two CLI options (Claude Code / Codex desktop) surface without a cloud
key — they run on laptop-stored credentials. Decopilot fallback for
agents with no GitHub repo. Connect-laptop empty-state tile + dialog.
cli-activate provider path removed (now on the laptop link); migration
087 purges sentinel keys.

Inlining refactor (vs prior research branch): apps/link → apps/mesh/src/
{cli/commands/link.ts, link-daemon/}, @decocms/harnesses → apps/mesh/src/
harnesses/, @decocms/link-protocol → apps/mesh/src/links/protocol/.
Drops three single-consumer packages, keeps one binary (deco), folds
session/tunnel/login plumbing into the existing CLI. packages/sandbox
reaches into apps/mesh via relative imports — documented inline as a
trade-off accepted per "abstractions to a minimum".

Migrations 082-087 are renumbered to start strictly after main's latest
applied (081-async-research-jobs-result-content) so prod's executed list
stays a clean prefix and the new migrations apply fresh on deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the design decisions from brainstorming: collapse the
AgentPill (Decopilot/Claude Code/Codex) into the chat-input model
trigger as a sectioned popover with green styling on local-CLI
sections, mirror existing lock semantics, and fix the gap-collapse
bug on the trigger button.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Q6 resolved: remove the cloud-Decopilot-on-local-sandbox option
entirely (existing localStorage validation already migrates stale
values to null). Adds the verified "what stays vs. what goes"
inventory after exploring real call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nine bite-sized tasks: trim agent-options.ts, build getAgentSections
(+ tests), AgentSection component, AgentModelPopover shell, rewrite
AgentModelTrigger with green styling and gap fix, plumb new props
through input.tsx, slim ThreadPills, delete AgentPill, and final
type/lint/fmt/test verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ions

Removes the cloud-Decopilot-on-local-sandbox option (decopilot-laptop)
and computeAgentOptions, which gets replaced by getAgentSections in a
following commit. Keeps AGENT_OPTION_PINS + pinsForOption / pinsToOption
since chat-context.tsx still needs them. The localStorage load in
chat-context already validates against AGENT_OPTION_PINS, so stale
'decopilot-laptop' values silently migrate to null.
Introduces AgentKind/AgentSection types and a pure getAgentSections
function that returns the three sections (Decopilot, Claude Code,
Codex) shown in the new merged chat-input popover. Mirrors the gates
computeAgentOptions used (sans decopilot-laptop) and flags CLI
sections with isLocal: true to drive the green styling. Preserves
the existing getAgentModelSet helper used by the settings page.

Renamed agent-models.ts to agent-models.tsx because the Decopilot
tier entries now hold JSX icon nodes (Lightning01/Stars01/Atom01).
Renders one section in the new sectioned popover — header with the
agent title (plus " · on this laptop" suffix for CLI agents), three
tier rows with descriptions, and the existing "On" indicator. Local
sections sit on a faint bg-success/5 band; disabled sections render
opacity-40 + pointer-events-none + a small lock icon.

Also adds Bun test infrastructure (bunfig preload + happy-dom +
@testing-library/react + jest-dom matchers) so React component tests
can render in a DOM and use toBeInTheDocument(). No prior precedent
existed in the repo, so the setup file is new.
…age.json sort

Code review flagged that the root bunfig.toml preloaded happy-dom for all
workspace tests, replacing Node natives like fetch/Request/Response/URL in
packages/* tests that depend on them. Switched to Option B: removed both
bunfig.toml files (root and apps/mesh) and made apps/mesh/src/test/setup.ts
an importable module with side effects. React component tests now `import
"../../../../test/setup"` (or equivalent) at the top, guarded by
GlobalRegistrator.isRegistered to avoid double-registration. This keeps
happy-dom contained to component tests that opt in, while packages/* tests
keep running against Node natives (804 pass / 0 fail vs the prior
470/31/15 baseline).

Also reverts an unrelated @decocms/sandbox alphabetical re-sort that crept
into the Task 3 commit -- restores its prior position between marked and
mesh-plugin-workflows.
Composes AgentSection rows from a getAgentSections result. Handles
lock semantics — when lockedAgent is set, only the matching section
is interactive; the others render disabled. Row click fires onSelect
with (kind, tier) and the locking is verified by tests.
The trigger now opens the new sectioned AgentModelPopover (Decopilot,
Claude Code, Codex) instead of the old SimpleModeTierDropdown / CLI
tier list split. Closed pill goes text-success + bg-success/10 when
the active agent is a CLI variant, matching the 'Desktop connected'
green elsewhere. Picks both agent (via setPendingAgentOption) and
tier in a single click and fires the eager VM start for CLI agents
when a branch is set. Also fixes the phantom 6px gap that appeared
when the label collapses at narrow container widths.

Also passes a concrete URL to the happy-dom GlobalRegistrator in the
shared test setup so component tests whose transitive imports reach
better-auth/react createAuthClient (which reads window.location at
module init) no longer blow up on the default about:blank URL.
bun:test doesn't auto-call RTL's cleanup() the way Jest does, so DOM
from one component test was leaking into the next file's render, e.g.
a popover trigger from agent-model-trigger.test.tsx persisting into
agent-section.test.tsx and breaking getByText queries. Add an
afterEach in setup.ts that calls cleanup() to mirror the implicit
Jest behavior.

Because bun:test caches this setup module across every test file in
a run, the top-level afterEach only registers in whichever test file
imports setup first. To cover the remaining files we also install a
Bun.plugin onLoad hook that prepends an afterEach -> cleanup()
registration to any test source that already imports test/setup, so
each file's own scope gets the hook regardless of load order.
The plugin onLoad transform that injected afterEach() into every test
file was clever but surprising — future contributors would have no idea
why test files appear to have shifted line numbers in stack traces.
Replace it with an exported setupComponentTest() function that each
component test calls at module top-level. Registration lands in the
right scope without source rewrites or runtime magic.
The merged AgentModelTrigger needs both props so it can fire the eager
VM start that ThreadPills used to own when the user picks a CLI agent.
Plumbs them through the input mount without otherwise touching the
composer layout.
The agent picker now lives inside the chat-input's AgentModelTrigger.
ThreadPills shrinks back to a single BranchPill (still locked when the
thread has messages). The eager VM-start logic moves into the merged
AgentModelTrigger row click.
Replaced by the sectioned popover inside AgentModelTrigger. No
remaining importers.
Two regressions surfaced during T9 verification when running the chat
test directory as a whole:

1. happy-dom's GlobalRegistrator overwrites globalThis.TransformStream
   and WritableStream but leaves ReadableStream alone. Once a component
   test triggered registration, store-level tests that exercise ai-sdk's
   readUIMessageStream blew up with "readable should be ReadableStream"
   because stream.pipeThrough received a happy-dom TransformStream while
   the stream itself was the native ReadableStream. Snapshot the native
   classes before register() and restore them after.

2. The published @testing-library/jest-dom type augmentation targets the
   jest/vitest globals, so .toBeInTheDocument() flagged TS errors on
   bun:test's Matchers<T>. Declare the augmentation against "bun:test"
   so the matchers extended at runtime are visible to the type checker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AgentTierEntry.modelId is `string | null` (Decopilot tiers leave it
null). The existing guard checked `!model` after `lookup[entry.modelId]`,
which TypeScript correctly flagged because the index access itself
rejects a null key. Add the explicit pre-guard so the type narrows
cleanly and the runtime semantics stay identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n with prod 082-secrets

origin/main now has 082-secrets (PR #3425) at the position the earlier
renumber commit (edb42f3) parked 082-thread-run-locally. Two
migrations sharing the same 082- numeric prefix breaks Kysely's
ordering guarantee (and the locally-run 082-thread-run-locally errors
out with corrupted-migrations when the directory now sorts
082-secrets first).

Shift all six branch migrations up by one slot:

  082-thread-run-locally           -> 083-thread-run-locally
  083-drop-host-sandbox-rows       -> 084-drop-host-sandbox-rows
  084-rename-runner-kind           -> 085-rename-runner-kind
  085-thread-pins-and-vm-map-rekey -> 086-thread-pins-and-vm-map-rekey
  086-fix-vm-map-rekey             -> 087-fix-vm-map-rekey
  087-purge-cli-activate-keys      -> 088-purge-cli-activate-keys

Updates the index.ts imports / migrations map and the two test files
that self-imported their own migration module (086/087 test files).
All 88 migrations apply cleanly in the test runner; 8/8 renamed
tests pass.

Local devs who already ran any of the old-numbered migrations on this
branch must either reset their postgres data dir or rename the rows
in kysely_migration to the new numbers before pulling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nner-kind

Wrap inner and outer jsonb_object_agg calls in COALESCE(..., '{}'::jsonb) so
rows whose vmMap is an empty object — or whose nested user_map is empty —
don't end up with metadata = NULL after the rewrite (jsonb_set returns NULL
when any argument is NULL). Also restrict the UPDATE to rows where
metadata->'vmMap' is an object so non-object vmMap values are skipped.

Addresses PR review on #3417.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject non-string truthy bodies (arrays, objects, numbers) with 400 instead
of letting them flow through to ensureSandbox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tlgimenes tlgimenes force-pushed the tlgimenes/harness-impl-research branch from 71ac220 to 3ab12d0 Compare May 21, 2026 21:54
@tlgimenes tlgimenes merged commit f8a6651 into main May 21, 2026
17 of 19 checks passed
@tlgimenes tlgimenes deleted the tlgimenes/harness-impl-research branch May 21, 2026 21:55
tlgimenes added a commit that referenced this pull request May 22, 2026
…sandbox (#3434)

The eager `ensureVm()` added to `POST /api/:org/decopilot/threads/:id/messages`
in #3417 made the route fail when no sandbox provider could be provisioned —
e.g. CI environments that don't run a link daemon and don't set
STUDIO_SANDBOX_RUNNER (default `"remote-user"` ⇒ `provisionSandbox` throws
"No link daemon registered for user ..."). The cross-pod attach scenario in
`tests/multi-pod/scenarios/attach-cross-pod.test.ts` only drives the mock
AI provider and never needs a real VM, yet POST `/messages` was 500-ing
before it ever enqueued the run.

`resolveDispatchTarget` only consumed `vm.sandboxProviderKind` from the
returned entry, so refactor it to take the kind directly and drop the
`ensureVm` call from the POST handler. The built-in tools layer already
provisions the sandbox lazily on the first VM-tool invocation via
`ensureHandle` in `apps/mesh/src/harnesses/decopilot/built-in-tools/index.ts`,
so runs that never touch a VM-backed tool no longer pay the provisioning
cost (or its hard prerequisites).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant