Skip to content

Route Hub agent exec through the runtime dispatcher#120

Closed
mfreeman451 wants to merge 6 commits intoGoogleCloudPlatform:mainfrom
mfreeman451:fix/hub-agent-exec
Closed

Route Hub agent exec through the runtime dispatcher#120
mfreeman451 wants to merge 6 commits intoGoogleCloudPlatform:mainfrom
mfreeman451:fix/hub-agent-exec

Conversation

@mfreeman451
Copy link
Copy Markdown
Contributor

Summary

  • add Hub-side agent exec dispatch support so /api/v1/agents/{id}/exec works
  • wire exec through the existing HTTP/control-channel runtime broker clients
  • add focused handler and dispatcher coverage for the new exec path

Validation

  • go test ./pkg/hub -run '^TestHandleAgentExec_DispatchesToRuntimeBroker$|^TestHTTPAgentDispatcher_DispatchAgentMessage$|^TestHTTPAgentDispatcher_DispatchAgentStart_WithGroveProviderPath$|^TestHTTPAgentDispatcher_DispatchAgentMessage_UsesSlugNotName$'

Context

This fixes the Hub-side 404 seen from scion look against hosted agents: the client already called the Hub exec endpoint, but the server never handled the exec action.

@mfreeman451 mfreeman451 marked this pull request as draft April 10, 2026 03:29
Copy link
Copy Markdown
Contributor Author

@mfreeman451 mfreeman451 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mfreeman451 mfreeman451 marked this pull request as ready for review April 10, 2026 04:17
Copy link
Copy Markdown
Contributor Author

@mfreeman451 mfreeman451 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ptone
Copy link
Copy Markdown
Member

ptone commented Apr 11, 2026

PR #120 Review: Route Hub agent exec through the runtime dispatcher

Executive Summary

This PR wires the existing runtime broker exec endpoint through the Hub's HTTP API, fixing a 404 from scion look against hosted agents. The change is low-to-medium risk — it follows established dispatch patterns faithfully, introduces proper auth gating, and the underlying exec implementation uses safe exec.CommandContext (no shell injection). There are two notable observations around hardcoded exit codes and timeout propagation that should be addressed.


Critical Issues

1. Timeout parameter accepted but never propagated to the runtime broker

Severity: Medium
File: pkg/hub/handlers.go (new handleAgentExec) → pkg/hub/broker_http_transport.gopkg/runtimebroker/handlers.go:1286

The Hub handler accepts timeout from the client and marshals it into the broker request body. However, the runtime broker's execCommand handler ignores the timeout field entirely:

// runtimebroker/handlers.go:1286
output, err := rt.Exec(ctx, id, req.Command)  // no timeout passed

The timeout is serialized on the wire but has no effect. A caller specifying "timeout": 5 would expect the command to be killed after 5 seconds, but it will run indefinitely (limited only by the HTTP request context).

Impact: Commands that hang will block until the HTTP client times out, not the requested duration. This is a silent contract violation.

Suggested Fix: This is pre-existing in the runtime broker and not introduced by this PR, but the PR is now exposing this path to Hub clients. Consider documenting this limitation or, ideally, wiring a context.WithTimeout in execCommand.


2. Hardcoded exitCode: 0 in Hub response

Severity: Low-Medium
File: pkg/hub/handlers.go (new handleAgentExec, around the writeJSON call)

writeJSON(w, http.StatusOK, map[string]interface{}{
    "output":   output,
    "exitCode": 0,  // always 0
})

The runtime broker has the same issue (with a TODO comment at runtimebroker/handlers.go:1298). The Hub handler compounds this by also hardcoding exitCode: 0. If the underlying command fails with a non-zero exit, the caller receives exitCode: 0 — a misleading success signal.

Suggested Fix: At minimum, propagate the exit code from the broker response. Currently ExecResponse already has ExitCode int, so the Hub transport could decode it and return it alongside output:

// In broker_http_transport.go ExecAgent:
var result struct {
    Output   string `json:"output"`
    ExitCode int    `json:"exitCode"`
}

Then return both values from the dispatcher chain.


Observations

3. handleAgentExec duplicates the requireRuntimeBrokerAssigned check inline

File: pkg/hub/handlers.go (new handleAgentExec)

The handler manually checks agent.RuntimeBrokerID == "" instead of calling the new requireRuntimeBrokerAssigned(agent) helper that this same PR introduces and uses everywhere else in httpdispatcher.go. This is inconsistent — the dispatcher will also check, making the handler check redundant but not harmful.

Suggested Fix:

if err := requireRuntimeBrokerAssigned(agent); err != nil {
    ServiceNotReady(w, "Agent has no runtime broker assigned — ...")
    return
}

4. No validation on timeout range

File: pkg/hub/handlers.go (new handleAgentExec)

The handler validates len(req.Command) == 0 but does not validate timeout. A negative or extremely large timeout value is accepted silently. While currently unused (see issue #1), if timeout is wired in the future, this could be problematic.

5. map[string]interface{} used for JSON response instead of a typed struct

File: pkg/hub/handlers.go (new handleAgentExec)

The response uses map[string]interface{}{"output": ..., "exitCode": 0} while the runtime broker has a proper ExecResponse struct. Using a typed struct would be more idiomatic and maintainable.

6. String literal → constant migration is a welcome cleanup

The migration of bare "start", "stop", etc. strings to api.AgentAction* constants across both pkg/hub/handlers.go and pkg/runtimebroker/handlers.go is a solid improvement that reduces typo risk and enables compiler-assisted refactoring.


Positive Feedback

  • Auth is correctly gated. The exec action properly goes through the user/agent auth check in handleAgentAction (it is not in the bypass list). In the grove route, exec is correctly added to the ActionAttach authorization policy check alongside start, stop, restart, and message.
  • No command injection risk. The runtime uses exec.CommandContext with a string slice, not a shell — safe by construction.
  • Clean interface extension. DispatchAgentExec follows the exact same pattern as DispatchAgentLogs (precondition check → endpoint lookup → client call), making it easy to review and maintain.
  • Thorough mock updates. All mock dispatchers across test files (mockDispatcher, noopDispatcher, createAgentDispatcher, recordingDispatcher, brokerMockDispatcher) are updated with the new interface method.
  • Good test coverage. Both the direct agent route (/api/v1/agents/{id}/exec) and the grove route (/api/v1/groves/{id}/agents/{slug}/exec) are covered.

Final Verdict

Approve with minor changes. The core wiring is correct, safe, and follows established patterns. The hardcoded exitCode: 0 (issue #2) is the most actionable item — it creates a misleading API contract. The timeout issue (#1) is pre-existing but worth tracking. Neither is blocking for merge given the TODO already exists in the runtime broker.

@ptone
Copy link
Copy Markdown
Member

ptone commented Apr 11, 2026

I was actually hitting this the other day as well - may patch this up to get merged

- Propagate exit code from runtime broker response through the full
  dispatch chain instead of hardcoding 0 in the hub handler.
- Wire context.WithTimeout in the broker's execCommand handler so the
  timeout parameter is actually enforced.
- Use requireRuntimeBrokerAssigned helper for consistency.
- Add validation for negative timeout values.
- Use typed struct for exec JSON response instead of map[string]interface{}.
- Fix trailing newline in agent_actions.go.
@ptone
Copy link
Copy Markdown
Member

ptone commented Apr 11, 2026

rebased commits locally and pushed

@ptone ptone closed this Apr 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants