Skip to content

Fix harness config sync against local Hub storage#122

Open
mfreeman451 wants to merge 4 commits intoGoogleCloudPlatform:mainfrom
mfreeman451:fix/harness-config-local-storage
Open

Fix harness config sync against local Hub storage#122
mfreeman451 wants to merge 4 commits intoGoogleCloudPlatform:mainfrom
mfreeman451:fix/harness-config-local-storage

Conversation

@mfreeman451
Copy link
Copy Markdown
Contributor

Summary

  • add Hub harness-config file endpoints so clients can read or upload files without touching Hub-local file:// paths
  • fall back to those file endpoints when harness-config upload/download URLs point at local filesystem storage
  • cover both pull and sync with focused command tests

Validation

  • go test ./pkg/hubclient -run ^ -count=1
  • go test ./pkg/hub -run ^ -count=1
  • go test ./cmd -run ^TestPullHarnessConfigFromHub_FallsBackToHubFileAPIForLocalStorageURLs -count=1
  • go test ./cmd -run ^TestSyncHarnessConfigToHub_FallsBackToHubFileAPIForLocalStorageURLs -count=1
  • live repro before fix: scion --hub https://scion.carverauto.dev --global harness-config pull codex --to ... failed by trying to open /home/scion/.scion/storage/... locally

Copy link
Copy Markdown
Contributor Author

@mfreeman451 mfreeman451 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mfreeman451 mfreeman451 marked this pull request as ready for review April 10, 2026 16:41
@mfreeman451 mfreeman451 force-pushed the fix/harness-config-local-storage branch from 5b50616 to 620da00 Compare April 11, 2026 05:15
Copy link
Copy Markdown
Contributor Author

@mfreeman451 mfreeman451 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mfreeman451
Copy link
Copy Markdown
Contributor Author

Cleaned up the retry/error-classification path.

Changes:

  • replaced the two inline strings.Contains(...) classifications with explicit local helpers:
    • isHarnessConfigNoFilesError(...)
    • isHarnessConfigMissingFileError(...)
  • kept the branch narrow and did not broaden it into a wider hubclient error refactor

Current head: 131f629f

@ptone
Copy link
Copy Markdown
Member

ptone commented Apr 11, 2026

PR #122 Review: Fix harness config sync against local Hub storage

Executive Summary

This PR adds a fallback mechanism so that when the Hub returns file:// URLs for harness-config uploads/downloads (i.e., local filesystem storage), the CLI routes through new Hub file API endpoints instead of trying to open local paths. The change is well-structured, mirrors existing template-file patterns, and is covered by focused tests. Risk level is low-to-medium -- one missing server-side guard is the only notable concern.


Critical Issues

1. Missing MaxBytesReader on multipart upload endpoint (Medium-High)

File: pkg/hub/harness_config_file_handlers.go:267

The new handleHarnessConfigFileUpload handler calls r.ParseMultipartForm(32 << 20) without first wrapping r.Body with http.MaxBytesReader. The existing template file upload handler (template_file_handlers.go:319-321) and grove workspace handler (grove_workspace_handlers.go:241-244) both set MaxBytesReader before parsing. Without it, a malicious or oversized request can force the server to buffer unbounded data into memory/disk before the 32 MB form-memory threshold even applies, since ParseMultipartForm still reads the entire body into temp files.

Suggested Fix:

// In handleHarnessConfigFileUpload, before ParseMultipartForm:
r.Body = http.MaxBytesReader(w, r.Body, maxUploadTotalSize)
if err := r.ParseMultipartForm(32 << 20); err != nil {

Observations

2. Path traversal surface area (Low -- pre-existing pattern)

File: pkg/hub/harness_config_handlers.go:218-220

The filePath extracted from the URL (strings.TrimPrefix(action, "files/")) is passed directly into storage operations as hc.StoragePath + "/" + filePath without sanitization against .. sequences. However, this mirrors the identical pattern in template_handlers.go:318-320 and is therefore not new to this PR. Noting for awareness only -- a cross-cutting fix would benefit both template and harness-config endpoints.

3. hasLocalSignedURLs / hasLocalDownloadURLs short-circuit on first match

File: cmd/transfer_urls.go:23-39

Both helpers return true as soon as any single URL has a file:// scheme. This is the right behavior for the current use case (a Hub instance either uses local storage or cloud storage uniformly), but if a future Hub mixed local and cloud URLs in the same response, the all-or-nothing fallback would silently skip cloud-signed URLs. This is an acceptable trade-off today -- just flagging the assumption.

4. In-memory multipart buffering on the client side

File: pkg/hubclient/harness_configs.go:276

UploadFilesMultipart buffers all files into a bytes.Buffer before sending. For harness-config directories this is fine (configs are small), but it doesn't scale to large file sets. This matches the existing workspace upload pattern and is acceptable here.

5. Lazy transferClient initialization is not thread-safe

File: pkg/hubclient/harness_configs.go:318-323

getTransferClient() has a check-then-write race. However, this is the exact same pattern used by workspace.go:173 and templates.go:313, so it is consistent with the codebase. In practice, the service structs are used sequentially within a single CLI command, so this is low risk.

6. Error classification via string matching

File: cmd/harness_config.go:629-641

isHarnessConfigNoFilesError and isHarnessConfigMissingFileError use strings.Contains on error messages. This is fragile if the server-side error text changes, but it's a pre-existing pattern and is explicitly acknowledged in commit message ("Make harness-config retry error classification explicit").


Positive Feedback

  • Clean separation of concerns: The transfer logic (cmd/harness_config_transfer.go) is cleanly extracted from the command handlers, making the fallback logic easy to follow and test independently.
  • Good test coverage: Both pull and sync fallback paths are tested with a realistic mock Hub server that returns file:// URLs, verifying the end-to-end fallback behavior.
  • Type alias reuse: HarnessConfigFileListResponse = TemplateFileListResponse etc. avoids duplicating response structs while keeping the types semantically distinct.
  • Consistent routing structure: The handleHarnessConfigByID dispatch mirrors the template handler exactly, making the codebase predictable.
  • escapePathSegments correctly escapes individual path segments rather than the entire path, preserving directory structure in API calls.

Final Verdict

Approve with one requested change: Add http.MaxBytesReader to handleHarnessConfigFileUpload (Issue #1) to match the existing guards on template and workspace upload endpoints. The remaining observations are either pre-existing patterns or acceptable trade-offs for the current scope.

@ptone
Copy link
Copy Markdown
Member

ptone commented Apr 11, 2026

As a heads up - have a planned major refactor of the harness/config system coming up in

https://github.com/GoogleCloudPlatform/scion/blob/main/.design/decoupled-harness-implementation.md

@mfreeman451 mfreeman451 force-pushed the fix/harness-config-local-storage branch from 131f629 to 49332f6 Compare April 11, 2026 22:51
@mfreeman451
Copy link
Copy Markdown
Contributor Author

Addressed the upload-size review point on this branch.

Changes:

  • wrapped the harness-config multipart upload path with http.MaxBytesReader(...) using maxUploadTotalSize
  • switched ParseMultipartForm(...) to the same 100MB limit used by the other local-storage upload handlers
  • return a specific Request body exceeds 100MB limit error for oversized requests instead of treating them as generic multipart parse failures

Validation:

  • go test ./pkg/hub -run ^ -count=1

Current head: c656954c

Copy link
Copy Markdown
Contributor Author

@mfreeman451 mfreeman451 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants