brokercore: make MITM response-body cap configurable (default 1 GiB)#198
Closed
devniel wants to merge 1 commit into
Closed
brokercore: make MITM response-body cap configurable (default 1 GiB)#198devniel wants to merge 1 commit into
devniel wants to merge 1 commit into
Conversation
MaxResponseBytes was a hardcoded 100 MiB cap on the response bodies streamed back to agents on the MITM proxy ingress. That bound made sense for JSON LLM completions but silently truncates anything bulkier — git clones of medium-or-larger repos, model artifact downloads, dataset fetches — surfacing as 'fetch-pack: invalid index-pack output' for git and broken downloads otherwise. The truncation point itself isn't logged today; the agent just sees a malformed end-of-stream and has to guess. Raise the default to 1 GiB (covers ~all real-world bulk transfers an agent does day-to-day) and make it overridable via AGENT_VAULT_MAX_RESPONSE_BYTES. Accepts a bare byte count or one of KB/MB/GB/TB (base 1024, case-insensitive). '0' opts out entirely. Includes unit tests for parseByteSize (positive cases + overflow + malformed) and resolveMaxResponseBytes (default, override, opt-out, fallback).
|
Just ran into this with a large git repository! This would be very nice to have ❤️ |
12 tasks
saifsmailbox98
added a commit
that referenced
this pull request
Jun 1, 2026
## Summary The MITM proxy hardcoded a 100 MB response body cap (`MaxResponseBytes`) and a 64 MB request body cap (`MaxProxyBodyBytes`) that silently truncated large transfers. `git clone` of any repo with >100 MB of history (TypeScript, React, Kubernetes, most company monorepos) failed with a confusing `early EOF` / `unexpected disconnect while reading sideband packet` error. The server logged `status=200` with no warning — users had no way to tell the proxy was at fault. Confirmed by reproduction: cloning `microsoft/TypeScript` through the proxy truncated at exactly 104,857,600 bytes (the hardcoded limit). Same root cause reported in #210 and external draft PR #198. **Response body cap: now unlimited by default** Response bodies are streamed with a 32 KB `io.Copy` buffer — no OOM risk regardless of size. The old 100 MB cap only broke legitimate transfers. Default is now `0` (unlimited). Operators who want a bandwidth guard can set `--max-response-bytes` / `AGENT_VAULT_MAX_RESPONSE_BYTES`. When a limit IS configured: - **Known oversize** (`Content-Length > limit`): returns 502 Bad Gateway with `X-Agent-Vault-Proxy-Error` and a clear JSON error before streaming any body - **Mid-stream discovery** (chunked, hits limit): aborts the connection via `http.ErrAbortHandler` — client sees a connection reset, not silently corrupted data This follows the industry standard (nginx, Squid, Envoy): reject, never silently truncate. **Request body cap: configurable, default 1 GiB** Raised from 64 MB to 1 GiB via `--max-request-bytes` / `AGENT_VAULT_MAX_REQUEST_BYTES`. Fixes `git push` of large repos and bulk file uploads that previously hit 413. **OOM fix: conditional body materialization** Previously, every request body was fully buffered in RAM by `MaterializeRequestBody` — up to the body cap per concurrent request. Now: - **Known-length bodies without body substitutions** (the common case: git, file uploads, API calls): streamed directly to the upstream without materialization. `Content-Length` is forwarded from the original request. - **Unknown-length (chunked) or body-substitution bodies**: materialized as before, but capped at 64 MB (`MaxMaterializeBytes`) regardless of the request cap. Body substitutions target small API payloads (JSON, form-encoded), never large file uploads. The reorder is safe: `Inject()` takes `(ctx, vaultID, host, path)` — no body dependency — so it moves before materialization. `HasBodySubstitutions()` (new helper) checks the substitution surfaces before deciding the path. **Observability** Both the 502 rejection and mid-stream abort paths log `slog.Warn` with host, path, bytes streamed, and the configured limit. Previously: zero logging on truncation. ## Type of change - [x] Bug fix - [x] New feature - [ ] Refactor / cleanup - [ ] Documentation - [ ] CI / build ## Test plan - [x] Existing tests pass (`make test` — all 20 packages green) - [x] Added/updated tests for new behavior (9 new tests): | Test | Validates | |------|-----------| | `TestResponseLimitRejectsKnownOversizeWith502` | Content-Length > limit → 502 + `X-Agent-Vault-Proxy-Error` + JSON error | | `TestResponseLimitAbortsChunkedMidStream` | Chunked body > limit → connection abort, not clean 200 | | `TestResponseExactFitNoAbort` | Body exactly at limit → completes normally (one-byte probe fix) | | `TestResponseUnlimitedByDefault` | Default `MaxResponseBytes=0` → large response passes through | | `TestRequestBodyCapConfigurable` | Body > configured limit → 413 | | `TestDefaultMaxRequestBytesZeroMeansDefault` | `Options{}` zero-value → 1 GiB default | | `TestHasBodySubstitutions` | Unit test for surface detection (8 subcases) | - [x] Manual testing: built binary, started server, cloned `microsoft/TypeScript` through the proxy — completed successfully (was failing before at 100 MB) Supersedes #198. ## Security checklist - [x] No secrets or credentials in code - [x] No new unauthenticated endpoints - [x] Input validation on new API surfaces - [x] Checked for OWASP top 10 (injection, XSS, etc.) --------- Co-authored-by: saif <11242541+saifsmailbox98@users.noreply.github.com>
Collaborator
|
superseded by #227 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
brokercore.MaxResponseBytesis a hardcoded100 << 20(100 MiB) cap on the response body streamed back to agents on the MITM proxy ingress, applied viaio.Copy(w, io.LimitReader(resp.Body, MaxResponseBytes))ininternal/mitm/forward.goandinternal/mitm/websocket.go.That bound makes sense for JSON LLM completions but silently truncates anything bulkier than 100 MiB:
git cloneover HTTPS of a medium-or-larger repo: hits the cap mid-pack → surfaces to the user asConcretely, a 282 MiB pack on a real repo got truncated at exactly 100 MiB (deterministic, reproduces 1:1 across runs).
Model artifact downloads (Hugging Face checkpoints, ONNX bundles): truncated bundles fail integrity checks downstream.
Datasets / backups / any other bulk transfer: short-reads with no error logged.
The truncation point itself isn't logged today either — the agent just sees a malformed end-of-stream and has to guess at the cause.
Change
DefaultMaxResponseBytes). That covers ~all real-world bulk transfers an agent does day-to-day while still keeping a reasonable defense-in-depth ceiling.AGENT_VAULT_MAX_RESPONSE_BYTES. Accepts:1073741824KB/MB/GB/TB(base 1024, case-insensitive, whitespace-tolerant):5GB,256mb,2TB0to opt out of the cap entirely (operator's call)var MaxResponseBytes. Env mutations after startup are ignored — same lifecycle as the configurableResponseHeaderTimeoutin mitm: make upstream ResponseHeaderTimeout configurable (default 5m) #196.Tests
TestParseByteSize— positive cases (raw ints, all suffix forms, case-insensitivity, whitespace), negative cases (junk, negative numbers, int64 overflow).TestResolveMaxResponseBytes— default, explicit override,0opt-out, malformed env value falling back to default.Both green; the full
internal/mitm/suite is also green with the new value plumbed through (io.LimitReaderalready takes anint64, so call sites needed no changes).Combined with #197
This is the third in a small series that came out of trying to
git clonethrough the proxy on a real repo:ResponseHeaderTimeoutfor reasoning-model latenciesWith all three landed (or workable env vars set), a 282 MiB git clone completes through the proxy in ~1m30s end-to-end — same throughput as direct.
Opening as a draft. Happy to adjust the default (the prior 100 MiB → 1 GiB is a 10× bump; some operators may want a smaller bump like 256 MiB or 512 MiB), the env-var name, or the suffix grammar.