
v1.2.3: Opus 4.7 default, private Cloud Run, generalized deploy, chunking bytes fix #39

Merged: valentinozegna merged 9 commits into main from experiment/opus-4-7-vertex on Apr 18, 2026


@valentinozegna
Contributor

Summary

  • Adds claude-opus-4-7 as the new default flagship for both Anthropic providers (anthropic, anthropic-vertex); claude-opus-4-6 remains selectable.
  • Generalizes Cloud Run deployment to cover all five providers (Vertex + direct-API) from a single PDF_ANALYZER_PROVIDER knob, with Secret Manager wiring for direct-API keys.
  • Makes Cloud Run private by default (--no-allow-unauthenticated) and fixes the /mcp route so it handles all HTTP methods (previously GET returned a 404, which clients misreported as "SDK auth failed").
  • Fixes a crash where gs:// PDFs that needed chunking fell through the bytes path into validateLocalPath(undefined); the chunking branch now has an exhaustive resolveSourceBytes helper that TypeScript enforces.
  • Closes test coverage gaps: HTTP transport tests now run against the real exported createRequestHandler (previously against an inline copy); resolveSourceBytes has regression tests for the bytes crash case.
  • Slims Cloud Build context 160× via allowlist .gcloudignore / .dockerignore (~20 MiB → ~120 KiB).
  • Hardens .gitignore against common credential, env, and build-cache leaks.

See CHANGELOG.md for the full list.

Test plan

  • npm run type-check && npm run lint && npm test (93 tests pass, was 89)
  • Deployed end-to-end on Cloud Run with each auth mode exercised this session:
    • anthropic-vertex + Opus 4.7 (blocked only by Vertex marketplace quota, which is a GCP concern)
    • anthropic + direct API key from Secret Manager + Opus 4.7 → successful analyze_pdf on 1-pager (verified verbatim transcription and token usage in Anthropic console)
    • google + Gemini direct API key from Secret Manager + Gemini 3.1 Pro → successful analyze_pdf on 18.5 MB / 980-page oversized-doc.pdf via the /analyze REST endpoint, chunked into 2 pieces with rolling findings
  • Verified the chunking-bytes bug before/after fix: pre-fix crashed at validateLocalPath(undefined) on first fallback; post-fix completes

🤖 Generated with Claude Code

valentinozegna and others added 9 commits April 18, 2026 15:48
Adds claude-opus-4-7 to the MODELS list for both the direct anthropic
provider and anthropic-vertex, and flips DEFAULT_MODEL so new callers
pick up the latest flagship automatically. opus-4-6 stays selectable
via PDF_ANALYZER_MODEL for anyone pinned to the previous flagship.
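
The default-plus-override behavior described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the shapes of MODELS and DEFAULT_MODEL, the resolveModel helper, and the decision to reject unknown model names are all assumptions made for the example. Only the model names, the two provider names, and the PDF_ANALYZER_MODEL variable come from this PR.

```typescript
// Hypothetical shapes; the real MODELS / DEFAULT_MODEL layout in the repo may differ.
type AnthropicProvider = "anthropic" | "anthropic-vertex";

const MODELS: Record<AnthropicProvider, readonly string[]> = {
  anthropic: ["claude-opus-4-7", "claude-opus-4-6"],
  "anthropic-vertex": ["claude-opus-4-7", "claude-opus-4-6"],
};

const DEFAULT_MODEL: Record<AnthropicProvider, string> = {
  anthropic: "claude-opus-4-7",
  "anthropic-vertex": "claude-opus-4-7",
};

// Hypothetical helper: use the pinned model when PDF_ANALYZER_MODEL is set,
// otherwise fall back to the provider default.
function resolveModel(
  provider: AnthropicProvider,
  env: Record<string, string | undefined>,
): string {
  const pinned = env["PDF_ANALYZER_MODEL"];
  if (pinned === undefined || pinned === "") {
    return DEFAULT_MODEL[provider];
  }
  // Assumption: unknown model names are rejected rather than passed through.
  if (!MODELS[provider].includes(pinned)) {
    throw new Error(`Unknown model "${pinned}" for provider "${provider}"`);
  }
  return pinned;
}
```

Under these assumptions, a caller with no PDF_ANALYZER_MODEL set gets claude-opus-4-7, while a caller pinned to claude-opus-4-6 keeps it.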

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Streamable HTTP transport requires GET /mcp (for the server→client
SSE channel) in addition to POST. Previously the HTTP server matched
only POST /mcp and returned a bare 404 to GET, which the MCP SDK client
misparsed as an OAuth discovery response and surfaced as "SDK auth
failed: HTTP 404". Route any method on /mcp to
StreamableHTTPServerTransport.handleRequest so the SDK decides the
per-method behavior (405 in stateless mode is fine).
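
As an illustration of the routing change, here is a minimal sketch that separates the routing decision from the HTTP plumbing. The RouteDecision type and the routeBefore / routeAfter functions are hypothetical names for this example; the real handler delegates to StreamableHTTPServerTransport.handleRequest, which is not reproduced here.

```typescript
// Hypothetical routing decision, kept free of I/O so it is easy to test.
type RouteDecision = { kind: "mcp" } | { kind: "not-found"; status: 404 };

// Before the fix, as described above: only POST /mcp matched,
// so GET /mcp and DELETE /mcp fell through to a bare 404.
function routeBefore(method: string, pathname: string): RouteDecision {
  return method === "POST" && pathname === "/mcp"
    ? { kind: "mcp" }
    : { kind: "not-found", status: 404 };
}

// After the fix: every method on /mcp is handed to the MCP transport,
// which decides the per-method behavior itself (for example a 405).
function routeAfter(_method: string, pathname: string): RouteDecision {
  return pathname === "/mcp"
    ? { kind: "mcp" }
    : { kind: "not-found", status: 404 };
}
```

The point of the design is that the server no longer second-guesses which methods the transport supports; it only decides whether the path belongs to the transport.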

Deploy now uses --no-allow-unauthenticated so the service is IAM-private
by default. The post-deploy health check now sends an ADC identity
token. For local access, run `gcloud run services proxy pdf-analyzer
--port=8080`; .mcp.json points at localhost and the proxy injects a
fresh ID token on each request.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous deploys uploaded 20 MiB / 71 files to Cloud Build (test fixtures,
deploy/.terraform, compiled binaries in bin/, etc.) because there was no
.gcloudignore, so gcloud fell back to .gitignore. The Dockerfile only uses
src/, package.json, tsconfig.json. Allowlist exactly that — the upload
drops to ~120 KiB / 32 files, and Docker layer caching becomes sane.

Also gitignore the .playwright-mcp/ artifact directory so browser
automation snapshots don't get accidentally staged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the hardcoded google-vertex path with a provider-aware deploy
that supports all five providers the codebase already implements:

  Vertex (ADC):     google-vertex, anthropic-vertex
  Direct API key:   google, anthropic, openai

Users pick a provider in deploy/env (PDF_ANALYZER_PROVIDER) or
terraform.tfvars (provider_id); the deploy handles the rest.

- Vertex path: enables aiplatform.googleapis.com, grants
  roles/aiplatform.user + roles/storage.objectViewer to the SA, sets
  VERTEX_PROJECT + VERTEX_LOCATION env vars.

- API-key path: enables secretmanager.googleapis.com, expects a
  user-created Secret Manager secret (name in API_KEY_SECRET_NAME /
  api_key_secret_name), grants roles/secretmanager.secretAccessor
  on that specific secret, and injects it at runtime via
  --set-secrets=PDF_ANALYZER_API_KEY=<name>:latest. The script fails
  fast with the exact `gcloud secrets create` command if the secret
  doesn't exist; it never creates secrets or touches key material.

Optional PDF_ANALYZER_MODEL pins a specific model per provider;
otherwise the provider default is used.
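
The provider-aware branching above can be sketched as a small function that maps a provider to its auth mode and to the corresponding `gcloud run deploy` flags. This is a rough sketch only: the actual deploy is a shell script plus Terraform, and DeployConfig, authModeFor, and deployFlags are hypothetical names. Passing PDF_ANALYZER_PROVIDER and PDF_ANALYZER_MODEL through --set-env-vars is also an assumption about how the service receives them.

```typescript
type Provider =
  | "google-vertex"
  | "anthropic-vertex"
  | "google"
  | "anthropic"
  | "openai";

type AuthMode = "vertex-adc" | "api-key";

// Provider-to-auth mapping taken from the matrix in this commit message.
function authModeFor(provider: Provider): AuthMode {
  switch (provider) {
    case "google-vertex":
    case "anthropic-vertex":
      return "vertex-adc";
    case "google":
    case "anthropic":
    case "openai":
      return "api-key";
  }
}

// Hypothetical config shape for the sketch.
interface DeployConfig {
  provider: Provider;
  vertexProject?: string;
  vertexLocation?: string;
  apiKeySecretName?: string;
  model?: string;
}

function deployFlags(cfg: DeployConfig): string[] {
  const flags = ["--no-allow-unauthenticated"]; // private by default
  const envVars = [`PDF_ANALYZER_PROVIDER=${cfg.provider}`];
  if (cfg.model) envVars.push(`PDF_ANALYZER_MODEL=${cfg.model}`);

  if (authModeFor(cfg.provider) === "vertex-adc") {
    if (!cfg.vertexProject || !cfg.vertexLocation) {
      throw new Error("Vertex providers need a project and a location");
    }
    envVars.push(`VERTEX_PROJECT=${cfg.vertexProject}`);
    envVars.push(`VERTEX_LOCATION=${cfg.vertexLocation}`);
  } else {
    // Fail fast: the secret must already exist; this sketch never creates one.
    if (!cfg.apiKeySecretName) {
      throw new Error("API-key providers need an existing Secret Manager secret name");
    }
    flags.push(`--set-secrets=PDF_ANALYZER_API_KEY=${cfg.apiKeySecretName}:latest`);
  }

  flags.push(`--set-env-vars=${envVars.join(",")}`);
  return flags;
}
```

For example, a config with provider "anthropic" and a secret name yields a --set-secrets flag and no Vertex variables, while "google-vertex" yields VERTEX_PROJECT and VERTEX_LOCATION and no secret reference.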

The Terraform config drops the unconditional allUsers run.invoker
binding that was making every deploy public by default. The service
is deployed --no-allow-unauthenticated; callers use
`gcloud run services proxy` (documented in the README and project
CLAUDE.md) to mint fresh ID tokens per request.

All identifiers in tracked files are generic; the README has a
provider x auth matrix and the exact commands per path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Defense-in-depth for an open-source repo:

- Broaden env file coverage (.env.production, .env.development, etc.)
- Catch common credential filenames (service-account JSON keys, GCP keys)
- Ignore *.tsbuildinfo so the TypeScript incremental build cache never
  lands in the repo

None of these are known to be committed; this is preventative so a
contributor can't accidentally push a key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
src/service.ts: extract resolveSourceBytes helper with an exhaustive
switch over the three non-cached PdfSource kinds. The previous chunking
fallback only handled path/url and cast everything else to path, so
gs:// inputs (which arrive as kind:"bytes" after downloadFromGcs) hit
validateLocalPath(undefined) and crashed on .trim(). The exhaustive
switch is now TypeScript-enforced so a new source kind cannot silently
miss this path again.
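
For readers unfamiliar with the pattern, a minimal sketch of an exhaustive switch over a discriminated union follows. The PdfSource field names and the injected Loaders interface are assumptions made to keep the example self-contained and synchronous; the real helper in src/service.ts presumably does its own (likely async) file and network I/O.

```typescript
// Hypothetical shapes for the three non-cached source kinds.
type PdfSource =
  | { kind: "path"; path: string }
  | { kind: "url"; url: string }
  | { kind: "bytes"; bytes: Uint8Array };

// Hypothetical injected loaders, standing in for real file and HTTP reads.
interface Loaders {
  readPath(path: string): Uint8Array;
  fetchUrl(url: string): Uint8Array;
}

function resolveSourceBytes(source: PdfSource, loaders: Loaders): Uint8Array {
  switch (source.kind) {
    case "path":
      return loaders.readPath(source.path);
    case "url":
      return loaders.fetchUrl(source.url);
    case "bytes":
      // Identity case: gs:// inputs arrive here after download.
      return source.bytes;
    default: {
      // If a new kind is added to PdfSource, `source` is no longer `never`
      // at this point and the assignment below stops compiling.
      const unhandled: never = source;
      throw new Error(`Unhandled PdfSource: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

The `never` assignment is what turns a forgotten case into a compile error instead of a runtime cast to the wrong kind.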

src/transports/http.ts: factor the request handler out of startHttpServer
into an exported createRequestHandler. The existing http.test.ts kept a
copy-pasted handler inline that was stale relative to production (it
still matched POST /mcp only), so the GET /mcp routing fix from an
earlier commit would not have been caught by the test. Tests now drive
the real production handler.

src/service.test.ts: regression tests for resolveSourceBytes covering
the kind:"bytes" identity case (the one that was crashing) and
kind:"path" (guard against refactor regressions).

src/transports/http.test.ts: regression tests asserting GET /mcp and
DELETE /mcp do not fall through to the 404 branch.

@valentinozegna merged commit 3acc6b3 into main on Apr 18, 2026
2 checks passed