feat: cloud-ready deployment with HTTP transport (v1.2.0)#34

Merged
valentinozegna merged 10 commits into main from feature/cloud-ready on Apr 7, 2026

@valentinozegna
Contributor

Summary

  • Add Streamable HTTP transport for Cloud Run deployment (POST /mcp, POST /analyze, GET /health)
  • Add Google Vertex AI and Anthropic Vertex AI providers (ADC auth, no API keys)
  • Add provider/model configuration via environment variables
  • Add authenticated GCS downloads for gs:// URIs
  • Add deployment scripts (gcloud CLI + Terraform) and documentation
  • Remove upload_pdf tool (unnecessary in both stdio and HTTP modes)
  • Increase PDF fetch timeout to 5 minutes, Cloud Run memory to 4 GiB
  • Add architecture docs explaining stdio vs HTTP transport

Closes #29, closes #30, closes #31, closes #32, closes #33
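As a rough illustration of the three endpoints listed in the summary, the routing could be sketched as below. The `ROUTES` table and `matchRoute` helper are hypothetical names used only for this sketch; the PR's actual HTTP handler may be organized differently.

```typescript
// Hypothetical route table for the endpoints named in the summary.
type Route = "mcp" | "analyze" | "health";

const ROUTES: Record<string, Route> = {
  "POST /mcp": "mcp",
  "POST /analyze": "analyze",
  "GET /health": "health",
};

// Returns the matching route, or undefined for an unknown method/path pair.
function matchRoute(method: string, path: string): Route | undefined {
  // Drop any query string so "/health?probe=1" still matches "/health".
  const pathname = path.split("?")[0];
  return ROUTES[`${method.toUpperCase()} ${pathname}`];
}
```

Keeping the method in the lookup key means a `GET /analyze` request falls through as unknown rather than being treated as an analysis call.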

Test plan

  • HTTP MCP tested with small PDFs (BMP280 datasheet) via MCP client
  • HTTP MCP tested with large PDFs (nRF52840 spec, 500+ pages) via /analyze endpoint
  • gs:// URI authentication verified on Cloud Run
  • Stdio MCP still works locally
  • Type check passes
  • Deployment via gcloud.sh script verified

…upload

- Stage 1: Environment variable configuration (PDF_ANALYZER_PROVIDER, PDF_ANALYZER_MODEL, PDF_ANALYZER_API_KEY) with precedence over OS keychain
- Stage 2: Google Vertex AI provider using ADC via @ai-sdk/google-vertex, shared Google logic extracted to google-shared.ts
- Stage 3: Anthropic Vertex AI provider via @ai-sdk/google-vertex/anthropic subpath
- Stage 4: Streamable HTTP transport (POST /mcp, GET /health) when PORT env var is set
- Stage 5: upload_pdf tool (HTTP mode only) for GCS uploads, gs:// URL support in classifySource
- Stage 6: Dockerfile for Cloud Run deployment (node:22-slim, prod deps only)
- Stage 7: E2E test suite for Cloud Run (test/test-e2e-cloud-run.ts)
- Lazy dynamic imports for cloud-only packages in stdio mode
- Fix pre-existing test/test-e2e-oversized-doc.ts for new multi-provider API

Closes #29, #30, #31, #32, #33
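Stages 1 and 4 above can be sketched roughly as follows: environment variables take precedence over stored settings, and HTTP mode is selected when `PORT` is set. The `AnalyzerConfig` shape, the `keychain` argument, and both function names are assumptions made for illustration; treating an empty variable as unset is also an assumption, not confirmed by this PR.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape; the real module may store other fields.
interface AnalyzerConfig {
  provider: string | undefined;
  model: string | undefined;
  apiKey: string | undefined;
}

// Environment variables win; keychain values only fill the gaps.
// Assumption: an empty string counts as "not set".
function resolveConfig(env: Env, keychain: AnalyzerConfig): AnalyzerConfig {
  return {
    provider: env["PDF_ANALYZER_PROVIDER"] || keychain.provider,
    model: env["PDF_ANALYZER_MODEL"] || keychain.model,
    apiKey: env["PDF_ANALYZER_API_KEY"] || keychain.apiKey,
  };
}

type Transport = { kind: "http"; port: number } | { kind: "stdio" };

// HTTP transport when PORT holds a positive integer; otherwise stdio.
function selectTransport(env: Env): Transport {
  const raw = env["PORT"];
  const port = Number(raw);
  if (raw && Number.isInteger(port) && port > 0) {
    return { kind: "http", port };
  }
  return { kind: "stdio" };
}
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps both functions easy to exercise in tests.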

- Multi-stage Dockerfile for Cloud Build (dist/ is gitignored)
- Vertex AI provider uses inline bytes instead of File API
- Authenticated GCS downloads for gs:// URIs via @google-cloud/storage
- Remove upload_pdf tool and storage module (analyze_pdf handles URLs and gs:// directly)
- Deploy scripts: gcloud CLI (deploy/gcloud.sh) and Terraform (deploy/main.tf)
- Config-file approach: deploy/env for gcloud, deploy/terraform.tfvars for Terraform
- Optional bring-your-own service account support
- Deployment guide in deploy/README.md
- E2E tests updated: no upload_pdf, multi-query test, no GCS leftovers
- Downgrade SA role from storage.objectAdmin to storage.objectViewer
- Validation scorecard updated to 41/41 PASS
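To illustrate the gs:// handling mentioned in the list above, a URI can be split into a bucket and an object name before an authenticated download. `parseGsUri` and `GcsLocation` are hypothetical names for this sketch; the PR's own parsing may differ.

```typescript
// Hypothetical result type for a parsed gs:// URI.
interface GcsLocation {
  bucket: string;
  object: string;
}

// Splits "gs://bucket/path/to/file.pdf" into bucket and object name.
// Returns undefined for anything that is not a complete gs:// URI.
function parseGsUri(uri: string): GcsLocation | undefined {
  const prefix = "gs://";
  if (!uri.startsWith(prefix)) return undefined;
  const rest = uri.slice(prefix.length);
  const slash = rest.indexOf("/");
  // Require a non-empty bucket and a non-empty object name.
  if (slash <= 0 || slash === rest.length - 1) return undefined;
  return { bucket: rest.slice(0, slash), object: rest.slice(slash + 1) };
}

// With @google-cloud/storage, the download itself would then look like:
//   const [bytes] = await new Storage().bucket(loc.bucket).file(loc.object).download();
// using Application Default Credentials on Cloud Run.
```

Since the commit downgrades the service account to `storage.objectViewer`, a read-only download like the one in the comment is all this path should need.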

Large PDFs (100+ pages) sent inline to Vertex AI may require chunking
into multiple sequential API calls, exceeding the default 5-minute timeout.

Large PDFs (e.g., the 17 MB nRF52840 datasheet) from slow CDNs can exceed
the 60-second download timeout, causing "Request timed out" errors.

Add a direct POST /analyze REST endpoint alongside the MCP /mcp endpoint
for clients that don't speak MCP or that hit MCP client timeout limits. Bump
Cloud Run memory from 1 GiB to 4 GiB to handle large PDFs (the nRF52840
spec caused OOM at 1 GiB). Add architecture docs explaining stdio vs HTTP
transport modes.
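The download timeout discussed in these commit messages can be sketched as a generic wrapper around any promise. `withTimeout` and `PDF_FETCH_TIMEOUT_MS` are hypothetical names; the PR may instead pass an abort signal to `fetch`, which also cancels the underlying request (this wrapper only stops waiting for it).

```typescript
// 5 minutes, matching the raised PDF fetch timeout in the summary.
const PDF_FETCH_TIMEOUT_MS = 5 * 60 * 1000;

// Rejects if `work` does not settle within `timeoutMs`.
// Note: this does not cancel `work`; it only stops waiting for it.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Request timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // settle first, then release the timer
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A download could then be wrapped as `withTimeout(downloadPdf(url), PDF_FETCH_TIMEOUT_MS)`, where `downloadPdf` stands in for whatever function performs the fetch.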
@valentinozegna valentinozegna merged commit b016755 into main Apr 7, 2026
2 checks passed
@valentinozegna valentinozegna deleted the feature/cloud-ready branch April 7, 2026 03:29