feat: cloud-ready deployment with HTTP transport (v1.2.0) #34
Merged
…upload
- Stage 1: Environment variable configuration (PDF_ANALYZER_PROVIDER, PDF_ANALYZER_MODEL, PDF_ANALYZER_API_KEY) with precedence over OS keychain
- Stage 2: Google Vertex AI provider using ADC via @ai-sdk/google-vertex, shared Google logic extracted to google-shared.ts
- Stage 3: Anthropic Vertex AI provider via @ai-sdk/google-vertex/anthropic subpath
- Stage 4: Streamable HTTP transport (POST /mcp, GET /health) when PORT env var is set
- Stage 5: upload_pdf tool (HTTP mode only) for GCS uploads, gs:// URL support in classifySource
- Stage 6: Dockerfile for Cloud Run deployment (node:22-slim, prod deps only)
- Stage 7: E2E test suite for Cloud Run (test/test-e2e-cloud-run.ts)
- Lazy dynamic imports for cloud-only packages in stdio mode
- Fix pre-existing test/test-e2e-oversized-doc.ts for new multi-provider API

Closes #29, #30, #31, #32, #33
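Stages 1 and 4 can be illustrated with a minimal sketch. The function names, the config shape, and the per-field fallback rule are assumptions made for illustration; only the three PDF_ANALYZER_* variable names and the PORT switch come from the commit message. The environment is passed in as a plain object so the logic stays testable.

```typescript
// Hypothetical types; the PR's actual config types are not shown on this page.
type Env = Record<string, string | undefined>;

interface AnalyzerSettings {
  provider: string | undefined;
  model: string | undefined;
  apiKey: string | undefined;
}

// Assumption: each field is resolved independently. An environment variable,
// when set, takes precedence over the value stored in the OS keychain.
function resolveSettings(env: Env, stored: AnalyzerSettings): AnalyzerSettings {
  return {
    provider: env["PDF_ANALYZER_PROVIDER"] ?? stored.provider,
    model: env["PDF_ANALYZER_MODEL"] ?? stored.model,
    apiKey: env["PDF_ANALYZER_API_KEY"] ?? stored.apiKey,
  };
}

type Transport = { mode: "http"; port: number } | { mode: "stdio" };

// HTTP transport is chosen only when PORT is set; otherwise stdio is used.
function selectTransport(env: Env): Transport {
  const raw = env["PORT"];
  if (raw === undefined || raw === "") {
    return { mode: "stdio" };
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return { mode: "http", port };
}
```

In a real entry point the first argument would presumably be process.env; whether the actual implementation treats an empty string as unset is not stated in the PR.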
- Multi-stage Dockerfile for Cloud Build (dist/ is gitignored)
- Vertex AI provider uses inline bytes instead of File API
- Authenticated GCS downloads for gs:// URIs via @google-cloud/storage
- Remove upload_pdf tool and storage module (analyze_pdf handles URLs and gs:// directly)
- Deploy scripts: gcloud CLI (deploy/gcloud.sh) and Terraform (deploy/main.tf)
- Config-file approach: deploy/env for gcloud, deploy/terraform.tfvars for Terraform
- Optional bring-your-own service account support
- Deployment guide in deploy/README.md
- E2E tests updated: no upload_pdf, multi-query test, no GCS leftovers
- Downgrade SA role from storage.objectAdmin to storage.objectViewer
- Validation scorecard updated to 41/41 PASS
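How a source string might be routed to a GCS download, an HTTP download, or a local read can be sketched as follows. This is not the project's classifySource: the function name, the return shape, and the error behavior are all hypothetical, and the sketch only shows the gs://bucket/object split that an authenticated download would need.

```typescript
// Hypothetical result type; the real classifySource signature is not shown in the PR.
type SourceKind =
  | { kind: "gcs"; bucket: string; object: string }
  | { kind: "url"; url: string }
  | { kind: "path"; path: string };

function classifySourceSketch(source: string): SourceKind {
  const prefix = "gs://";
  if (source.startsWith(prefix)) {
    const rest = source.slice(prefix.length);
    const slash = rest.indexOf("/");
    // Require both a bucket name and a non-empty object name.
    if (slash <= 0 || slash === rest.length - 1) {
      throw new Error(`Malformed gs:// URI: ${source}`);
    }
    return {
      kind: "gcs",
      bucket: rest.slice(0, slash),
      object: rest.slice(slash + 1),
    };
  }
  if (/^https?:\/\//i.test(source)) {
    return { kind: "url", url: source };
  }
  // Anything else is treated as a local file path in this sketch.
  return { kind: "path", path: source };
}
```

The bucket and object pair is what a storage client would take to fetch the file; with the read-only storage.objectViewer role mentioned above, downloads should succeed while writes are refused.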
Large PDFs (100+ pages) sent inline to Vertex AI may require chunking into multiple sequential API calls, exceeding the default 5min timeout.
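To see why chunking pushes total latency past a fixed timeout, consider a rough sketch of page-range chunking with sequential calls. The chunk size, the PageRange type, and both function names are assumptions for illustration; the PR does not show how the chunking is actually done.

```typescript
// 1-based, inclusive page range (hypothetical type).
interface PageRange {
  start: number;
  end: number;
}

// Split a document of totalPages into consecutive ranges of at most chunkSize pages.
function pageChunks(totalPages: number, chunkSize: number): PageRange[] {
  if (!Number.isInteger(totalPages) || totalPages < 0) {
    throw new RangeError("totalPages must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges: PageRange[] = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    ranges.push({ start, end: Math.min(start + chunkSize - 1, totalPages) });
  }
  return ranges;
}

// Each chunk waits for the previous one, so total time is the sum of all calls.
async function analyzeSequentially<T>(
  ranges: PageRange[],
  analyze: (range: PageRange) => Promise<T>,
): Promise<T[]> {
  const results: T[] = [];
  for (const range of ranges) {
    results.push(await analyze(range));
  }
  return results;
}
```

Because the calls run one after another, a 250-page document split into three chunks takes roughly three times as long as a single call, which is how a per-request limit that is comfortable for small files can be exceeded.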
Large PDFs (e.g., 17MB nRF52840 datasheet) from slow CDNs can exceed the 60-second download timeout, causing "Request timed out" errors.
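One generic way to make such a limit adjustable is to wrap the download promise in a timeout helper, sketched below. The helper name and its parameters are hypothetical, and the PR does not say this is how the download timeout is implemented.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
// Note: this only stops waiting; it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms`));
    }, ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards
  // so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  });
}
```

A caller might use it as withTimeout(download(url), timeoutMs, "PDF download"), where download and timeoutMs are placeholders. To actually abort the network request, an AbortSignal passed to fetch would be needed in addition.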
Add a direct POST /analyze REST endpoint alongside the MCP /mcp endpoint for clients that don't speak MCP or hit MCP client timeout limits. Bump Cloud Run memory from 1GiB to 4GiB to handle large PDFs (the nRF52840 spec caused OOM at 1GiB). Add architecture docs explaining stdio vs HTTP transport modes.
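The resulting endpoint layout can be summarized with a small routing sketch. The three paths and methods come from this PR; the matchRoute function, its return labels, and the handling of unknown paths or wrong methods are assumptions for illustration only.

```typescript
type Route = "mcp" | "analyze" | "health" | "method_not_allowed" | "not_found";

// Expected method per known path (paths and methods as listed in the PR).
const ROUTES: Record<string, { method: string; route: Route }> = {
  "/mcp": { method: "POST", route: "mcp" },
  "/analyze": { method: "POST", route: "analyze" },
  "/health": { method: "GET", route: "health" },
};

function matchRoute(method: string, rawPath: string): Route {
  // Ignore any query string when matching the path.
  const path = rawPath.split("?")[0] ?? rawPath;
  const entry = ROUTES[path];
  if (entry === undefined) {
    return "not_found";
  }
  if (entry.method !== method.toUpperCase()) {
    return "method_not_allowed";
  }
  return entry.route;
}
```

In this layout, MCP clients keep using POST /mcp, while clients that do not speak MCP or that hit MCP client timeout limits call POST /analyze directly; GET /health is left for health checks.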
This was referenced Apr 7, 2026
Summary
- … POST /mcp, POST /analyze, GET /health)
- … gs:// URIs
- … upload_pdf tool (unnecessary in both stdio and HTTP modes)

Closes #29, closes #30, closes #31, closes #32, closes #33
Test plan
- … /analyze endpoint
- gs:// URI authentication verified on Cloud Run
- gcloud.sh script verified