Add upload_pdf tool for remote MCP deployments
Summary
Add a new upload_pdf MCP tool that accepts PDF bytes from the client, uploads them to a GCS bucket on the server side, and returns a GCS URL. This URL can then be passed to analyze_pdf as pdf_source. This enables local-file workflows when the MCP server is deployed remotely (e.g., on Cloud Run).
Crucially, this tool is only registered in HTTP mode. In stdio mode (local), analyze_pdf reads files directly — upload_pdf is not needed and should not be visible to the client.
Motivation
When the MCP server runs locally (stdio), local file paths work — the server reads the file directly. But when deployed as a remote HTTP service, the server has no access to the client's filesystem.
Rather than requiring users to manually upload PDFs to GCS before analysis, the MCP server should handle the upload itself. The workflow becomes:
- MCP client reads the local file and calls upload_pdf with the bytes
- Server uploads to GCS, returns a URL
- Client passes that URL to analyze_pdf
This is transparent to the user — the MCP client orchestrates both calls automatically.
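The three steps above can be sketched from the client side. This is only an illustration under stated assumptions: `CallTool` is a hypothetical stand-in for whatever tool-call function the MCP client exposes, `queries` is assumed to be an array of strings, and the upload result is assumed to be the JSON text `{ url, filename }` that the proposal describes.

```typescript
// Hypothetical stand-in for an MCP client's tool-call function; it is assumed
// to resolve to the text content of the tool result.
type CallTool = (toolName: string, args: Record<string, unknown>) => Promise<string>;

// Step 1: encode the local file's bytes as base64 for upload_pdf.
function buildUploadArgs(bytes: Uint8Array, filename: string) {
  return { pdf_data: Buffer.from(bytes).toString("base64"), filename };
}

// Steps 1-3: upload the bytes, read back the URL, then pass it to analyze_pdf.
async function analyzeLocalPdf(
  callTool: CallTool,
  bytes: Uint8Array,
  filename: string,
  queries: string[],
): Promise<string> {
  const uploadText = await callTool("upload_pdf", buildUploadArgs(bytes, filename));
  // upload_pdf is proposed to return JSON text of the form { url, filename }.
  const { url } = JSON.parse(uploadText) as { url: string };
  return callTool("analyze_pdf", { pdf_source: url, queries });
}
```

Injecting `callTool` keeps the sketch independent of any particular client library and makes the call order easy to check with a fake.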
Proposed changes
1. Transport-aware tool registration in src/server.ts
createServer() should accept a mode parameter ("stdio" | "http") that controls which tools are registered and how they are described:
export const createServer = (mode: "stdio" | "http" = "stdio") => {
  const server = new McpServer({ ... });
  // Always registered; the description varies by mode
  server.registerTool("analyze_pdf", {
    description: mode === "http"
      ? "Analyze a PDF document using AI. Provide a URL, cached file URI, " +
        "or a GCS URL from upload_pdf. For local files, call upload_pdf first."
      : "Analyze a PDF document using AI. Provide an absolute file path, URL, " +
        "or cached file URI (Google only).",
    inputSchema: { ... },
  }, async ({ pdf_source, queries }) => { ... });
  // Only registered in HTTP mode
  if (mode === "http") {
    server.registerTool("upload_pdf", { ... }, async ({ pdf_data, filename }) => { ... });
  }
  return server;
};
Then in runServer():
export const runServer = async () => {
  const port = process.env.PORT;
  if (port) {
    startHttpServer(() => createServer("http"), parseInt(port, 10));
  } else {
    const server = createServer("stdio");
    const transport = new StdioServerTransport();
    await server.connect(transport);
  }
};
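Because the PORT value alone selects the transport, it may be worth validating it before starting the server: `parseInt("8080abc", 10)` returns 8080 and `parseInt("abc", 10)` returns NaN, neither of which is rejected by the snippet above. A minimal sketch of a stricter check follows; `resolveTransport` and the `Transport` type are hypothetical names, not part of the proposal.

```typescript
type Transport = { mode: "stdio" } | { mode: "http"; port: number };

// Decide the transport from a PORT-style value. An unset or empty value means
// stdio; anything else must be a decimal integer in the TCP port range.
function resolveTransport(portValue: string | undefined): Transport {
  if (portValue === undefined || portValue === "") {
    return { mode: "stdio" };
  }
  if (!/^\d+$/.test(portValue)) {
    throw new Error(`Invalid PORT value: ${JSON.stringify(portValue)}`);
  }
  const port = Number(portValue);
  if (port < 1 || port > 65535) {
    throw new Error(`PORT out of range: ${port}`);
  }
  return { mode: "http", port };
}
```

With a helper like this, runServer could branch on the returned `mode` instead of on the raw environment string.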
2. New tool: upload_pdf (HTTP mode only)
server.registerTool("upload_pdf", {
  description:
    "Upload a PDF to cloud storage for analysis. Returns a URL that can be " +
    "passed to analyze_pdf. Use this for local files when the server is remote.",
  inputSchema: {
    pdf_data: z
      .string()
      .describe("Base64-encoded PDF file contents"),
    filename: z
      .string()
      .optional()
      .describe("Optional original filename (used for naming in storage)"),
  },
}, async ({ pdf_data, filename }) => {
  const bytes = Buffer.from(pdf_data, "base64");
  const name = filename || `upload-${Date.now()}.pdf`;
  const url = await uploadToGcs(bytes, name);
  return {
    content: [{
      type: "text",
      text: JSON.stringify({ url, filename: name }),
    }],
  };
});
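The handler above decodes whatever string it receives, and `Buffer.from(value, "base64")` does not throw on malformed input. A small validation step before the upload could look like the sketch below. It assumes standard padded base64 and a PDF header at byte 0; `decodePdf` and the 20 MiB default limit are hypothetical, since the proposal does not specify a size cap.

```typescript
const PDF_MAGIC = "%PDF-";

// Decode a base64 string and check that it plausibly holds a PDF.
// maxBytes is an assumed limit, not something the proposal defines.
function decodePdf(pdfData: string, maxBytes: number = 20 * 1024 * 1024): Buffer {
  // Check the alphabet and padding first, because Buffer.from does not
  // report malformed base64.
  if (pdfData.length % 4 !== 0 || !/^[A-Za-z0-9+\/]*={0,2}$/.test(pdfData)) {
    throw new Error("pdf_data is not valid base64");
  }
  const bytes = Buffer.from(pdfData, "base64");
  if (bytes.length === 0) {
    throw new Error("pdf_data is empty");
  }
  if (bytes.length > maxBytes) {
    throw new Error(`pdf_data exceeds ${maxBytes} bytes`);
  }
  // A PDF file starts with the ASCII header "%PDF-".
  if (bytes.subarray(0, PDF_MAGIC.length).toString("latin1") !== PDF_MAGIC) {
    throw new Error("pdf_data does not look like a PDF");
  }
  return bytes;
}
```

Rejecting bad input here keeps non-PDF objects out of the bucket and gives the client a specific message instead of a later analysis failure.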
2. New file: src/storage.ts
GCS upload logic using @google-cloud/storage:
import { Storage } from "@google-cloud/storage";

const BUCKET = process.env.PDF_UPLOAD_BUCKET;

export async function uploadToGcs(
  data: Buffer,
  filename: string,
): Promise<string> {
  if (!BUCKET) {
    throw new Error(
      "PDF_UPLOAD_BUCKET env var is required for upload_pdf. " +
        "Set it to a GCS bucket name.",
    );
  }
  const storage = new Storage(); // uses ADC
  const bucket = storage.bucket(BUCKET);
  const key = `uploads/${Date.now()}-${filename}`;
  const file = bucket.file(key);
  await file.save(data, { contentType: "application/pdf" });
  return `gs://${BUCKET}/${key}`;
}
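The object key above embeds the client-supplied filename verbatim, so a name containing slashes, spaces, or control characters ends up in the key as-is. One possible way to normalize it first is sketched below; `toObjectKey`, the allowed character set, and the 100-character cap are all assumptions made for illustration.

```typescript
// Build a storage object key from a client-supplied filename.
// `now` is injectable so the result is deterministic in tests.
function toObjectKey(filename: string, now: number = Date.now()): string {
  // Keep only the last path segment, whichever separator the client used.
  const base = filename.split(/[\\/]/).pop() ?? "";
  // Replace anything outside a conservative character set and cap the length.
  const safe = base.replace(/[^A-Za-z0-9._-]/g, "_").slice(0, 100);
  // Fall back to a generic name if nothing usable is left.
  const named = safe === "" || /^\.+$/.test(safe) ? "upload.pdf" : safe;
  return `uploads/${now}-${named}`;
}
```

For example, `toObjectKey("my report.pdf", 123)` yields `uploads/123-my_report.pdf`, and a path such as `../../etc/passwd` is reduced to its final segment.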
3. New dependency
@google-cloud/storage, which src/storage.ts imports for the GCS upload.
4. Update classifySource in src/service.ts
Handle gs:// URLs:
function classifySource(source: string): PdfSource {
  if (source.startsWith("gs://")) {
    // Convert to HTTPS URL for fetching
    const withoutPrefix = source.slice(5);
    const [bucket, ...rest] = withoutPrefix.split("/");
    const objectPath = rest.join("/");
    const url = `https://storage.googleapis.com/${bucket}/${objectPath}`;
    return { kind: "url", url };
  }
  // ... existing logic ...
}
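The conversion above can be isolated into a small helper that also rejects malformed input and percent-encodes each path segment, so object names containing spaces or other reserved characters still produce a valid URL. This is a sketch only; `gsToHttpsUrl` is a hypothetical name and is not part of the existing service code.

```typescript
// Convert gs://bucket/object into the storage.googleapis.com HTTPS form.
function gsToHttpsUrl(source: string): string {
  const match = /^gs:\/\/([^/]+)\/(.+)$/.exec(source);
  if (!match) {
    throw new Error(`Not a gs://bucket/object URL: ${source}`);
  }
  const bucket = match[1];
  const objectPath = match[2];
  // Encode each segment separately so the "/" separators are preserved.
  const encodedPath = objectPath
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return `https://storage.googleapis.com/${bucket}/${encodedPath}`;
}
```

One caveat applies to either version: an unauthenticated fetch of a `storage.googleapis.com` URL only succeeds when the object is publicly readable, so for a private upload bucket the fetch has to carry credentials or the object has to be read through the storage client instead.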
5. Update server instructions
Add upload_pdf to SERVER_INSTRUCTIONS so MCP clients know to use it when dealing with local files and a remote server.
Environment variables
| Variable | Required | Description |
| --- | --- | --- |
| PDF_UPLOAD_BUCKET | Only when upload_pdf is used | GCS bucket name for PDF uploads |
Required IAM roles
The service account needs Storage Admin (or storage.objects.create + storage.objects.get) on the upload bucket. This is already required by the Google Vertex AI provider for the File API.
What the MCP client sees
Stdio mode (local):
- analyze_pdf: accepts file paths, URLs, cached URIs
- No upload_pdf tool visible
HTTP mode (remote):
- analyze_pdf: accepts URLs, cached URIs, GCS URLs. Description tells the client to use upload_pdf for local files.
- upload_pdf: accepts base64 PDF bytes, returns GCS URL
The client doesn't guess — it sees the right tools and descriptions for the server it's connected to.
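The visibility rules above are small enough to pin down in a table-driven helper, which is convenient for a unit test that asserts what each mode exposes. The names `TOOL_MODES` and `visibleTools` are hypothetical; the proposal itself registers tools with a plain `if (mode === "http")` check.

```typescript
type Mode = "stdio" | "http";

// Which modes each tool is registered in, per the proposal.
const TOOL_MODES: Record<string, readonly Mode[]> = {
  analyze_pdf: ["stdio", "http"],
  upload_pdf: ["http"],
};

// Names of the tools a client would see when connected in the given mode.
function visibleTools(mode: Mode): string[] {
  return Object.entries(TOOL_MODES)
    .filter(([, modes]) => modes.includes(mode))
    .map(([toolName]) => toolName);
}
```

Calling `visibleTools("stdio")` returns only `analyze_pdf`, while `visibleTools("http")` returns both tools.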
Workflow (HTTP mode)
User: "Analyze /Users/me/docs/report.pdf"
MCP Client:
1. Sees upload_pdf is available → knows server is remote
2. Reads /Users/me/docs/report.pdf locally
3. Calls upload_pdf with base64-encoded bytes
4. Gets back { url: "gs://bucket/uploads/123-report.pdf" }
5. Calls analyze_pdf with pdf_source = "gs://bucket/uploads/123-report.pdf"
6. Returns analysis results to user
Workflow (stdio mode)
User: "Analyze /Users/me/docs/report.pdf"
MCP Client:
1. No upload_pdf tool → server is local
2. Calls analyze_pdf with pdf_source = "/Users/me/docs/report.pdf"
3. Server reads file directly
4. Returns analysis results to user
What stays the same
- analyze_pdf core logic is unchanged; it already handles URLs
- Local file paths still work in stdio mode (direct filesystem access)
- All provider logic, chunking, and caching are unaffected
Backward compatibility
- No breaking changes. upload_pdf only exists in HTTP mode.
- Stdio mode behaves exactly as it does today: no new tools, no changed descriptions.
- Servers without PDF_UPLOAD_BUCKET configured return a clear error if upload_pdf is called.
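One way to surface the missing-bucket case as a clear error is to convert the thrown exception into a tool result flagged with `isError`, which is the field MCP tool results use to signal failure. The sketch below assumes a text-only result shape; `ToolResult` and `toErrorResult` are hypothetical names, and the server library in use may already perform an equivalent conversion for thrown errors.

```typescript
// Minimal text-only shape of an MCP tool result, for illustration.
interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

// Turn any thrown value into an error result the client can display.
function toErrorResult(err: unknown): ToolResult {
  const message = err instanceof Error ? err.message : String(err);
  return { content: [{ type: "text", text: message }], isError: true };
}
```

A handler would wrap its body in try/catch and return `toErrorResult(err)` from the catch branch, so the message from uploadToGcs about PDF_UPLOAD_BUCKET reaches the client unchanged.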
Cleanup considerations
Uploaded PDFs accumulate in the bucket. Consider:
- Setting a lifecycle rule on the bucket to auto-delete objects after N days
- Or deleting uploaded files after analysis completes
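For the lifecycle-rule option, the configuration is a small JSON document with a Delete action and an age condition measured in days. The sketch below builds that document; `deleteAfterDays` is a hypothetical helper, and the value of N is left to the deployer.

```typescript
// Build a bucket lifecycle configuration that deletes objects older than `days`.
// This sketch requires at least one day.
function deleteAfterDays(days: number) {
  if (!Number.isInteger(days) || days < 1) {
    throw new Error("days must be a positive integer");
  }
  return {
    rule: [
      {
        action: { type: "Delete" },
        condition: { age: days },
      },
    ],
  };
}
```

The output of `JSON.stringify(deleteAfterDays(7))` can be saved to a file and applied to the bucket, for example with `gsutil lifecycle set`. Note that a rule like this applies to every object in the bucket, so it fits best when the bucket is dedicated to uploads.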