Accept base64-encoded PDF bytes in analyze_pdf tool

# Add upload_pdf tool for remote MCP deployments

## Summary

Add a new `upload_pdf` MCP tool that accepts PDF bytes from the client, uploads them to a GCS bucket on the server side, and returns a GCS URL. This URL can then be passed to `analyze_pdf` as `pdf_source`. This enables local-file workflows when the MCP server is deployed remotely (e.g., on Cloud Run).

**Crucially, this tool is only registered in HTTP mode.** In stdio mode (local), `analyze_pdf` reads files directly — `upload_pdf` is not needed and should not be visible to the client.

## Motivation

When the MCP server runs locally (stdio), local file paths work — the server reads the file directly. But when deployed as a remote HTTP service, the server has no access to the client's filesystem.

Rather than requiring users to manually upload PDFs to GCS before analysis, the MCP server should handle the upload itself. The workflow becomes:

1. MCP client reads the local file and calls `upload_pdf` with the bytes
2. Server uploads to GCS, returns a URL
3. Client passes that URL to `analyze_pdf`

This is transparent to the user — the MCP client orchestrates both calls automatically.

## Proposed changes

### 1. Transport-aware tool registration in `src/server.ts`

`createServer()` should accept a `mode` parameter (`"stdio" | "http"`) that controls which tools are registered and how they are described:

```typescript
export const createServer = (mode: "stdio" | "http" = "stdio") => {
  const server = new McpServer({ ... });

  // Always registered — description varies by mode
  server.registerTool("analyze_pdf", {
    description: mode === "http"
      ? "Analyze a PDF document using AI. Provide a URL, cached file URI, " +
        "or a GCS URL from upload_pdf. For local files, call upload_pdf first."
      : "Analyze a PDF document using AI. Provide an absolute file path, URL, " +
        "or cached file URI (Google only).",
    inputSchema: { ... },
  }, async ({ pdf_source, queries }) => { ... });

  // Only registered in HTTP mode
  if (mode === "http") {
    server.registerTool("upload_pdf", { ... }, async ({ pdf_data, filename }) => { ... });
  }

  return server;
};
```

Then in `runServer()`:

```typescript
export const runServer = async () => {
  const port = process.env.PORT;

  if (port) {
    startHttpServer(() => createServer("http"), parseInt(port, 10));
  } else {
    const server = createServer("stdio");
    const transport = new StdioServerTransport();
    await server.connect(transport);
  }
};
```
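The mode decision could also be pulled into a small pure function so it can be unit-tested without starting a transport. This is only a sketch: `resolveMode` and `ServerMode` are hypothetical names that are not part of the current code, and rejecting a malformed `PORT` (instead of passing `NaN` on from `parseInt`) is an assumption about the desired behaviour.

```typescript
// Hypothetical helper, not in the existing codebase.
export type ServerMode =
  | { mode: "stdio" }
  | { mode: "http"; port: number };

export function resolveMode(
  env: Record<string, string | undefined>,
): ServerMode {
  const raw = env.PORT;

  // Same rule as runServer(): no PORT (or an empty one) means stdio.
  if (raw === undefined || raw.trim() === "") {
    return { mode: "stdio" };
  }

  // Assumption: a PORT that is not a valid TCP port should fail fast.
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${JSON.stringify(raw)}`);
  }

  return { mode: "http", port };
}
```

`runServer()` could then switch on `resolveMode(process.env).mode` and keep the transport wiring unchanged.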

### 2. New tool: `upload_pdf` (HTTP mode only)

```typescript
server.registerTool("upload_pdf", {
  description:
    "Upload a PDF to cloud storage for analysis. Returns a URL that can be " +
    "passed to analyze_pdf. Use this for local files when the server is remote.",
  inputSchema: {
    pdf_data: z
      .string()
      .describe("Base64-encoded PDF file contents"),
    filename: z
      .string()
      .optional()
      .describe("Optional original filename (used for naming in storage)"),
  },
}, async ({ pdf_data, filename }) => {
  const bytes = Buffer.from(pdf_data, "base64");
  const name = filename || `upload-${Date.now()}.pdf`;
  const url = await uploadToGcs(bytes, name);
  return {
    content: [{
      type: "text",
      text: JSON.stringify({ url, filename: name }),
    }],
  };
});
```
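One thing to watch in the handler above: `Buffer.from(pdf_data, "base64")` does not throw on malformed input, so a bad payload would be uploaded as-is. A cheap check on the string before decoding could look like the sketch below. `checkPdfBase64` and `MAX_PDF_BYTES` are hypothetical, and the 20 MiB limit is an arbitrary placeholder, not a value from this proposal.

```typescript
// Hypothetical limit; this issue does not specify a maximum upload size.
const MAX_PDF_BYTES = 20 * 1024 * 1024;

export type Base64Check =
  | { ok: true; decodedBytes: number }
  | { ok: false; reason: string };

export function checkPdfBase64(
  input: string,
  maxBytes: number = MAX_PDF_BYTES,
): Base64Check {
  // Standard alphabet, length a multiple of 4, at most two "=" pads.
  // Line breaks inside the payload are rejected by this check.
  if (
    input.length === 0 ||
    input.length % 4 !== 0 ||
    !/^[A-Za-z0-9+\/]+={0,2}$/.test(input)
  ) {
    return { ok: false, reason: "pdf_data is not valid base64" };
  }

  // Decoded size follows from the encoded length, so no buffer is needed yet.
  const padding = input.endsWith("==") ? 2 : input.endsWith("=") ? 1 : 0;
  const decodedBytes = (input.length / 4) * 3 - padding;
  if (decodedBytes > maxBytes) {
    return { ok: false, reason: `PDF exceeds ${maxBytes} bytes` };
  }

  // The bytes "%PDF-" always encode to "JVBERi" plus one of "0".."3".
  if (!/^JVBERi[0-3]/.test(input)) {
    return { ok: false, reason: "pdf_data does not start with %PDF-" };
  }

  return { ok: true, decodedBytes };
}
```

The handler could call this first and return the `reason` as a tool error before touching GCS.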

### 3. New file: `src/storage.ts`

GCS upload logic using `@google-cloud/storage`:

```typescript
import { Storage } from "@google-cloud/storage";

const BUCKET = process.env.PDF_UPLOAD_BUCKET;

export async function uploadToGcs(
  data: Buffer,
  filename: string,
): Promise<string> {
  if (!BUCKET) {
    throw new Error(
      "PDF_UPLOAD_BUCKET env var is required for upload_pdf. " +
      "Set it to a GCS bucket name.",
    );
  }

  const storage = new Storage(); // uses ADC
  const bucket = storage.bucket(BUCKET);
  const key = `uploads/${Date.now()}-${filename}`;
  const file = bucket.file(key);

  await file.save(data, { contentType: "application/pdf" });

  return `gs://${BUCKET}/${key}`;
}
```
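Because `filename` comes from the client, it may be worth normalising it before it becomes part of the object key. Spaces, `#`, or `?` in the name would otherwise end up in the `gs://` URL that is later converted for fetching. A possible sketch follows; `buildObjectKey` is a hypothetical helper, and the allowed character set is an assumption rather than a GCS requirement.

```typescript
// Hypothetical helper: builds the object key from a client-supplied filename.
export function buildObjectKey(
  filename: string,
  now: number = Date.now(),
): string {
  // Keep only the last path segment ("/" or "\" separated).
  const base = filename.split(/[\\/]/).pop() ?? "";

  // Replace anything outside a conservative set, then drop leading dots.
  const safe = base.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "");

  // Fall back to a generic name and make sure the key ends in .pdf.
  const name = safe.length > 0 ? safe : "upload.pdf";
  const withExt = name.toLowerCase().endsWith(".pdf") ? name : `${name}.pdf`;

  return `uploads/${now}-${withExt}`;
}
```

`uploadToGcs` could use this in place of the inline template string for `key`.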

### 4. New dependency

```
@google-cloud/storage
```

### 5. Update `classifySource` in `src/service.ts`

Handle `gs://` URLs:

```typescript
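function classifySource(source: string): PdfSource {
  if (source.startsWith("gs://")) {
    // Convert gs://bucket/path to an HTTPS URL for fetching.
    // Note: an unauthenticated fetch of this URL only succeeds for publicly
    // readable objects; private buckets need an authenticated download.
    const withoutPrefix = source.slice("gs://".length);
    const [bucket, ...rest] = withoutPrefix.split("/");
    // Percent-encode each segment so spaces, "#", or "?" in names survive.
    const objectPath = rest.map((segment) => encodeURIComponent(segment)).join("/");
    const url = `https://storage.googleapis.com/${bucket}/${objectPath}`;
    return { kind: "url", url };
  }

  // ... existing logic ...
}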
```
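If the upload bucket stays private, the service would need the bucket and object name separately rather than a public HTTPS URL. A minimal parsing sketch under that assumption is below; `parseGsUri` and `GcsLocation` are hypothetical names, and rejecting URIs without an object path is a choice made for this example.

```typescript
// Hypothetical helper: splits a gs:// URI into bucket and object name.
export interface GcsLocation {
  bucket: string;
  object: string;
}

export function parseGsUri(uri: string): GcsLocation {
  const prefix = "gs://";
  if (!uri.startsWith(prefix)) {
    throw new Error(`Not a gs:// URI: ${uri}`);
  }

  const rest = uri.slice(prefix.length);
  const slash = rest.indexOf("/");

  // Require both a bucket and a non-empty object name.
  if (slash <= 0 || slash === rest.length - 1) {
    throw new Error(`Expected gs://bucket/object, got: ${uri}`);
  }

  return {
    bucket: rest.slice(0, slash),
    object: rest.slice(slash + 1),
  };
}
```

The result could then be handed to `storage.bucket(bucket).file(object).download()` from `@google-cloud/storage`, which would reuse the same credentials as the upload.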

### 6. Update server instructions

Add `upload_pdf` to `SERVER_INSTRUCTIONS` so MCP clients know to use it when dealing with local files and a remote server.

## Environment variables

| Variable | Required | Description |
|----------|----------|-------------|
| `PDF_UPLOAD_BUCKET` | Only when `upload_pdf` is used | GCS bucket name for PDF uploads |

## Required IAM roles

The service account needs the **Storage Admin** role on the upload bucket, or more narrowly a role that grants the `storage.objects.create` and `storage.objects.get` permissions. This is already required by the Google Vertex AI provider for the File API.

## What the MCP client sees

**Stdio mode (local):**
- `analyze_pdf` — accepts file paths, URLs, cached URIs
- No `upload_pdf` tool visible

**HTTP mode (remote):**
- `analyze_pdf` — accepts URLs, cached URIs, GCS URLs. Description tells client to use `upload_pdf` for local files.
- `upload_pdf` — accepts base64 PDF bytes, returns GCS URL

The client doesn't guess — it sees the right tools and descriptions for the server it's connected to.

## Workflow (HTTP mode)

```
User: "Analyze /Users/me/docs/report.pdf"

MCP Client:
  1. Sees upload_pdf is available → knows server is remote
  2. Reads /Users/me/docs/report.pdf locally
  3. Calls upload_pdf with base64-encoded bytes
  4. Gets back { url: "gs://bucket/uploads/123-report.pdf" }
  5. Calls analyze_pdf with pdf_source = "gs://bucket/uploads/123-report.pdf"
  6. Returns analysis results to user
```

## Workflow (stdio mode)

```
User: "Analyze /Users/me/docs/report.pdf"

MCP Client:
  1. No upload_pdf tool → server is local
  2. Calls analyze_pdf with pdf_source = "/Users/me/docs/report.pdf"
  3. Server reads file directly
  4. Returns analysis results to user
```
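From the client's side, both workflows could be expressed as one function that branches on whether `upload_pdf` is advertised. This is a sketch only: `CallTool`, `analyzeLocalPdf`, and `readBase64` are hypothetical and stand in for whatever the MCP client SDK and file access actually look like, and the shape of `queries` is not specified here, so it is passed through untyped.

```typescript
// Hypothetical minimal interface; a real client would wrap its SDK's tool call.
export type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<string>;

export async function analyzeLocalPdf(
  callTool: CallTool,
  availableTools: string[],
  localPath: string,
  readBase64: (path: string) => Promise<string>,
  queries: unknown,
): Promise<string> {
  // No upload_pdf tool: the server is local and can read the path itself.
  if (availableTools.indexOf("upload_pdf") === -1) {
    return callTool("analyze_pdf", { pdf_source: localPath, queries });
  }

  // Remote server: send the bytes first, then analyze the returned URL.
  const pdf_data = await readBase64(localPath);
  const filename = localPath.split(/[\\/]/).pop();
  const raw = await callTool("upload_pdf", { pdf_data, filename });

  const parsed = JSON.parse(raw) as { url?: unknown };
  if (typeof parsed.url !== "string") {
    throw new Error("upload_pdf did not return a url");
  }

  return callTool("analyze_pdf", { pdf_source: parsed.url, queries });
}
```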

## What stays the same

- `analyze_pdf` core logic is unchanged — it already handles URLs
- Local file paths still work in stdio mode (direct filesystem access)
- All provider logic, chunking, and caching are unaffected

## Backward compatibility

- **No breaking changes.** `upload_pdf` only exists in HTTP mode.
- Stdio mode behaves exactly as it does today — no new tools, no changed descriptions.
- Servers without `PDF_UPLOAD_BUCKET` configured return a clear error if `upload_pdf` is called.

## Cleanup considerations

Uploaded PDFs accumulate in the bucket. Consider:
- Setting a [lifecycle rule](https://cloud.google.com/storage/docs/lifecycle) on the bucket to auto-delete objects after N days
- Or deleting uploaded files after analysis completes
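
For the second option, a `try`/`finally` wrapper would make sure the object is removed whether analysis succeeds or fails. The sketch below assumes hypothetical `analyze` and `deleteObject` callbacks; neither exists in the current code.

```typescript
// Hypothetical signatures: this issue does not define these helpers.
type Analyze = (pdfSource: string) => Promise<string>;
type DeleteObject = (gsUri: string) => Promise<void>;

export async function analyzeThenDelete(
  gsUri: string,
  analyze: Analyze,
  deleteObject: DeleteObject,
): Promise<string> {
  try {
    return await analyze(gsUri);
  } finally {
    // Runs on success and on failure. A cleanup error is swallowed so it
    // cannot mask the analysis outcome; a lifecycle rule can catch leftovers.
    await deleteObject(gsUri).catch(() => undefined);
  }
}
```

`deleteObject` could be a thin wrapper around `storage.bucket(name).file(key).delete()` from `@google-cloud/storage`. Note that deleting right away would prevent reusing the same URL for a second `analyze_pdf` call.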

