Phase 3a follow-up: MCP server for authoring tools #3

@gregoryfoster

Migrated from CannObserv/watcher#147 during the Archiver service extraction (CannObserv/watcher#149) on 2026-05-07. Paths in the body have been updated to reflect the new archiver repo layout.

Context

Phase 3a (CannObserv/watcher#145) ships authoring tools as REST + Python SDK. The corresponding MCP server (per the plan, Task 11) is split into this dedicated follow-up so the first Phase 3a slice stays focused on the API surface.

Why MCP matters

Claude (in Claude Code, Claude Desktop, and the API directly) discovers tools via the Model Context Protocol. Without an MCP server, every Claude consumer has to hand-derive tool descriptors from OpenAPI — losing the "when to use" hints, safety annotations, and concrete examples that make tool selection reliable.

What to build

tools/archiver_mcp_server.py — a stdio MCP server that wraps the existing ArchiverClient SDK. Tools to expose: validate_info_spec, find_info_item, fetch_and_render, preview_extraction, propose_selectors, create_info_item (with optional initial_info_spec), create_info_spec, get_info_item, list_info_items, get_primary_info_spec, list_active_info_specs, patch_info_spec.

Best practices to encode:

  • Descriptions answer "when to use", not "what it does" — explain intent + flow position + what comes next.
  • Concrete examples in each tool's descriptor, wired to fixture URLs from scripts/smoke_phase3a.sh so they don't rot.
  • Safety annotations: use the spec names (readOnlyHint, destructiveHint, idempotentHint, openWorldHint), not ad-hoc keys (see refinement #2 below).
  • Structured errors: every tool returns either a successful result or {"error": "<code>", "details": {...}} — error codes validation_failed, target_unreachable, not_found, auth_error, server_error. Lets Claude self-recover.
  • Configuration via env: ARCHIVER_BASE_URL + ARCHIVER_API_KEY; no extra secrets.
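A minimal sketch of the env-only configuration described in the last bullet. Only the two variable names come from this issue; `ServerConfig` and `load_config` are hypothetical names used for illustration, and the trailing-slash normalisation is an assumption about what the SDK would prefer.

```python
import os
from dataclasses import dataclass

REQUIRED_VARS = ("ARCHIVER_BASE_URL", "ARCHIVER_API_KEY")


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container for the two settings named in this issue."""
    base_url: str
    api_key: str


def load_config(env=None):
    """Read both required settings and fail early, naming whatever is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    # Assumption: strip a trailing slash so URL joins stay predictable.
    return ServerConfig(
        base_url=env["ARCHIVER_BASE_URL"].rstrip("/"),
        api_key=env["ARCHIVER_API_KEY"],
    )
```

Failing at startup rather than on the first tool call keeps a misconfigured server from ever advertising tools it cannot serve.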

Tasks

Refinements (review pass, 2026-05-06)

Notes from a review of this issue against the Phase 3a plan and the existing SDK. The two biggest calls are #2 and #4 — they determine whether Claude can reliably self-recover.

1. Pin mcp SDK + use FastMCP

Replace the placeholder mcp>=… with a concrete pin (mcp>=1.2,<2). Use the FastMCP high-level API — less boilerplate, decorator-driven tool registration, lifespan hooks for the shared client. Drop to the low-level Server only if FastMCP can't carry the annotations needed.

2. Use the official MCP annotation names, not ad-hoc keys

The MCP spec defines readOnlyHint, destructiveHint, idempotentHint, openWorldHint. Match the spec — Claude Code already special-cases those names when deciding whether to prompt the user. Suggested mapping:

Tool readOnly destructive idempotent openWorld
validate_info_spec
find_info_item, get_info_item, list_info_items, get_primary_info_spec, list_active_info_specs
fetch_and_render, propose_selectors, preview_extraction ✓ (network)
create_info_item, create_info_spec
patch_info_spec ✓ (deactivation breaks consumers)

3. Reuse one ArchiverClient across the process lifetime

Stdio MCP = one subprocess per Claude session. Instantiate the client in the FastMCP lifespan startup hook, aclose() on shutdown. The 60 s _primary_cache then warms across tool calls. Don't open a new client per tool invocation.
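The shape of that lifespan hook can be sketched with the standard library alone. `StubArchiverClient` below is a stand-in, not the real SDK: only `aclose()` is taken from this issue, and the constructor arguments and the `get_info_item` signature are assumptions. An async context manager like `archiver_lifespan` is the kind of object FastMCP's `lifespan` parameter is expected to accept.

```python
import asyncio
from contextlib import asynccontextmanager


class StubArchiverClient:
    """Stand-in for ArchiverClient (hypothetical constructor and methods)."""

    def __init__(self, base_url, api_key):
        self.base_url = base_url
        self.api_key = api_key
        self.calls = 0
        self.closed = False

    async def get_info_item(self, item_id):
        # The real client would make an HTTP request here.
        self.calls += 1
        return {"id": item_id}

    async def aclose(self):
        self.closed = True


@asynccontextmanager
async def archiver_lifespan(server=None):
    """Create one client at startup, share it, and close it on shutdown."""
    client = StubArchiverClient("http://localhost:8000", "dev-key")
    try:
        yield {"client": client}
    finally:
        await client.aclose()


async def demo():
    """Simulate two tool calls that reuse the same client instance."""
    async with archiver_lifespan() as ctx:
        await ctx["client"].get_info_item(1)
        await ctx["client"].get_info_item(2)
        open_during_calls = not ctx["client"].closed
    return ctx["client"], open_during_calls


if __name__ == "__main__":
    client, was_open = asyncio.run(demo())
    print(client.calls, was_open, client.closed)
```

Because every tool call reads the client from the lifespan context, any per-client cache (such as the 60 s `_primary_cache`) survives between calls.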

4. Preserve the structured error codes the SDK already returns

The Phase 3a routes already emit validation_failed / target_unreachable in 422 detail bodies. Don't re-stringify — write a single _to_mcp_error(exc) helper that inspects the ArchiverClientError subclass and parses detail.code when present, falling back to a status→code map. Add a parametrized test that walks every error subclass.

5. Tool descriptions: chain them via "next step"

Each tool's description names the prior and next tool in the canonical flow. The smoke walks: find_info_item → fetch_and_render → propose_selectors → preview_extraction → validate_info_spec → create_info_item(initial_info_spec=…) → get_primary_info_spec. Each description references its neighbours so Claude doesn't have to infer the graph. Add a server-level description that lays out the full flow once.
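One way to keep the neighbour references from drifting is to generate them from a single ordered list. This is only a sketch: the tool order is the smoke flow named in this section, while `CANONICAL_FLOW`, `flow_hint`, `SERVER_FLOW`, and the hint wording are hypothetical.

```python
# Order taken from the smoke flow described in this section.
CANONICAL_FLOW = [
    "find_info_item",
    "fetch_and_render",
    "propose_selectors",
    "preview_extraction",
    "validate_info_spec",
    "create_info_item",
    "get_primary_info_spec",
]


def flow_hint(tool_name):
    """Return a sentence naming the previous and next tool, where they exist."""
    i = CANONICAL_FLOW.index(tool_name)
    parts = []
    if i > 0:
        parts.append(f"Usually follows {CANONICAL_FLOW[i - 1]}.")
    if i < len(CANONICAL_FLOW) - 1:
        parts.append(f"Next step: {CANONICAL_FLOW[i + 1]}.")
    return " ".join(parts)


# One-line summary suitable for the server-level description.
SERVER_FLOW = " -> ".join(CANONICAL_FLOW)
```

Appending `flow_hint(name)` to each hand-written "when to use" description means reordering the flow only requires editing the list.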

6. Wire descriptor examples to the smoke fixtures

Hard-code https://example.com/ and "Example Domain" in the example payloads, the same fixtures scripts/smoke_phase3a.sh uses. Add a smoke step that introspects the MCP server, parses each tool's example, and asserts a non-empty description, an inputSchema, and at least one annotation, so examples and smoke stay in lockstep.

7. Logging to stderr, not stdout

Stdio is the MCP transport — anything on stdout corrupts the protocol. Call configure_logging() once in main() and assert explicitly that handlers target stderr; log a startup line so a misconfig is loud.
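A small sketch of that startup check using only the standard `logging` module. The `configure_logging` body here is a stand-in for the project's own function, and `assert_no_stdout_handlers` and the logger name `archiver_mcp` are hypothetical.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Stand-in for the project's configure_logging(): attach a stderr handler."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(level)


def assert_no_stdout_handlers():
    """Refuse to start if any root handler would write onto the MCP transport."""
    for handler in logging.getLogger().handlers:
        stream = getattr(handler, "stream", None)
        if isinstance(handler, logging.StreamHandler) and stream is sys.stdout:
            raise RuntimeError("log handler targets stdout; would corrupt stdio MCP")


def main():
    configure_logging()
    assert_no_stdout_handlers()
    # Loud startup line so a misconfiguration is visible immediately.
    logging.getLogger("archiver_mcp").info("MCP server starting; logs go to stderr")
```

The check only covers handlers on the root logger; a library that attaches its own stdout handler to a child logger would need the same inspection applied there.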

8. Trim the tool list — both find_info_item and list_info_items?

Today find_info_item rejects empty q (422), so list_info_items isn't redundant. Keep both, but make the descriptions clearly differentiate (search vs. enumerate) or Claude will pick wrong.

9. File path: tools/archiver_mcp_server.py is fine

Matches the precedent at tools/info_changes_consumer.py. Keeps src/ reserved for the FastAPI app + core logic.

10. Sibling follow-ups worth filing

  • Watcher MCP server (create_watch, test_watch) — symmetric piece on the monitoring side; plan mentions it as out of scope and no tracking issue exists.
  • Combined vs. separate MCP servers when Archive (3b) lands — recommend separate (smaller blast radius), but call it out so the next planner doesn't have to re-decide.

11. Test the descriptor shape, not just behaviour

Parametrize one test over all registered tools, asserting:

  • non-empty description (≥ 50 chars — forces "when to use", not just the verb)
  • presence of all four MCP annotations
  • at least one example in the input schema
  • mutating tools have readOnlyHint=False

Catches the silent misclassification class of bug.
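The four assertions above can be sketched as one pure function that a parametrized test calls per tool. The descriptor layout (`name`, `description`, `annotations`, `inputSchema` with a JSON Schema `examples` list) is an assumption about how the server would expose its tools, and the set of mutating tools is inferred from the tool names in this issue.

```python
REQUIRED_HINTS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")

# Assumption: these three are the mutating tools, judging by their names.
MUTATING_TOOLS = {"create_info_item", "create_info_spec", "patch_info_spec"}


def descriptor_problems(tool):
    """Return a list of human-readable problems; an empty list means the shape is fine."""
    problems = []
    if len(tool.get("description", "").strip()) < 50:
        problems.append("description shorter than 50 chars")
    annotations = tool.get("annotations", {})
    missing = [hint for hint in REQUIRED_HINTS if hint not in annotations]
    if missing:
        problems.append("missing annotations: " + ", ".join(missing))
    if not tool.get("inputSchema", {}).get("examples"):
        problems.append("no example in input schema")
    if tool.get("name") in MUTATING_TOOLS and annotations.get("readOnlyHint") is not False:
        problems.append("mutating tool must set readOnlyHint=False")
    return problems


# Illustrative descriptor; the annotation values and payload fields are placeholders.
EXAMPLE_TOOL = {
    "name": "create_info_item",
    "description": (
        "Use after validate_info_spec succeeds to register a new item; "
        "follow with get_primary_info_spec."
    ),
    "annotations": {
        "readOnlyHint": False,
        "destructiveHint": False,
        "idempotentHint": False,
        "openWorldHint": False,
    },
    "inputSchema": {
        "type": "object",
        "examples": [{"url": "https://example.com/", "name": "Example Domain"}],
    },
}
```

In a pytest suite, `assert descriptor_problems(tool) == []` parametrized over every registered tool would report all shape violations for a tool at once rather than stopping at the first.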
