toolc


License: MIT · Language: Go · Platform: cross-platform · Default mode: flat · OpenAPI HTTP: beta · MCP stdio: alpha

Compile noisy OpenAPI and MCP tool surfaces into one smaller governed runtime surface for LLM agents.

Validated today with real OpenRouter model runs, real saved benchmark artifacts, and a live OpenAPI invoke check.

Best fit today:

  • VS Code users with too many MCP servers
  • OpenAPI or MCP-heavy agent stacks
  • teams that want less exposed tool surface without a platform rewrite

Install

Two supported paths for v1.0.0:

  1. download a prebuilt archive from GitHub Releases
  2. install from source with Go

go install github.com/aak204/Tool-Catalog-Compiler/cmd/toolc@v1.0.0

After the v1.0.0 tag is published as the latest tag, this also works:

go install github.com/aak204/Tool-Catalog-Compiler/cmd/toolc@latest

Release assets for v1.0.0:

  • toolc_v1.0.0_windows_amd64.zip
  • toolc_v1.0.0_linux_amd64.tar.gz
  • toolc_v1.0.0_linux_arm64.tar.gz
  • toolc_v1.0.0_darwin_amd64.tar.gz
  • toolc_v1.0.0_darwin_arm64.tar.gz
  • SHA256SUMS.txt
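
After downloading an archive, SHA256SUMS.txt lets you verify its integrity before unpacking. A minimal sketch of the mechanism, using a stand-in file name rather than a real asset (for a real download, run `sha256sum -c SHA256SUMS.txt` in the directory holding the archive; on macOS, `shasum -a 256 -c` is the equivalent):

```shell
# Stand-in demonstration of the SHA256SUMS.txt verification mechanism.
# toolc_demo.tar.gz is a hypothetical placeholder, not a real release asset.
echo "demo archive contents" > toolc_demo.tar.gz
sha256sum toolc_demo.tar.gz > SHA256SUMS.demo.txt
sha256sum -c SHA256SUMS.demo.txt   # prints: toolc_demo.tar.gz: OK
```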

VS Code Quickstart

If you already use MCP in VS Code, start here.

toolc can take the MCP config you already use, compile it into one smaller controlled surface, and become the single MCP server your agent sees.

Start here:

  1. build toolc
  2. optimize your existing MCP config
  3. point your extension at the emitted config
  4. run one toolc mcp-serve surface instead of many raw servers

Build once:

PowerShell

./scripts/build.ps1

bash

./scripts/build.sh

Common input config paths:

  • VS Code / Codex-style MCP config: .vscode/mcp.json
  • Roo Code project config: .roo/mcp.json
  • Roo Code global config: mcp_settings.json
  • Cline config: cline_mcp_settings.json
  • Kilo config: kilo.jsonc or .kilocode/mcp.json

Optimize an existing MCP config:

PowerShell

./dist/toolc.exe optimize-mcp `
  -input .vscode/mcp.json `
  -emit vscode `
  -out-dir dist/mcp-optimized

bash

./dist/toolc optimize-mcp \
  -input .vscode/mcp.json \
  -emit vscode \
  -out-dir dist/mcp-optimized

What you get:

  • dist/mcp-optimized/toolc.compiled.json
  • dist/mcp-optimized/optimized.json
  • dist/mcp-optimized/optimization-report.json

Use it:

  • point your extension at dist/mcp-optimized/optimized.json
  • the emitted config starts toolc mcp-serve for you
  • toolc becomes the one MCP surface the extension sees
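
For orientation, the emitted optimized.json is an ordinary client MCP config whose only server entry launches toolc. A hypothetical shape is sketched below; the exact keys depend on the -emit target, so trust the emitted file, not this sketch:

```json
{
  "servers": {
    "toolc": {
      "type": "stdio",
      "command": "./dist/toolc",
      "args": ["mcp-serve", "-catalog", "dist/mcp-optimized/toolc.compiled.json"]
    }
  }
}
```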

Run the MCP surface directly:

PowerShell

./dist/toolc.exe mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json

bash

./dist/toolc mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json

Platform status:

  • release assets are published for Windows, Linux, and macOS
  • primary exercised path today: Windows + PowerShell
  • bash equivalents are provided for core CLI flows
  • Windows and Linux CLI workflows are documented
  • macOS CLI flows are included in release assets, but are less exercised than the Windows path

Current honest boundary:

  • strongest path today: local MCP stdio servers
  • remote MCP transports are preserved when they cannot yet be honestly absorbed

The default product story is simple:

  • flat is the default runtime path
  • staged is the controlled escalation path
  • direct stays a baseline/debug/control mode
  • the wedge is a smaller, governed runtime surface over existing tools, not a platform rewrite

Not the product claim:

  • a full MCP platform
  • universal backend coverage
  • GA-grade authn/authz or full-spec OpenAPI execution
  • a benchmark suite pretending to be the product

What It Does

  • imports OpenAPI/function/MCP-like inputs into one IR
  • compiles them into a smaller, more stable catalog
  • applies policy/governance before exposure
  • serves that catalog either as a lightweight gateway or as one MCP-compatible surface

The strongest current wedge is MCP optimization for editor users:

  1. import the MCP config you already have
  2. probe what can be safely absorbed
  3. compile and stabilize the surface
  4. emit one toolc MCP server config back to the client

Unsupported remote transports are preserved honestly instead of being falsely claimed as absorbed.

Architecture

flowchart LR
    A[OpenAPI / Function / MCP-like input] --> B[Importers]
    B --> C[IR]
    C --> D[Compiler passes]
    D --> E[Compiled catalog]
    E --> F[Mode selection]
    F --> G1[flat]
    F --> G2[staged]
    F --> G3[precompiled artifact]
    E --> H[Gateway]
    H --> I[compact discovery]
    H --> J[schema lookup]
    H --> K[real OpenAPI HTTP invoke]
    H --> L[live MCP stdio invoke]

Runtime Semantics

toolc now treats runtime shape selection as an explicit choice:

direct
  What it does: expose the source surface directly
  Choose it when: baseline/debug, tiny clean catalogs, or benchmark control cases
  Do not choose it when: you want a product default for noisy real-world tool surfaces
  Latency: lowest overhead
  Token surface: worst
  Correctness: weakest
  Integration complexity: lowest

flat
  What it does: one flattened compiled surface or compact discovery wrapper
  Choose it when: primary runtime path for most real deployments
  Do not choose it when: you already know one-shot argument generation is historically weak for the chosen model/provider pair
  Latency: best default
  Token surface: much smaller than direct
  Correctness: better than direct
  Integration complexity: moderate

staged
  What it does: compiled discovery plus schema lookup fallback
  Choose it when: controlled escalation, where schema complexity, malformed arguments, or risk level justify a second pass
  Do not choose it when: you are looking for the default path on every request
  Latency: highest orchestration cost
  Token surface: best dynamic shaping
  Correctness: strongest
  Integration complexity: moderate

Default recommendation:

  • default runtime path: flat
  • staged escalation: use staged only when schema/risk/reliability signals justify it
  • direct exposure: keep direct for baseline/debug/control, not as the product narrative

Benchmark comparisons treat these modes in isolation. The product workflow can still use flat first and escalate deliberately when needed.

You can ask the CLI directly:

./dist/toolc.exe recommend -catalog testdata/golden/openapi.compiled.json

Default Workflow

1. MCP Optimization Flow

PowerShell

./dist/toolc.exe optimize-mcp `
  -input .vscode/mcp.json `
  -emit vscode `
  -out-dir dist/mcp-optimized

bash

./dist/toolc optimize-mcp \
  -input .vscode/mcp.json \
  -emit vscode \
  -out-dir dist/mcp-optimized

What this does:

  • parses the existing client config
  • probes supported local stdio servers
  • compiles one aggregated catalog
  • emits a replacement config pointing to toolc mcp-serve
  • preserves unsupported remote MCP servers when they cannot be honestly absorbed yet

Then point your client at the emitted config:

PowerShell

./dist/toolc.exe mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json

bash

./dist/toolc mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json

2. Precompile-Only Mode

What you change:

  • add one compile step

What you do not change:

  • your runtime path

PowerShell

./scripts/build.ps1

./dist/toolc.exe compile `
  -importer openapi `
  -source testdata/fixtures/openapi.todo.yaml `
  -out dist/openapi.compiled.json

bash

./scripts/build.sh

./dist/toolc compile \
  -importer openapi \
  -source testdata/fixtures/openapi.todo.yaml \
  -out dist/openapi.compiled.json

3. Gateway Sidecar Mode

What you change:

  • point callers at the gateway instead of the direct catalog

What you do not change:

  • your upstream OpenAPI service

PowerShell

./dist/toolc.exe gateway `
  -config testdata/toolc.config.yaml `
  -catalog dist/openapi.compiled.json

bash

./dist/toolc gateway \
  -config testdata/toolc.config.yaml \
  -catalog dist/openapi.compiled.json

4. CI / Prebuild Mode

What you change:

  • compile the catalog in CI and publish the artifact

What you do not change:

  • your service implementation

PowerShell

./scripts/check.ps1

./dist/toolc.exe compile `
  -config testdata/toolc.config.yaml `
  -importer openapi `
  -source testdata/fixtures/openapi.todo.yaml `
  -out dist/openapi.compiled.json

bash

./scripts/check.sh

./dist/toolc compile \
  -config testdata/toolc.config.yaml \
  -importer openapi \
  -source testdata/fixtures/openapi.todo.yaml \
  -out dist/openapi.compiled.json
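
The compile step above can be wired into any CI system. An illustrative GitHub Actions job, reusing the commands from this section (action versions, the Go version, and the assumption that scripts/check.sh produces ./dist/toolc are guesses, not part of this repository):

```yaml
name: compile-catalog
on: [push]
jobs:
  compile:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: 'stable'
      # Assumes check.sh builds ./dist/toolc, as in this section's flow.
      - run: ./scripts/check.sh
      - run: |
          ./dist/toolc compile \
            -config testdata/toolc.config.yaml \
            -importer openapi \
            -source testdata/fixtures/openapi.todo.yaml \
            -out dist/openapi.compiled.json
      - uses: actions/upload-artifact@v4
        with:
          name: compiled-catalog
          path: dist/openapi.compiled.json
```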

5. OpenAPI Runtime Mode

What you change:

  • run toolc in front of a real OpenAPI surface

What you do not change:

  • the upstream API itself

PowerShell

$env:TOOLC_GATEWAY_BACKEND="auto"
$env:TOOLC_GATEWAY_BEARER_TOKEN="***"

./dist/toolc.exe compile `
  -importer openapi `
  -source testdata/fixtures/openapi.openmeteo.yaml `
  -out dist/openmeteo.compiled.json

./dist/toolc.exe gateway `
  -config testdata/toolc.config.yaml `
  -catalog dist/openmeteo.compiled.json

bash

export TOOLC_GATEWAY_BACKEND=auto
export TOOLC_GATEWAY_BEARER_TOKEN='***'

./dist/toolc compile \
  -importer openapi \
  -source testdata/fixtures/openapi.openmeteo.yaml \
  -out dist/openmeteo.compiled.json

./dist/toolc gateway \
  -config testdata/toolc.config.yaml \
  -catalog dist/openmeteo.compiled.json

More detail: docs/integration.md

6. MCP Runtime Mode

What you change:

  • point toolc at an MCP stdio server command

What you do not change:

  • your MCP tool manifest shape
  • the MCP server implementation itself

gateway:
  backend: mcp_stdio
  mcp:
    command: node
    args:
      - path\\to\\your-mcp-server.js

PowerShell

./dist/toolc.exe gateway `
  -config testdata/toolc.config.yaml `
  -catalog dist/mcp.compiled.json `
  -backend mcp_stdio

bash

./dist/toolc gateway \
  -config testdata/toolc.config.yaml \
  -catalog dist/mcp.compiled.json \
  -backend mcp_stdio

Probe a live MCP server before rollout:

PowerShell

./dist/toolc.exe doctor -config dist/mcp-filesystem.config.yaml -probe-backends

bash

./dist/toolc doctor -config dist/mcp-filesystem.config.yaml -probe-backends

Benchmark Snapshot

Benchmarks are here to validate the product thesis, not replace it.

Current saved artifacts are real Windows/amd64 runs from April 13, 2026, using OpenRouter for model-evaluated scenarios, plus a live Open-Meteo invoke check for runtime execution.

synthetic-10
  Useful signal: flat and staged both cut token surface vs direct
  Honest takeaway: even a small noisy catalog benefits from a controlled surface

synthetic-50
  Useful signal: flat usually stays much cheaper than direct, while staged pays extra orchestration cost
  Honest takeaway: medium catalogs are where flat-first vs staged escalation becomes a real tradeoff

github-collaboration-subset
  Useful signal: flat cuts surface hard and usually improves cost materially
  Honest takeaway: the main win is smaller controlled exposure, not universal latency wins on every model

stripe-rest-full-metrics
  Useful signal: importer/compile now survive the full official Stripe spec instead of blowing the process up
  Honest takeaway: large real specs are now bounded by schema expansion guards

openmeteo-runtime-public
  Useful signal: real external invoke completed with success_rate = 1 and about 302 ms average latency in the latest run
  Honest takeaway: real execution is narrow, but it is part of the validated story

Primary benchmark takeaway:

  • flat is the best default product path
  • staged is useful, but should be treated as targeted escalation
  • provider/model behavior still matters enough that benchmark artifacts must stay honest about where wins are not universal

GLM-specific note:

  • z-ai/glm-5.1 currently runs with a model-specific benchmark profile, not the global default
  • current tuned profile: selection steps use a larger completion budget than the global baseline, and argument generation uses a separate lower cap
  • controlled experiment artifacts live in dist/glm-controlled/summary.md

Static Charts

Saved charts cover:

  • token proxy by variant
  • projected cost by variant
  • median latency by variant
  • P95 latency by variant
  • error count by variant
  • success rate by variant
  • compiled-mode break-even
  • small vs large catalog scale behavior
  • importer and compiler memory allocations by scenario

Regenerate charts from saved artifacts:

./scripts/generate-charts.ps1 `
  -Inputs bench/results/benchmark-results.json,dist/synthetic-50-results/benchmark-results.json `
  -Output docs/assets

More detail: docs/benchmarking.md

CLI Surface

toolc [-quiet] [-verbose] [-log-format text|json] <command> [flags]

Primary commands:

  • toolc optimize-mcp
  • toolc mcp-serve
  • toolc compile
  • toolc gateway
  • toolc doctor
  • toolc recommend
  • toolc bench
  • toolc charts

Examples:

./dist/toolc.exe doctor -config testdata/toolc.config.yaml -catalog dist/openapi.compiled.json
./dist/toolc.exe doctor -config dist/mcp-filesystem.config.yaml -probe-backends
./dist/toolc.exe recommend -catalog dist/openapi.compiled.json
./dist/toolc.exe charts -input bench/results/benchmark-results.json -input dist/synthetic-50-results/benchmark-results.json -out docs/assets
./dist/toolc.exe bench -config testdata/toolc.config.yaml -suite bench/suites/release.yaml -resume -model-workers 4 -model-rps 2

Real Execution Boundary

Real execution is currently narrow by design:

  • real: compiled OpenAPI HTTP tools
  • real: MCP tools through the stdio subset backend
  • mock: explicit testing/dry-run backend
  • not implemented: generic function execution, OpenAI-compatible gateway execution
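
For testing and dry-run workflows against the mock boundary, the backend can be selected in the gateway config, mirroring the gateway.backend key used for mcp_stdio elsewhere in this README. A hypothetical fragment:

```yaml
# Hypothetical config fragment: selects the explicit test-only backend.
# The gateway.backend key mirrors the mcp_stdio example in this README.
gateway:
  backend: mock
```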

Current OpenAPI HTTP backend capabilities:

  • shared transport reuse
  • configurable invoke timeout
  • bounded retry/backoff for safe transient failures
  • bearer auth forwarding when imported metadata requires it
  • typed upstream error envelopes with retryability hints

Current MCP stdio backend capabilities:

  • stdio transport
  • initialize plus notifications/initialized
  • tools/call
  • persistent subprocess session reuse
  • real subprocess integration tests
  • clear alpha maturity boundary
  • external validation against official server-everything, server-filesystem, and server-memory
  • reproducible official smoke script: ./scripts/test-mcp-official.ps1
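
The stdio subset above corresponds to the standard MCP JSON-RPC exchange. An illustrative message sequence (field values are assumptions drawn from the MCP specification, not captured toolc output):

```json
{"jsonrpc": "2.0", "id": 1, "method": "initialize",
 "params": {"protocolVersion": "2024-11-05", "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"}}}
{"jsonrpc": "2.0", "method": "notifications/initialized"}
{"jsonrpc": "2.0", "id": 2, "method": "tools/call",
 "params": {"name": "example_tool", "arguments": {}}}
```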

Backend Maturity Table

auto
  Status: beta within v1.0.0
  Notes: real OpenAPI HTTP plus optional MCP stdio when configured

openapi_http
  Status: beta within v1.0.0
  Notes: strongest and most tested real execution path

mcp_stdio
  Status: alpha
  Notes: real stdio subset, not full MCP transport coverage

mock
  Status: test-only boundary
  Notes: explicit fallback for tests and dry-run style workflows

Maturity Matrix

IR
  Status: v1.0.0 release
  Notes: deterministic, validated, directly tested

Compiler
  Status: v1.0.0 release
  Notes: golden-backed and hot-path hardened

Gateway discovery/schema
  Status: v1.0.0 release
  Notes: precomputed runtime views removed major alloc churn

OpenAPI HTTP execution
  Status: beta within v1.0.0
  Notes: real and useful for the tested subset

MCP execution
  Status: alpha
  Notes: live stdio subset exists; not full transport/session coverage

OpenAI-compatible gateway execution
  Status: alpha
  Notes: config exists, backend does not

More detail: docs/maturity.md

Honesty Boundary

Already part of the v1.0.0 release story:

  • precompile workflows
  • sidecar gateway deployments
  • internal API catalogs that already expose too much direct schema noise
  • OpenAPI-first teams that need a smaller discovery/invoke surface

Still important limits in v1.0.0:

  • no caller authn/authz gateway layer
  • no generic function execution backend
  • OpenAPI coverage is practical, not full-spec
  • some medium-catalog cases still need an explicit choice between flat and staged

Intentionally out of scope for this repository state:

  • pretending that one benchmark run proves universal wins
  • claiming GA-grade backend coverage that does not exist
  • forcing a full platform rewrite to adopt toolc

Repository Map

  • cmd/toolc
  • internal/ir
  • internal/importers
  • internal/compiler
  • internal/gateway
  • internal/bench
  • docs/integration.md
  • docs/benchmarking.md
  • docs/architecture.md
  • docs/release-candidate.md

License: MIT.

About

A managed proxy gateway and compiler for AI agents. It takes noisy OpenAPI/MCP tool catalogs, compresses them in flat mode, saves tokens, and adds governance.
