Compile noisy OpenAPI and MCP tool surfaces into one smaller governed runtime surface for LLM agents.
Validated today with real OpenRouter model runs, real saved benchmark artifacts, and a live OpenAPI invoke check.
Best fit today:
- VS Code users with too many MCP servers
- OpenAPI or MCP-heavy agent stacks
- teams that want less exposed tool surface without a platform rewrite
Two supported paths for v1.0.0:
- download a prebuilt archive from GitHub Releases
- install from source with Go
```bash
go install github.com/aak204/Tool-Catalog-Compiler/cmd/toolc@v1.0.0
```

After the v1.0.0 tag is published as the latest tag, this also works:

```bash
go install github.com/aak204/Tool-Catalog-Compiler/cmd/toolc@latest
```

Release assets for v1.0.0:

- `toolc_v1.0.0_windows_amd64.zip`
- `toolc_v1.0.0_linux_amd64.tar.gz`
- `toolc_v1.0.0_linux_arm64.tar.gz`
- `toolc_v1.0.0_darwin_amd64.tar.gz`
- `toolc_v1.0.0_darwin_arm64.tar.gz`
- `SHA256SUMS.txt`
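Before unpacking a downloaded archive, it is worth checking it against `SHA256SUMS.txt`. A minimal sketch of that flow, using a stand-in file so it runs anywhere; with a real release, run `sha256sum -c` in the directory holding the downloaded asset and the published sums file:

```shell
# Stand-in for a downloaded release archive (substitute the real
# toolc_v1.0.0_<os>_<arch> asset and the published SHA256SUMS.txt).
echo "demo archive contents" > toolc_demo.tar.gz
sha256sum toolc_demo.tar.gz > SHA256SUMS.txt

# Verifies every file listed in the sums file; exits non-zero on any mismatch.
sha256sum -c SHA256SUMS.txt
```

On macOS, `shasum -a 256 -c SHA256SUMS.txt` is the drop-in equivalent.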
If you already use MCP in VS Code, start here.
toolc can take the MCP config you already use, compile it into one smaller controlled surface, and become the single MCP server your agent sees.
Start here:
- build `toolc`
- optimize your existing MCP config
- point your extension at the emitted config
- run one `toolc mcp-serve` surface instead of many raw servers
Build once:
PowerShell

```powershell
./scripts/build.ps1
```

bash

```bash
./scripts/build.sh
```

Common input config paths:

- VS Code / Codex-style MCP config: `.vscode/mcp.json`
- Roo Code project config: `.roo/mcp.json`
- Roo Code global config: `mcp_settings.json`
- Cline config: `cline_mcp_settings.json`
- Kilo config: `kilo.jsonc` or `.kilocode/mcp.json`
Optimize an existing MCP config:
PowerShell

```powershell
./dist/toolc.exe optimize-mcp `
  -input .vscode/mcp.json `
  -emit vscode `
  -out-dir dist/mcp-optimized
```

bash

```bash
./dist/toolc optimize-mcp \
  -input .vscode/mcp.json \
  -emit vscode \
  -out-dir dist/mcp-optimized
```

What you get:

- `dist/mcp-optimized/toolc.compiled.json`
- `dist/mcp-optimized/optimized.json`
- `dist/mcp-optimized/optimization-report.json`
Use it:
- point your extension at `dist/mcp-optimized/optimized.json`
- the emitted config starts `toolc mcp-serve` for you
- `toolc` becomes the one MCP surface the extension sees
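The exact contents of `optimized.json` depend on the `-emit` target and your original config. Purely as an illustration (the field values below are assumptions, not the literal emitted output), a VS Code-style result collapses many server entries into a single `toolc` entry along these lines:

```json
{
  "servers": {
    "toolc": {
      "command": "./dist/toolc",
      "args": ["mcp-serve", "-catalog", "dist/mcp-optimized/toolc.compiled.json"]
    }
  }
}
```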
Run the MCP surface directly:
PowerShell

```powershell
./dist/toolc.exe mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json
```

bash

```bash
./dist/toolc mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json
```

Platform status:
- release assets are published for Windows, Linux, and macOS
- primary exercised path today: Windows + PowerShell
- bash equivalents are provided for core CLI flows
- Windows and Linux CLI workflows are documented
- macOS CLI flows are included in release assets, but are less exercised than the Windows path
Current honest boundary:
- strongest path today: local MCP stdio servers
- remote MCP transports are preserved when they cannot yet be honestly absorbed
The default product story is simple:
- `flat` is the default runtime path
- `staged` is the controlled escalation path
- `direct` stays a baseline/debug/control mode
- the wedge is a smaller, governed runtime surface over existing tools, not a platform rewrite
Not the product claim:
- a full MCP platform
- universal backend coverage
- GA-grade authn/authz or full-spec OpenAPI execution
- a benchmark suite pretending to be the product
What `toolc` actually does:

- imports OpenAPI/function/MCP-like inputs into one IR
- compiles them into a smaller, more stable catalog
- applies policy/governance before exposure
- serves that catalog either as a lightweight gateway or as one MCP-compatible surface
The strongest current wedge is MCP optimization for editor users:
- import the MCP config you already have
- probe what can be safely absorbed
- compile and stabilize the surface
- emit one `toolc` MCP server config back to the client
Unsupported remote transports are preserved honestly instead of being falsely claimed as absorbed.
```mermaid
flowchart LR
    A[OpenAPI / Function / MCP-like input] --> B[Importers]
    B --> C[IR]
    C --> D[Compiler passes]
    D --> E[Compiled catalog]
    E --> F[Mode selection]
    F --> G1[flat]
    F --> G2[staged]
    F --> G3[precompiled artifact]
    E --> H[Gateway]
    H --> I[compact discovery]
    H --> J[schema lookup]
    H --> K[real OpenAPI HTTP invoke]
    H --> L[live MCP stdio invoke]
```
toolc now treats runtime shape selection as an explicit choice:
| Mode | What it does | Choose it when | Do not choose it when | Latency | Token surface | Correctness | Integration complexity |
|---|---|---|---|---|---|---|---|
| `direct` | expose the source surface directly | baseline/debug, tiny clean catalogs, or benchmark control cases | you want a product default for noisy real-world tool surfaces | lowest overhead | worst | weakest | lowest |
| `flat` | one flattened compiled surface or compact discovery wrapper | primary runtime path for most real deployments | you already know one-shot argument generation is historically weak for the chosen model/provider pair | best default | much smaller than direct | better than direct | moderate |
| `staged` | compiled discovery plus schema lookup fallback | controlled escalation when schema complexity, malformed arguments, or risk level justify a second pass | you are looking for the default path on every request | highest orchestration cost | best dynamic shaping | strongest | moderate |
Default recommendation:
- default runtime path: `flat`
- staged escalation: use `staged` only when schema/risk/reliability signals justify it
- direct exposure: keep `direct` for baseline/debug/control, not as the product narrative
Benchmark comparisons treat these as isolated mode semantics. The product workflow can still use flat first and escalate deliberately when needed.
You can ask the CLI directly:
```powershell
./dist/toolc.exe recommend -catalog testdata/golden/openapi.compiled.json
```

PowerShell

```powershell
./dist/toolc.exe optimize-mcp `
  -input .vscode/mcp.json `
  -emit vscode `
  -out-dir dist/mcp-optimized
```

bash

```bash
./dist/toolc optimize-mcp \
  -input .vscode/mcp.json \
  -emit vscode \
  -out-dir dist/mcp-optimized
```

What this does:
- parses the existing client config
- probes supported local stdio servers
- compiles one aggregated catalog
- emits a replacement config pointing to `toolc mcp-serve`
- preserves unsupported remote MCP servers when they cannot be honestly absorbed yet
Then point your client at the emitted config:
PowerShell

```powershell
./dist/toolc.exe mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json
```

bash

```bash
./dist/toolc mcp-serve -catalog dist/mcp-optimized/toolc.compiled.json
```

What you change:
- add one compile step
What you do not change:
- your runtime path
PowerShell

```powershell
./scripts/build.ps1
./dist/toolc.exe compile `
  -importer openapi `
  -source testdata/fixtures/openapi.todo.yaml `
  -out dist/openapi.compiled.json
```

bash

```bash
./scripts/build.sh
./dist/toolc compile \
  -importer openapi \
  -source testdata/fixtures/openapi.todo.yaml \
  -out dist/openapi.compiled.json
```

What you change:
- point callers at the gateway instead of the direct catalog
What you do not change:
- your upstream OpenAPI service
PowerShell

```powershell
./dist/toolc.exe gateway `
  -config testdata/toolc.config.yaml `
  -catalog dist/openapi.compiled.json
```

bash

```bash
./dist/toolc gateway \
  -config testdata/toolc.config.yaml \
  -catalog dist/openapi.compiled.json
```

What you change:
- compile the catalog in CI and publish the artifact
What you do not change:
- your service implementation
PowerShell

```powershell
./scripts/check.ps1
./dist/toolc.exe compile `
  -config testdata/toolc.config.yaml `
  -importer openapi `
  -source testdata/fixtures/openapi.todo.yaml `
  -out dist/openapi.compiled.json
```

bash

```bash
./scripts/check.sh
./dist/toolc compile \
  -config testdata/toolc.config.yaml \
  -importer openapi \
  -source testdata/fixtures/openapi.todo.yaml \
  -out dist/openapi.compiled.json
```

What you change:
- run `toolc` in front of a real OpenAPI surface
What you do not change:
- the upstream API itself
PowerShell

```powershell
$env:TOOLC_GATEWAY_BACKEND="auto"
$env:TOOLC_GATEWAY_BEARER_TOKEN="***"
./dist/toolc.exe compile `
  -importer openapi `
  -source testdata/fixtures/openapi.openmeteo.yaml `
  -out dist/openmeteo.compiled.json
./dist/toolc.exe gateway `
  -config testdata/toolc.config.yaml `
  -catalog dist/openmeteo.compiled.json
```

bash

```bash
export TOOLC_GATEWAY_BACKEND=auto
export TOOLC_GATEWAY_BEARER_TOKEN='***'
./dist/toolc compile \
  -importer openapi \
  -source testdata/fixtures/openapi.openmeteo.yaml \
  -out dist/openmeteo.compiled.json
./dist/toolc gateway \
  -config testdata/toolc.config.yaml \
  -catalog dist/openmeteo.compiled.json
```

More detail: docs/integration.md
What you change:
- point `toolc` at an MCP stdio server command
What you do not change:
- your MCP tool manifest shape
- the MCP server implementation itself
```yaml
gateway:
  backend: mcp_stdio
  mcp:
    command: node
    args:
      - path\\to\\your-mcp-server.js
```

PowerShell

```powershell
./dist/toolc.exe gateway `
  -config testdata/toolc.config.yaml `
  -catalog dist/mcp.compiled.json `
  -backend mcp_stdio
```

bash

```bash
./dist/toolc gateway \
  -config testdata/toolc.config.yaml \
  -catalog dist/mcp.compiled.json \
  -backend mcp_stdio
```

Probe a live MCP server before rollout:
PowerShell

```powershell
./dist/toolc.exe doctor -config dist/mcp-filesystem.config.yaml -probe-backends
```

bash

```bash
./dist/toolc doctor -config dist/mcp-filesystem.config.yaml -probe-backends
```

Benchmarks are here to validate the product thesis, not replace it.
Current saved artifacts are real Windows/amd64 runs from April 13, 2026 using OpenRouter for model-evaluated scenarios plus a live Open-Meteo invoke check for runtime execution.
| Scenario | Useful signal | Honest takeaway |
|---|---|---|
| `synthetic-10` | `flat` and `staged` both cut token surface vs `direct` | even a small noisy catalog benefits from a controlled surface |
| `synthetic-50` | `flat` usually stays much cheaper than `direct`, while `staged` pays extra orchestration cost | medium catalogs are where flat-first vs staged escalation becomes a real tradeoff |
| `github-collaboration-subset` | `flat` cuts surface hard and usually improves cost materially | the main win is smaller controlled exposure, not universal latency wins on every model |
| `stripe-rest-full-metrics` | importer/compile now survive the full official Stripe spec instead of blowing the process up | large real specs are now bounded by schema expansion guards |
| `openmeteo-runtime-public` | real external invoke completed with success_rate = 1 and about 302 ms average latency in the latest run | real execution is narrow, but it is part of the validated story |
Primary benchmark takeaways:

- `flat` is the best default product path
- `staged` is useful, but should be treated as targeted escalation
- provider/model behavior still matters enough that benchmark artifacts must stay honest about where wins are not universal
GLM-specific note:
- `z-ai/glm-5.1` currently runs with a model-specific benchmark profile, not the global default
- current tuned profile: selection steps use a larger completion budget than the global baseline, and argument generation uses a separate lower cap
- controlled experiment artifacts live in `dist/glm-controlled/summary.md`
Saved charts cover: token proxy, projected cost, median latency, P95 latency, error count, and success rate by variant; compiled-mode break-even; small vs large catalog behavior; and importer/compiler allocations.
Regenerate charts from saved artifacts:
```powershell
./scripts/generate-charts.ps1 `
  -Inputs bench/results/benchmark-results.json,dist/synthetic-50-results/benchmark-results.json `
  -Output docs/assets
```

More detail: docs/benchmarking.md
```
toolc [-quiet] [-verbose] [-log-format text|json] <command> [flags]
```
Primary commands:
- `toolc optimize-mcp`
- `toolc mcp-serve`
- `toolc compile`
- `toolc gateway`
- `toolc doctor`
- `toolc recommend`
- `toolc bench`
- `toolc charts`
Examples:
```powershell
./dist/toolc.exe doctor -config testdata/toolc.config.yaml -catalog dist/openapi.compiled.json
./dist/toolc.exe doctor -config dist/mcp-filesystem.config.yaml -probe-backends
./dist/toolc.exe recommend -catalog dist/openapi.compiled.json
./dist/toolc.exe charts -input bench/results/benchmark-results.json -input dist/synthetic-50-results/benchmark-results.json -out docs/assets
./dist/toolc.exe bench -config testdata/toolc.config.yaml -suite bench/suites/release.yaml -resume -model-workers 4 -model-rps 2
```

Real execution is currently narrow by design:
- real: compiled OpenAPI HTTP tools
- real: MCP tools through the stdio subset backend
- mock: explicit testing/dry-run backend
- not implemented: generic function execution, OpenAI-compatible gateway execution
Current OpenAPI HTTP backend capabilities:
- shared transport reuse
- configurable invoke timeout
- bounded retry/backoff for safe transient failures
- bearer auth forwarding when imported metadata requires it
- typed upstream error envelopes with retryability hints
Current MCP stdio backend capabilities:
- stdio transport: `initialize` plus `notifications/initialized` and `tools/call`
- persistent subprocess session reuse
- real subprocess integration tests
- clear alpha maturity boundary
- external validation against official `server-everything`, `server-filesystem`, and `server-memory`
- reproducible official smoke script: `./scripts/test-mcp-official.ps1`
| Backend | Status | Notes |
|---|---|---|
| `auto` | beta within v1.0.0 | real OpenAPI HTTP plus optional MCP stdio when configured |
| `openapi_http` | beta within v1.0.0 | strongest and most tested real execution path |
| `mcp_stdio` | alpha | real stdio subset, not full MCP transport coverage |
| `mock` | test-only boundary | explicit fallback for tests and dry-run style workflows |
| Area | Status | Notes |
|---|---|---|
| IR | v1.0.0 release | deterministic, validated, directly tested |
| Compiler | v1.0.0 release | golden-backed and hot-path hardened |
| Gateway discovery/schema | v1.0.0 release | precomputed runtime views removed major alloc churn |
| OpenAPI HTTP execution | beta within v1.0.0 | real and useful for the tested subset |
| MCP execution | alpha | live stdio subset exists; not full transport/session coverage |
| OpenAI-compatible gateway execution | alpha | config exists, backend does not |
More detail: docs/maturity.md
Already part of the v1.0.0 release story:
- precompile workflows
- sidecar gateway deployments
- internal API catalogs that already expose too much direct schema noise
- OpenAPI-first teams that need a smaller discovery/invoke surface
Still important limits in v1.0.0:
- no caller authn/authz gateway layer
- no generic function execution backend
- OpenAPI coverage is practical, not full-spec
- some medium-catalog cases still need an explicit choice between `flat` and `staged`
Intentionally out of scope for this repository state:
- pretending that one benchmark run proves universal wins
- claiming GA-grade backend coverage that does not exist
- forcing a full platform rewrite to adopt `toolc`
- `cmd/toolc`
- `internal/ir`
- `internal/importers`
- `internal/compiler`
- `internal/gateway`
- `internal/bench`
- `docs/integration.md`
- `docs/benchmarking.md`
- `docs/architecture.md`
- `docs/release-candidate.md`
License: MIT.