feat(claude-cli): add local Claude Code CLI provider bridge by Spherrrical · Pull Request #936 · katanemo/plano

Spherrrical · 2026-05-04T20:16:48Z

Summary

Adds a claude-cli provider that runs the local claude Code CLI binary as a subprocess instead of calling Anthropic's API. The bridge is hosted inside brightstaff (gated on CLAUDE_CLI_LISTEN_ADDR) and presents an Anthropic Messages API on a loopback port; the rest of Plano's translation layer takes care of OpenAI Chat Completions / Responses clients.

User-facing surface is a single line:

model_providers:
  - model: claude-cli/*

The Python CLI auto-fills name, provider_interface: claude-cli, base_url: http://127.0.0.1:14001, and a placeholder access_key; the launcher (native + supervisord) only enables the bridge listener when at least one claude-cli provider is in the rendered config (zero-cost otherwise).

What's in the change

crates/hermesllm/src/apis/claude_cli.rs — serde types for the CLI's stream-json NDJSON, plus translators between MessagesRequest/MessagesResponse/MessagesStreamEvent and the CLI's stdin/stdout shapes. NDJSON fixture tests under crates/hermesllm/tests/fixtures/claude_cli/.
crates/hermesllm/src/providers/id.rs + src/clients/endpoints.rs — ProviderId::ClaudeCli, compatible_api_for_client always picking AnthropicMessagesAPI, and target_endpoint_for_provider keeping the upstream path at /v1/messages regardless of how the client framed the request.
crates/hermesllm/src/bin/provider_models.yaml — claude-cli/sonnet|opus|haiku family aliases and the dated full ids.
crates/brightstaff/src/handlers/claude_cli/{process,session,server,mod}.rs — child-process spawning (with env scrubbing and bypassPermissions), session reuse keyed by x-arch-claude-cli-session header or a deterministic hash, idle TTL + LRU cap, per-line watchdog, and a hyper listener that speaks POST /v1/messages (SSE + non-streaming). Wired into main.rs behind CLAUDE_CLI_LISTEN_ADDR. Integration test under crates/brightstaff/tests/claude_cli_bridge.rs using a fake_claude.sh fixture.
crates/common/src/configuration.rs — LlmProviderType::ClaudeCli enum variant.
cli/planoai/config_generator.py — auto-detects claude-cli/* model providers and fills implicit fields, with unit tests.
cli/planoai/native_runner.py + config/supervisord.conf — set CLAUDE_CLI_LISTEN_ADDR (and pass through CLAUDE_CLI_* overrides) only when the rendered config has a claude-cli provider.
config/plano_config_schema.yaml — adds claude-cli to the provider_interface enum.
demos/integrations/claude_cli/{config.yaml,README.md} — minimal one-line example.

Test plan

cargo test -p hermesllm --lib (translation + endpoint mapping)
cargo test -p hermesllm --test claude_cli_fixtures (NDJSON fixtures)
cargo test -p brightstaff --test claude_cli_bridge (fake CLI end-to-end)
cargo clippy --locked --all-targets --all-features -- -D warnings
cargo fmt --all -- --check
cd cli && uv run pytest -v (autofill + native_runner env injection)
Live smoke: planoai up demos/integrations/claude_cli/config.yaml
- POST /v1/messages (Anthropic, streaming + non-streaming) returns proper Anthropic SSE / JSON
- POST /v1/chat/completions with model: claude-cli/sonnet returns OpenAI Chat Completion shape (translation round-trip)
Reviewer to spot-check the model list in provider_models.yaml against the latest Anthropic Claude Code docs before release.

Spawn the local `claude` binary as a subprocess and expose it as an Anthropic Messages-compatible provider. Hosted in brightstaff (`CLAUDE_CLI_LISTEN_ADDR`), with session reuse, idle TTL, and watchdog. User-facing surface is `model_providers: [{ model: claude-cli/* }]` — the Python CLI auto-fills name/provider_interface/base_url/access_key and the launcher (native + supervisord) enables the bridge listener only when at least one claude-cli provider is present.

target_endpoint_for_provider was rewriting the upstream path to /v1/chat/completions for any provider that wasn't Anthropic/Vercel, which made Plano POST /v1/chat/completions to the brightstaff bridge. The bridge only accepts POST /v1/messages, so it returned a plain "not found" 404 to the client. Treat ClaudeCli the same as Anthropic for path selection (and force /v1/messages even when the client framed the request as OpenAI Chat Completions or Responses, since the bridge always speaks Anthropic Messages on the wire).

…-await - Convert ClaudeProcess::last_used from tokio::sync::Mutex<Instant> to std::sync::Mutex<Instant>: the critical section is one Copy read/write with no .await, so a sync mutex lets SessionManager iterate sessions without holding the map lock across an await point. Fixes the lock-across-await pattern in lru_session_id and evict_idle. - Simplify SessionManager::get_or_spawn to a single map-lock acquisition on the fast path; only release the lock for the rare case where we need to await a victim shutdown before spawning. - Replace the hand-rolled "deterministic UUID via DefaultHasher" with a real UUIDv5 over the OID namespace (uuid feature `v5`). Stable across Rust toolchain versions, unlike SipHash, and matches what the doc on the helper claimed all along. - Introduce ProcessError::MissingStdio { which } so spawns where Stdio::piped() somehow returned None surface as their own programmer- error variant rather than masquerading as ExitedEarly. - Delete the dead is_zero() helper.

- The synthetic message_start path only fired when the very first observed event was a Result. If the CLI ever emitted (say) a bare ContentBlockStart first, we'd ship malformed Anthropic SSE without a preceding message_start. Trigger the synthesis on any first stream-advancing event that isn't a MessageStart. - Make every send-to-client branch consistent: break out of the loop when the receiver has gone away (mpsc send returned Err), so we don't keep generating events for a vanished client. - Replace serde_json::to_string(...).unwrap() in the streaming error path with the same fallback json_response already uses ("{}" on serialize failure). No more panic surface in the streaming worker. - Drop the dead `_touch_stream_module` placeholder and its unused `use futures::stream` import.

- main.rs: rebuild claude_cli_config_from_env on top of SessionManagerConfig::default() and only override fields that have a parsed env var, so the defaults live in exactly one place. - hermesllm/apis/claude_cli.rs: delete the dead `_touch_messages_message_type` stub and its unused MessagesMessage import; apply pedantic-clippy fixes that touch the new code (clone_from over `= x.clone()`, Map::default() over Default::default(), map_or_else over .map(...).unwrap_or_else(...), str::to_string method reference, collapsed identical match arms). - hermesllm/providers/id.rs: collapse the two match arms that mapped "claude-cli" and "claude_cli" to ProviderId::ClaudeCli. - hermesllm/tests/claude_cli_fixtures.rs: collect text deltas straight into a String instead of `.collect::<Vec<_>>().join("")`. - brightstaff/tests/claude_cli_bridge.rs: add a Drop impl on BridgeFixture so a panicking test still releases the listener task.

`--no-session-persistence` only blocks resumability — Claude Code still writes `~/.claude/projects/<workspace>/<id>.jsonl` for every session. Reusing our deterministic brightstaff session id (a v5 UUID hashed from the conversation prefix) caused the CLI to fail every second request for the same conversation with `Error: Session ID ... is already in use`. Generate a per-spawn random v4 UUID inside `ClaudeProcess::spawn` and pass that to `claude --session-id` (and stamp it on every stdin JSONL event so the CLI accepts the turn). Keep the deterministic brightstaff session id as the `SessionManager` map key so retries still hit the hot child.

Trivy security-scan flagged uv 0.11.7 (currently fetched by an unpinned `pip install uv`) because it bundles rustls-webpki 0.103.10. The advisory (DoS via panic on malformed CRL BIT STRING) is fixed in 0.103.13. uv 0.11.11 picks up the fixed rustls-webpki, so we pin to that floor.

Drops the bullet-list capability dump, the relative-path "or in this repo" line, and the verbose dismissal block (which leaked the ack file path into user-visible output). The panel is now ~6 lines: title with interface(s), one sentence summary, "Learn more" pointing at docs.planoai.dev, and a one-line `--ack-local-agents` hint. The full trust-model write-up and the `rm` instruction live in the docs page. Also tightens the acknowledged-already and ack-success lines (no path leak) and switches the parenthetical name list to skip autofilled `<interface>/...` model strings.

Spherrrical added 11 commits May 4, 2026 12:57

chore(claude-cli): tweak demo config (full tracing, drop default flag)

56006f0

cli: warn + ack local-agent providers at planoai up

8e65fca

docs: cover claude-cli trust model and dismissal

294af49

Spherrrical marked this pull request as ready for review May 11, 2026 20:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(claude-cli): add local Claude Code CLI provider bridge#936

feat(claude-cli): add local Claude Code CLI provider bridge#936
Spherrrical wants to merge 11 commits into
mainfrom
musa/claude-cli-provider

Spherrrical commented May 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Spherrrical commented May 4, 2026

Summary

What's in the change

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant