feat(claude-cli): add local Claude Code CLI provider bridge#936
Open
Spherrrical wants to merge 11 commits into
Open
feat(claude-cli): add local Claude Code CLI provider bridge#936Spherrrical wants to merge 11 commits into
Spherrrical wants to merge 11 commits into
Conversation
Spawn the local `claude` binary as a subprocess and expose it as an
Anthropic Messages-compatible provider. Hosted in brightstaff
(`CLAUDE_CLI_LISTEN_ADDR`), with session reuse, idle TTL, and watchdog.
User-facing surface is `model_providers: [{ model: claude-cli/* }]` —
the Python CLI auto-fills name/provider_interface/base_url/access_key
and the launcher (native + supervisord) enables the bridge listener
only when at least one claude-cli provider is present.
target_endpoint_for_provider was rewriting the upstream path to /v1/chat/completions for any provider that wasn't Anthropic/Vercel, which made Plano POST /v1/chat/completions to the brightstaff bridge. The bridge only accepts POST /v1/messages, so it returned a plain "not found" 404 to the client. Treat ClaudeCli the same as Anthropic for path selection (and force /v1/messages even when the client framed the request as OpenAI Chat Completions or Responses, since the bridge always speaks Anthropic Messages on the wire).
…-await
- Convert ClaudeProcess::last_used from tokio::sync::Mutex<Instant> to
std::sync::Mutex<Instant>: the critical section is one Copy read/write
with no .await, so a sync mutex lets SessionManager iterate sessions
without holding the map lock across an await point. Fixes the
lock-across-await pattern in lru_session_id and evict_idle.
- Simplify SessionManager::get_or_spawn to a single map-lock acquisition
on the fast path; only release the lock for the rare case where we
need to await a victim shutdown before spawning.
- Replace the hand-rolled "deterministic UUID via DefaultHasher" with a
real UUIDv5 over the OID namespace (uuid feature `v5`). Stable across
Rust toolchain versions, unlike SipHash, and matches what the doc on
the helper claimed all along.
- Introduce ProcessError::MissingStdio { which } so spawns where
Stdio::piped() somehow returned None surface as their own programmer-
error variant rather than masquerading as ExitedEarly.
- Delete the dead is_zero() helper.
- The synthetic message_start path only fired when the very first
observed event was a Result. If the CLI ever emitted (say) a bare
ContentBlockStart first, we'd ship malformed Anthropic SSE without a
preceding message_start. Trigger the synthesis on any first
stream-advancing event that isn't a MessageStart.
- Make every send-to-client branch consistent: break out of the loop
when the receiver has gone away (mpsc send returned Err), so we don't
keep generating events for a vanished client.
- Replace serde_json::to_string(...).unwrap() in the streaming error
path with the same fallback json_response already uses ("{}" on
serialize failure). No more panic surface in the streaming worker.
- Drop the dead `_touch_stream_module` placeholder and its unused
`use futures::stream` import.
- main.rs: rebuild claude_cli_config_from_env on top of
SessionManagerConfig::default() and only override fields that have a
parsed env var, so the defaults live in exactly one place.
- hermesllm/apis/claude_cli.rs: delete the dead
`_touch_messages_message_type` stub and its unused MessagesMessage
import; apply pedantic-clippy fixes that touch the new code
(clone_from over `= x.clone()`, Map::default() over Default::default(),
map_or_else over .map(...).unwrap_or_else(...), str::to_string method
reference, collapsed identical match arms).
- hermesllm/providers/id.rs: collapse the two match arms that mapped
"claude-cli" and "claude_cli" to ProviderId::ClaudeCli.
- hermesllm/tests/claude_cli_fixtures.rs: collect text deltas straight
into a String instead of `.collect::<Vec<_>>().join("")`.
- brightstaff/tests/claude_cli_bridge.rs: add a Drop impl on
BridgeFixture so a panicking test still releases the listener task.
`--no-session-persistence` only blocks resumability — Claude Code still writes `~/.claude/projects/<workspace>/<id>.jsonl` for every session. Reusing our deterministic brightstaff session id (a v5 UUID hashed from the conversation prefix) caused the CLI to fail every second request for the same conversation with `Error: Session ID ... is already in use`. Generate a per-spawn random v4 UUID inside `ClaudeProcess::spawn` and pass that to `claude --session-id` (and stamp it on every stdin JSONL event so the CLI accepts the turn). Keep the deterministic brightstaff session id as the `SessionManager` map key so retries still hit the hot child.
Trivy security-scan flagged uv 0.11.7 (currently fetched by an unpinned `pip install uv`) because it bundles rustls-webpki 0.103.10. The advisory (DoS via panic on malformed CRL BIT STRING) is fixed in 0.103.13. uv 0.11.11 picks up the fixed rustls-webpki, so we pin to that floor.
Drops the bullet-list capability dump, the relative-path "or in this repo" line, and the verbose dismissal block (which leaked the ack file path into user-visible output). The panel is now ~6 lines: title with interface(s), one sentence summary, "Learn more" pointing at docs.planoai.dev, and a one-line `--ack-local-agents` hint. The full trust-model write-up and the `rm` instruction live in the docs page. Also tightens the acknowledged-already and ack-success lines (no path leak) and switches the parenthetical name list to skip autofilled `<interface>/...` model strings.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a
claude-cliprovider that runs the localclaudeCode CLI binary as a subprocess instead of calling Anthropic's API. The bridge is hosted insidebrightstaff(gated onCLAUDE_CLI_LISTEN_ADDR) and presents an Anthropic Messages API on a loopback port; the rest of Plano's translation layer takes care of OpenAI Chat Completions / Responses clients.User-facing surface is a single line:
The Python CLI auto-fills
name,provider_interface: claude-cli,base_url: http://127.0.0.1:14001, and a placeholderaccess_key; the launcher (native + supervisord) only enables the bridge listener when at least oneclaude-cliprovider is in the rendered config (zero-cost otherwise).What's in the change
crates/hermesllm/src/apis/claude_cli.rs— serde types for the CLI'sstream-jsonNDJSON, plus translators betweenMessagesRequest/MessagesResponse/MessagesStreamEventand the CLI's stdin/stdout shapes. NDJSON fixture tests undercrates/hermesllm/tests/fixtures/claude_cli/.crates/hermesllm/src/providers/id.rs+src/clients/endpoints.rs—ProviderId::ClaudeCli,compatible_api_for_clientalways pickingAnthropicMessagesAPI, andtarget_endpoint_for_providerkeeping the upstream path at/v1/messagesregardless of how the client framed the request.crates/hermesllm/src/bin/provider_models.yaml—claude-cli/sonnet|opus|haikufamily aliases and the dated full ids.crates/brightstaff/src/handlers/claude_cli/{process,session,server,mod}.rs— child-process spawning (with env scrubbing andbypassPermissions), session reuse keyed byx-arch-claude-cli-sessionheader or a deterministic hash, idle TTL + LRU cap, per-line watchdog, and a hyper listener that speaksPOST /v1/messages(SSE + non-streaming). Wired intomain.rsbehindCLAUDE_CLI_LISTEN_ADDR. Integration test undercrates/brightstaff/tests/claude_cli_bridge.rsusing afake_claude.shfixture.crates/common/src/configuration.rs—LlmProviderType::ClaudeClienum variant.cli/planoai/config_generator.py— auto-detectsclaude-cli/*model providers and fills implicit fields, with unit tests.cli/planoai/native_runner.py+config/supervisord.conf— setCLAUDE_CLI_LISTEN_ADDR(and pass throughCLAUDE_CLI_*overrides) only when the rendered config has aclaude-cliprovider.config/plano_config_schema.yaml— addsclaude-clito theprovider_interfaceenum.demos/integrations/claude_cli/{config.yaml,README.md}— minimal one-line example.Test plan
cargo test -p hermesllm --lib(translation + endpoint mapping)cargo test -p hermesllm --test claude_cli_fixtures(NDJSON fixtures)cargo test -p brightstaff --test claude_cli_bridge(fake CLI end-to-end)cargo clippy --locked --all-targets --all-features -- -D warningscargo fmt --all -- --checkcd cli && uv run pytest -v(autofill + native_runner env injection)planoai up demos/integrations/claude_cli/config.yamlPOST /v1/messages(Anthropic, streaming + non-streaming) returns proper Anthropic SSE / JSONPOST /v1/chat/completionswithmodel: claude-cli/sonnetreturns OpenAI Chat Completion shape (translation round-trip)provider_models.yamlagainst the latest Anthropic Claude Code docs before release.