Skip to content

feat(claude-cli): add local Claude Code CLI provider bridge#936

Open
Spherrrical wants to merge 11 commits into
mainfrom
musa/claude-cli-provider
Open

feat(claude-cli): add local Claude Code CLI provider bridge#936
Spherrrical wants to merge 11 commits into
mainfrom
musa/claude-cli-provider

Conversation

@Spherrrical
Copy link
Copy Markdown
Collaborator

Summary

Adds a claude-cli provider that runs the local claude Code CLI binary as a subprocess instead of calling Anthropic's API. The bridge is hosted inside brightstaff (gated on CLAUDE_CLI_LISTEN_ADDR) and presents an Anthropic Messages API on a loopback port; the rest of Plano's translation layer takes care of OpenAI Chat Completions / Responses clients.

User-facing surface is a single line:

model_providers:
  - model: claude-cli/*

The Python CLI auto-fills name, provider_interface: claude-cli, base_url: http://127.0.0.1:14001, and a placeholder access_key; the launcher (native + supervisord) only enables the bridge listener when at least one claude-cli provider is in the rendered config (zero-cost otherwise).

What's in the change

  • crates/hermesllm/src/apis/claude_cli.rs — serde types for the CLI's stream-json NDJSON, plus translators between MessagesRequest/MessagesResponse/MessagesStreamEvent and the CLI's stdin/stdout shapes. NDJSON fixture tests under crates/hermesllm/tests/fixtures/claude_cli/.
  • crates/hermesllm/src/providers/id.rs + src/clients/endpoints.rsProviderId::ClaudeCli, compatible_api_for_client always picking AnthropicMessagesAPI, and target_endpoint_for_provider keeping the upstream path at /v1/messages regardless of how the client framed the request.
  • crates/hermesllm/src/bin/provider_models.yamlclaude-cli/sonnet|opus|haiku family aliases and the dated full ids.
  • crates/brightstaff/src/handlers/claude_cli/{process,session,server,mod}.rs — child-process spawning (with env scrubbing and bypassPermissions), session reuse keyed by x-arch-claude-cli-session header or a deterministic hash, idle TTL + LRU cap, per-line watchdog, and a hyper listener that speaks POST /v1/messages (SSE + non-streaming). Wired into main.rs behind CLAUDE_CLI_LISTEN_ADDR. Integration test under crates/brightstaff/tests/claude_cli_bridge.rs using a fake_claude.sh fixture.
  • crates/common/src/configuration.rsLlmProviderType::ClaudeCli enum variant.
  • cli/planoai/config_generator.py — auto-detects claude-cli/* model providers and fills implicit fields, with unit tests.
  • cli/planoai/native_runner.py + config/supervisord.conf — set CLAUDE_CLI_LISTEN_ADDR (and pass through CLAUDE_CLI_* overrides) only when the rendered config has a claude-cli provider.
  • config/plano_config_schema.yaml — adds claude-cli to the provider_interface enum.
  • demos/integrations/claude_cli/{config.yaml,README.md} — minimal one-line example.

Test plan

  • cargo test -p hermesllm --lib (translation + endpoint mapping)
  • cargo test -p hermesllm --test claude_cli_fixtures (NDJSON fixtures)
  • cargo test -p brightstaff --test claude_cli_bridge (fake CLI end-to-end)
  • cargo clippy --locked --all-targets --all-features -- -D warnings
  • cargo fmt --all -- --check
  • cd cli && uv run pytest -v (autofill + native_runner env injection)
  • Live smoke: planoai up demos/integrations/claude_cli/config.yaml
    • POST /v1/messages (Anthropic, streaming + non-streaming) returns proper Anthropic SSE / JSON
    • POST /v1/chat/completions with model: claude-cli/sonnet returns OpenAI Chat Completion shape (translation round-trip)
  • Reviewer to spot-check the model list in provider_models.yaml against the latest Anthropic Claude Code docs before release.

Spherrrical added 11 commits May 4, 2026 12:57
Spawn the local `claude` binary as a subprocess and expose it as an
Anthropic Messages-compatible provider. Hosted in brightstaff
(`CLAUDE_CLI_LISTEN_ADDR`), with session reuse, idle TTL, and watchdog.

User-facing surface is `model_providers: [{ model: claude-cli/* }]` —
the Python CLI auto-fills name/provider_interface/base_url/access_key
and the launcher (native + supervisord) enables the bridge listener
only when at least one claude-cli provider is present.
target_endpoint_for_provider was rewriting the upstream path to
/v1/chat/completions for any provider that wasn't Anthropic/Vercel,
which made Plano POST /v1/chat/completions to the brightstaff bridge.
The bridge only accepts POST /v1/messages, so it returned a plain
"not found" 404 to the client.

Treat ClaudeCli the same as Anthropic for path selection (and force
/v1/messages even when the client framed the request as OpenAI Chat
Completions or Responses, since the bridge always speaks Anthropic
Messages on the wire).
…-await

- Convert ClaudeProcess::last_used from tokio::sync::Mutex<Instant> to
  std::sync::Mutex<Instant>: the critical section is one Copy read/write
  with no .await, so a sync mutex lets SessionManager iterate sessions
  without holding the map lock across an await point. Fixes the
  lock-across-await pattern in lru_session_id and evict_idle.
- Simplify SessionManager::get_or_spawn to a single map-lock acquisition
  on the fast path; only release the lock for the rare case where we
  need to await a victim shutdown before spawning.
- Replace the hand-rolled "deterministic UUID via DefaultHasher" with a
  real UUIDv5 over the OID namespace (uuid feature `v5`). Stable across
  Rust toolchain versions, unlike SipHash, and matches what the doc on
  the helper claimed all along.
- Introduce ProcessError::MissingStdio { which } so spawns where
  Stdio::piped() somehow returned None surface as their own programmer-
  error variant rather than masquerading as ExitedEarly.
- Delete the dead is_zero() helper.
- The synthetic message_start path only fired when the very first
  observed event was a Result. If the CLI ever emitted (say) a bare
  ContentBlockStart first, we'd ship malformed Anthropic SSE without a
  preceding message_start. Trigger the synthesis on any first
  stream-advancing event that isn't a MessageStart.
- Make every send-to-client branch consistent: break out of the loop
  when the receiver has gone away (mpsc send returned Err), so we don't
  keep generating events for a vanished client.
- Replace serde_json::to_string(...).unwrap() in the streaming error
  path with the same fallback json_response already uses ("{}" on
  serialize failure). No more panic surface in the streaming worker.
- Drop the dead `_touch_stream_module` placeholder and its unused
  `use futures::stream` import.
- main.rs: rebuild claude_cli_config_from_env on top of
  SessionManagerConfig::default() and only override fields that have a
  parsed env var, so the defaults live in exactly one place.
- hermesllm/apis/claude_cli.rs: delete the dead
  `_touch_messages_message_type` stub and its unused MessagesMessage
  import; apply pedantic-clippy fixes that touch the new code
  (clone_from over `= x.clone()`, Map::default() over Default::default(),
  map_or_else over .map(...).unwrap_or_else(...), str::to_string method
  reference, collapsed identical match arms).
- hermesllm/providers/id.rs: collapse the two match arms that mapped
  "claude-cli" and "claude_cli" to ProviderId::ClaudeCli.
- hermesllm/tests/claude_cli_fixtures.rs: collect text deltas straight
  into a String instead of `.collect::<Vec<_>>().join("")`.
- brightstaff/tests/claude_cli_bridge.rs: add a Drop impl on
  BridgeFixture so a panicking test still releases the listener task.
`--no-session-persistence` only blocks resumability — Claude Code
still writes `~/.claude/projects/<workspace>/<id>.jsonl` for every
session. Reusing our deterministic brightstaff session id (a v5 UUID
hashed from the conversation prefix) caused the CLI to fail every
second request for the same conversation with
`Error: Session ID ... is already in use`.

Generate a per-spawn random v4 UUID inside `ClaudeProcess::spawn` and
pass that to `claude --session-id` (and stamp it on every stdin
JSONL event so the CLI accepts the turn). Keep the deterministic
brightstaff session id as the `SessionManager` map key so retries
still hit the hot child.
Trivy security-scan flagged uv 0.11.7 (currently fetched by an unpinned
`pip install uv`) because it bundles rustls-webpki 0.103.10. The advisory
(DoS via panic on malformed CRL BIT STRING) is fixed in 0.103.13.
uv 0.11.11 picks up the fixed rustls-webpki, so we pin to that floor.
Drops the bullet-list capability dump, the relative-path "or in this
repo" line, and the verbose dismissal block (which leaked the ack file
path into user-visible output). The panel is now ~6 lines: title with
interface(s), one sentence summary, "Learn more" pointing at
docs.planoai.dev, and a one-line `--ack-local-agents` hint. The full
trust-model write-up and the `rm` instruction live in the docs page.

Also tightens the acknowledged-already and ack-success lines (no path
leak) and switches the parenthetical name list to skip autofilled
`<interface>/...` model strings.
@Spherrrical Spherrrical marked this pull request as ready for review May 11, 2026 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant