OpenWar exposes its runtime as an OpenAI Chat Completions HTTP server. Any tool that speaks OpenAI's API (Aider, Continue, Cline, Cursor's CLI mode, the OpenAI SDKs themselves, the many homegrown OpenAI-API wrappers) can point at the server and use OpenWar's discipline layer with zero changes on its end. The tool thinks it is talking to OpenAI; OpenWar applies its phase machine, trace, and detector pipeline underneath, then routes the actual completion to whatever upstream adapter is configured (Anthropic, OpenAI, Gemini, Grok, openai-compat local model, or a cli-bridge spawn).
This is the MVP cut. It ships:
- `POST /v1/chat/completions`: both streaming (SSE) and non-streaming JSON.
- `GET /v1/models`: declares one model entry for the configured upstream.
- `GET /healthz`: liveness probe; no auth required.
- 404 fallback in OpenAI error shape on every other path.
- Bearer-token auth with constant-time compare, plus `--no-auth` for local dev.
- Localhost-default bind (`127.0.0.1`); binding to `0.0.0.0` requires explicit intent and warns.
- Per-request concurrency cap (`--max-concurrent`, default 4) returning OpenAI `rate_limit_error` 429 on excess.
- Per-request synthesized brief at `~/.openwar/sessions/proxy-<uuid>.trace.ndjson` so the operator can audit what the foreign client did via `openwar inspect proxy-<uuid>`.
- `X-OpenWar-Trace-Id` response header on every response for trace correlation.
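The concurrency cap can be pictured with a short sketch. This is an illustration in Python, not OpenWar's implementation: the helper names (`openai_error`, `ConcurrencyCap`, `handle`) and the message text are hypothetical, and it assumes excess requests are rejected immediately rather than queued. Only the 429 status, the `rate_limit_error` type, and the `openwar_max_concurrent` code come from this document.

```python
import threading


def openai_error(message, err_type, code=None):
    """Build a body in the OpenAI error envelope: {"error": {...}}."""
    return {"error": {"message": message, "type": err_type, "param": None, "code": code}}


class ConcurrencyCap:
    """Hypothetical cap: a fixed number of slots, claimed without blocking."""

    def __init__(self, max_concurrent=4):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def try_enter(self):
        # False means every slot is taken; the caller should answer 429.
        return self._slots.acquire(blocking=False)

    def leave(self):
        self._slots.release()


def handle(cap, work):
    """Return (status, body). Assumption: excess requests are rejected, not queued."""
    if not cap.try_enter():
        return 429, openai_error(
            "too many concurrent requests",  # message text is made up
            "rate_limit_error",
            "openwar_max_concurrent",
        )
    try:
        return 200, work()
    finally:
        cap.leave()
```

An OpenAI client sees the 429 as an ordinary rate-limit error, so whatever retry policy it already has applies unchanged.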
It does NOT yet ship:
- Tool-call translation (request `tools` array, response `tool_calls`). v0.13.0 acknowledges the `tools` field at parse time and records the count in `proxy_request`, but does not yet round-trip tool calls. Plain-text Aider / Continue / Cline sessions work end-to-end; agentic tool-use does not.
- PermissionBridge negotiation via `openwar:request_permission` tool_calls. The encoding helpers exist; the routing lands when tools light up.
- cli-bridge composition is structurally supported (cli-bridge is a normal upstream adapter) but agentic capability is gated on the tool surface above.
Both deferred items land in v0.13.1.
The proxy is designed to be safe by default on a developer laptop:
- Localhost-only bind. `--bind 0.0.0.0` requires explicit intent and emits a startup warning. The operator running with the default cannot accidentally expose the proxy on a network.
- Bearer-token auth required. Without `--auth-token` AND without `--no-auth`, the server refuses to start. `--no-auth` works but warns every startup. The constant-time compare in `src/serve/auth.ts` means token-guessing attacks cannot be timed against the localhost socket.
- Conservative `authorized_costs` default. Synthesized briefs get `filesystem_read` only by default. Operators expand explicitly via `--authorized-costs filesystem_read,filesystem_write,shell_exec` (or narrower) per their trust model. The proxy does NOT silently grant write or shell privileges to foreign clients.
- Stateless across requests. Each proxied request is its own `Session`. No cross-request memory unless the client sends prior messages in its next request (standard OpenAI conversation pattern). Persistent permission grants from `~/.openwar/projects/<slug>/permission_grants.jsonl` are NOT loaded for proxy sessions; proxy sessions are project-less by design.
- No TLS in the proxy itself. Operators wanting HTTPS run a reverse proxy (nginx, Caddy, Cloudflare Tunnel) in front. The localhost-default keeps the default safe.
- Trace audit trail. Every request emits `proxy_request` at start, `proxy_response` at end, plus any tool_call / detector / phase events that fire. Auditable via `openwar inspect proxy-<uuid>`.
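To make the constant-time compare concrete, here is a minimal Python sketch of the idea. It is not the code in `src/serve/auth.ts` (which is TypeScript and not shown here); the function name `bearer_token_ok` is hypothetical, and the header parsing rules are an assumption.

```python
import hmac


def bearer_token_ok(authorization_header, expected_token):
    """Check an `Authorization: Bearer <token>` header value.

    Hypothetical helper. The point is the final comparison:
    hmac.compare_digest is designed not to leak, through timing,
    where two equal-length inputs first differ.
    """
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(
        presented.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

A plain `==` on strings can return as soon as it finds the first mismatching character, which is the timing signal a constant-time compare is meant to remove.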
```
openwar serve --openai-compat \
  --auth-token "$(openssl rand -hex 16)" \
  --authorized-costs filesystem_read,filesystem_write,shell_exec
```

The server prints a startup banner with the curl command you can paste to test, plus the `authorized_costs` expansion hint reminding you that agentic clients typically need more than the conservative default.
Once running, any OpenAI-compat tool points at it:
```
export OPENAI_BASE_URL=http://127.0.0.1:1234/v1
export OPENAI_API_KEY=<the-token-from-above>
aider --model openwar
```

The `--model openwar` part is arbitrary; the proxy passes whatever model name the client sends through to the upstream adapter (substituting `--upstream-model` if the upstream needs a different name; substitution is recorded on the `proxy_request` trace event via `model_substituted_from`).
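For a homegrown wrapper rather than an off-the-shelf tool, the same two environment variables are enough. The sketch below uses only the Python standard library and builds the request without sending it; `build_chat_request` is a hypothetical helper, and the fallback base URL simply mirrors the default port and bind documented here.

```python
import json
import os
import urllib.request


def build_chat_request(prompt, model="openwar"):
    """Build (but do not send) a Chat Completions request aimed at the proxy."""
    base_url = os.environ.get("OPENAI_BASE_URL", "http://127.0.0.1:1234/v1")
    token = os.environ.get("OPENAI_API_KEY", "")
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending requires a running proxy, so it is left commented out:
#   with urllib.request.urlopen(build_chat_request("hi")) as resp:
#       print(resp.headers.get("X-OpenWar-Trace-Id"))
#       print(json.load(resp)["choices"][0]["message"]["content"])
```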
| Flag | Default | Notes |
|---|---|---|
| `--openai-compat` | (required) | v0.13.0 ships this one serve mode; the flag exists so future modes (raw MCP, native OpenWar API) can slot in alongside. |
| `--port <n>` | `1234` | LM Studio convention; friendly to existing OpenAI-compat client habits. |
| `--bind <host>` | `127.0.0.1` | `0.0.0.0` warns at startup. |
| `--upstream-adapter <id>` | auto-detect | `anthropic` / `openai` / `gemini` / `grok` / `openai-compat` / `cli-bridge`. Auto-detection precedence matches `openwar chat`: `ANTHROPIC_API_KEY` > `OPENAI_API_KEY` > `GEMINI_API_KEY` (or `GOOGLE_API_KEY`) > `XAI_API_KEY` > `OPENAI_COMPAT_API_KEY`. |
| `--upstream-model <name>` | adapter default | When the client's requested model differs, the proxy substitutes this and records the substitution in `proxy_request.model_substituted_from`. |
| `--auth-token <token>` | (required unless `--no-auth`) | Constant-time compared against the `Authorization: Bearer <token>` header. Server refuses to start without this OR `--no-auth`. |
| `--no-auth` | off | Opt-out for local development. Warns every startup. |
| `--workdir <path>` | `process.cwd()` | Synthesized-brief sandbox root. |
| `--authorized-costs <list>` | `filesystem_read` | Comma-separated. Operators expand explicitly. |
| `--max-concurrent <n>` | `4` | Excess returns OpenAI `rate_limit_error` 429 with code `openwar_max_concurrent`. |
| `--log-requests` | off | One line per request to stderr. |
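The auto-detection precedence for `--upstream-adapter` reads naturally as an ordered lookup. The following is a sketch of that reading, not the actual detection code: `detect_adapter` is a hypothetical name, and pairing `XAI_API_KEY` with the `grok` adapter (and each other key with its adapter id) is an inference from the names, not something the flag table states outright.

```python
# (environment variable names, adapter id), highest precedence first.
# The key-to-adapter pairing is inferred from the names in the flag table.
PRECEDENCE = [
    (("ANTHROPIC_API_KEY",), "anthropic"),
    (("OPENAI_API_KEY",), "openai"),
    (("GEMINI_API_KEY", "GOOGLE_API_KEY"), "gemini"),
    (("XAI_API_KEY",), "grok"),
    (("OPENAI_COMPAT_API_KEY",), "openai-compat"),
]


def detect_adapter(env, explicit=None):
    """Return the adapter id to use, or None if nothing is configured.

    An explicit --upstream-adapter value wins; otherwise the first key
    present (and non-empty) in precedence order decides.
    """
    if explicit:
        return explicit
    for names, adapter in PRECEDENCE:
        if any(env.get(name) for name in names):
            return adapter
    return None
```

Note that `cli-bridge` never appears in the precedence chain, so under this reading it is only reachable by passing the flag explicitly.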
`--upstream-adapter cli-bridge` is structurally supported and emits a startup warning:

```
WARNING: cli-bridge as upstream spawns one CLI child per request.
Each request adds 2-5s of cold-start latency. Concurrent
requests scale memory by ~400MB per Claude Code instance.
Consider --max-concurrent 1 for cli-bridge upstream.
```
This composition is the most powerful configuration the proxy supports: a foreign OpenAI-API tool routes through OpenWar, which dispatches to Claude Code (or Codex / Gemini CLI), which executes with the structured-event capture from v0.12.1 all the way through. The trace at `~/.openwar/sessions/proxy-<uuid>.trace.ndjson` captures every `bridged_tool_call` / `bridged_tool_result` / `bridged_thinking_delta` / `bridged_usage` event from the inner CLI's run alongside the proxy bookkeeping events. The composition is real; the operator-experience caveat is real too: keep `--max-concurrent` low and expect cold-start latency on burst traffic.
Every proxied request produces a fresh trace file at `~/.openwar/sessions/proxy-<uuid>.trace.ndjson`. Two new event types extend the existing trace schema:

- `proxy_request`: emitted at session start. Fields: `request_id`, `client_addr`, `model`, `stream` (boolean), `tool_count`, `at`, optional `model_substituted_from`.
- `proxy_response`: emitted at session end. Fields: `request_id`, `status_code`, `duration_ms`, `bytes_written`, `cancelled`, `at`.

`TRACE_SCHEMA_VERSION` bumps from 4 to 5 for these additive variants. Old readers ignore unknown types.
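Because the trace is newline-delimited JSON, a few lines of Python are enough to pull the two bookkeeping events out of a file. This is a sketch under one loud assumption: that each event carries its kind in a field named `type`. The field names inside the events are the ones listed above; `summarize_trace` and the sample values are hypothetical.

```python
import json


def summarize_trace(lines):
    """Pair the proxy_request / proxy_response events from one trace.

    Assumption: the event kind lives in a "type" field. Other event
    kinds are counted and otherwise ignored, the way an old reader
    would skip types it does not know.
    """
    request = response = None
    other = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        kind = event.get("type")
        if kind == "proxy_request":
            request = event
        elif kind == "proxy_response":
            response = event
        else:
            other += 1
    return {
        "request_id": request.get("request_id") if request else None,
        "model": request.get("model") if request else None,
        "substituted_from": request.get("model_substituted_from") if request else None,
        "status_code": response.get("status_code") if response else None,
        "duration_ms": response.get("duration_ms") if response else None,
        "cancelled": response.get("cancelled") if response else None,
        "other_events": other,
    }
```

In practice the `lines` argument would be an open file handle on the `.trace.ndjson` file; a trace with no `proxy_response` yet (a request still in flight, or a crash) simply yields `None` for the response fields.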
Inspect a completed request:
```
openwar inspect proxy-<uuid>            # session summary
openwar inspect proxy-<uuid> --trace    # raw event stream
```

The `X-OpenWar-Trace-Id` response header on every proxy response carries the request id, so tooling that wants to audit a request can grab the id without parsing logs.
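Tooling that captures raw response headers (for example from `curl -D -`) can extract the id with a small parser. A minimal sketch follows; `trace_id_from_headers` is a hypothetical helper, and whether the header value already includes the `proxy-` prefix expected by `openwar inspect` is not stated here, so the value is returned untouched.

```python
def trace_id_from_headers(raw_headers):
    """Return the X-OpenWar-Trace-Id value from a raw HTTP header dump.

    Header names are matched case-insensitively. Parsing stops at the
    first blank line so a response body appended after the headers is
    never scanned.
    """
    for line in raw_headers.splitlines():
        if not line.strip():
            break  # end of the header section
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "x-openwar-trace-id":
            return value.strip()
    return None
```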
The brief deferred or never-scoped the following. Some land in v0.13.1, some are reserved indefinitely:
- Legacy `/v1/completions` endpoint. Chat Completions only.
- Embeddings endpoint. Not an agentic surface.
- OpenAI Assistants API. Different protocol, different state model.
- Legacy `functions` field. Modern `tools` field only (when it lands in v0.13.1).
- WebSocket / realtime API. Out of scope.
- Multi-tenant auth. Single bearer token, single operator. Multi-tenant is a War Room concern.
- TLS / HTTPS. Use a reverse proxy.
- Persistent permission grants in proxy sessions. Proxy sessions are project-less; ledger is in-memory per request.
- Rate limiting beyond `--max-concurrent`. No per-token / per-IP / per-hour rate limiting. Operator-side concern.
- Vision, custom `tool_choice` variants beyond `auto` / `none` / `required`, `parallel_tool_calls` beyond default. Extensions land in patch releases if specific named clients need them.
- Legacy `prompt` field for ancient OpenAI clients. Document the requirement: clients must use modern `messages`.
- No `tools` round-trip. The request's `tools` array is recognized at parse time and surfaces in `proxy_request.tool_count`, but the upstream adapter is called WITHOUT the tool definitions. Plain-text completion only in v0.13.0.
- No `usage` reporting. v0.13.0 does not run a tokenizer or thread upstream usage data into the response. The `usage` field is omitted; OpenAI clients tolerate the absence.
- No conversation history beyond what the client sends in `messages`. Each request is a fresh `Session`.
- `finish_reason` in v0.13.0 is always `stop` for successful completions. `content_filter` / `tool_calls` / `length` are reserved for v0.13.1 + later.
These items are not bugs; they are scope. v0.13.1 closes the tool surface; v0.13.x patches address specific client compatibility issues as they surface.
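For anyone writing a client against v0.13.0, the practical consequence is to read responses defensively. The sketch below shows one way to do that in Python, assuming the standard Chat Completions response shape; `read_completion` is a hypothetical helper, not part of OpenWar or the OpenAI SDKs.

```python
def read_completion(body):
    """Extract the useful parts of a non-streaming Chat Completions response.

    Written to tolerate the v0.13.0 limitations: `usage` may be absent,
    and `finish_reason` is expected to be "stop" on success.
    """
    choice = body["choices"][0]
    usage = body.get("usage") or {}  # omitted entirely in v0.13.0
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),  # None when usage is missing
    }
```

Code that budgets on token counts should treat a `None` here as "unknown" rather than zero.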
```
# 1. Start the proxy with an Anthropic API key configured upstream:
export ANTHROPIC_API_KEY="sk-ant-..."
openwar serve --openai-compat \
  --auth-token "my-local-token" \
  --authorized-costs filesystem_read,filesystem_write \
  --upstream-model claude-opus-4-7

# 2. In another shell, point Aider at the proxy:
export OPENAI_BASE_URL=http://127.0.0.1:1234/v1
export OPENAI_API_KEY=my-local-token
aider --model openwar src/foo.py

# 3. Aider sends Chat Completions requests; OpenWar routes them to
#    Anthropic (or whatever upstream is configured), translates the
#    response back to OpenAI shape, and records the whole thing in
#    a trace file. Inspect:
openwar inspect $(curl -s -X POST http://127.0.0.1:1234/v1/chat/completions \
  -H "Authorization: Bearer my-local-token" \
  -H "Content-Type: application/json" \
  -d '{"model":"openwar","messages":[{"role":"user","content":"hi"}]}' \
  -D - | grep -i x-openwar-trace-id | cut -d: -f2 | tr -d ' \r\n')
```

(v0.13.0 caveat: Aider's tool-use features will not work end-to-end yet. Plain-text chat works.)