Use Fireworks AI models in Claude Code, OpenCode, Codex, Pi, Cursor, and VS Code.
Install in one line:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/fw-ai/fireconnect/main/install.sh)"Or with bash directly:
curl -fsSL https://raw.githubusercontent.com/fw-ai/fireconnect/main/install.sh | bashInstall the fireconnect CLI once, then use it to manage Fireworks routing for Claude Code, OpenCode, Codex, Pi, Cursor, and VS Code. Run fireconnect help to see what it can do.
Run this from a terminal:
curl -fsSL https://raw.githubusercontent.com/fw-ai/fireconnect/main/install.sh | bashFor non-interactive setup:
curl -fsSL https://raw.githubusercontent.com/fw-ai/fireconnect/main/install.sh | FIREWORKS_API_KEY="fw_..." bashFire Pass users can use a fpk_... key directly — FireConnect detects the key type and
uses the correct defaults for Fire Pass (glm-latest for all aliases).
If you prefer installing from an SSH checkout:
mkdir -p ~/.fireconnect && git clone git@github.com:fw-ai/fireconnect.git ~/.fireconnect && bash ~/.fireconnect/install.shThe installer:
- Uses Node.js to update Claude Code settings. If Node.js is missing, asks before installing it with Homebrew or apt. It does not install or update npm packages.
- Points you to the Fireworks API key page and prompts once for your Fireworks API key.
- Applies the default model mapping and writes Claude Code settings.
- Installs the
fireconnectCLI launcher to~/.local/binand adds it to your shellPATH.
Then fully restart Claude Code and test with:
hi
Default models:
main -> glm-latest
opus -> glm-latest
sonnet -> glm-5p1
haiku -> deepseek-v4-flash
subagent -> deepseek-v4-flash
Create a Fireworks API key here:
https://app.fireworks.ai/settings/users/api-keys
Then enable Fireworks routing from a terminal:
fireconnect claude on --api-key fw_... # also saves key to ~/.fireconnect/config.jsonRestart Claude Code after this completes.
The setup writes these Claude Code settings:
{
"env": {
"ANTHROPIC_BASE_URL": "https://api.fireworks.ai/inference",
"ANTHROPIC_API_KEY": "fw_YOUR_FIREWORKS_API_KEY",
"ANTHROPIC_AUTH_TOKEN": "fw_YOUR_FIREWORKS_API_KEY",
"ANTHROPIC_MODEL": "accounts/fireworks/routers/glm-latest[1m]",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "accounts/fireworks/routers/glm-latest[1m]",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "accounts/fireworks/models/glm-5p1",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "accounts/fireworks/models/deepseek-v4-flash",
"CLAUDE_CODE_SUBAGENT_MODEL": "accounts/fireworks/models/deepseek-v4-flash"
}
}The setup writes both ANTHROPIC_API_KEY (preferred) and ANTHROPIC_AUTH_TOKEN (compatibility alias) with the same Fireworks key. It saves a backup of your previous provider settings so fireconnect claude off can restore them.
Short model IDs are accepted everywhere. For example, glm-latest is written to Claude Code settings as accounts/fireworks/routers/glm-latest[1m].
Cursor stores its AI settings in a SQLite database (state.vscdb), not a JSON file, so the Cursor harness writes there directly:
- API key ->
cursorAuth/openAIKey - Base URL ->
openAIBaseUrl(set tohttps://api.fireworks.ai/inference/v1, Cursor's OpenAI-compatible endpoint) - Custom models ->
aiSettings.userAddedModels+aiSettings.modelOverrideEnabled - Per-mode model ->
aiSettings.modelConfig[mode](e.g.composer,cmd-k)
cursor on sets every mode that already exists in modelConfig to the default Fireworks model (non-destructive — it won't create mode entries that aren't already there). Use cursor model select --mode <mode> to point an individual mode at a different model. status lists every supported mode (the valid values for --mode, with the default marked) and a human-readable model name per mode (e.g. GLM 5.2, GLM Latest); status --json returns raw ids plus defaultMode.
fireconnect cursor on --api-key fw_... # quit Cursor first; sets all existing modes
fireconnect cursor status # read-only; works while Cursor is open
fireconnect cursor model list --search glm
fireconnect cursor model select --mode composer
fireconnect cursor off # restores your previous settingsQuit Cursor (Cmd-Q / File > Quit) before on, off, model select, model add, or model reset — otherwise Cursor's in-memory state overwrites the write on next flush. In an interactive terminal, if Cursor is still running fireconnect asks you to quit it and press Enter to continue; if Cursor is still running after that it errors out (it does not close or reopen Cursor for you). Non-interactive use (piped/CI) still requires Cursor to be quit ahead of time. status and model list are read-only and work any time. Pass --force to write anyway without waiting. off only removes models FireConnect registered (tracked under aiSettings.fireconnectAddedModels); your own custom models are preserved.
VS Code Chat's custom language models are configured in chatLanguageModels.json (a JSON array of providers). fireconnect adds a Fireworks provider (vendor customendpoint, apiType: chat-completions) whose models point at https://api.fireworks.ai/inference (VS Code appends /v1/chat/completions).
The API key is not stored in the JSON — VS Code stores it in the OS keychain (macOS Keychain / Windows Credential Manager / libsecret on Linux) and the JSON holds a ${input:chat.lm.secret.<id>} reference. fireconnect vscode on writes both: the provider entry to chatLanguageModels.json and the key to the keychain under VS Code's product.nameLong service (e.g. Visual Studio Code).
fireconnect vscode on --api-key fw_... # quit VS Code first
fireconnect vscode status # read-only; works while VS Code is open
fireconnect vscode model list --search glm
fireconnect vscode model add deepseek-v4-flash
fireconnect vscode model select
fireconnect vscode off # restores chatLanguageModels.json + removes the keyNo restart needed — VS Code watches chatLanguageModels.json and hot-reloads provider changes (including the keychain-resolved API key), so on/off/model add/model select/model reset take effect immediately in the Chat picker. Quit VS Code only to avoid a concurrent-edit clobber (if you edit models in VS Code's UI at the same moment fireconnect writes). status and model list are read-only and work any time.
Per-model toolCalling/vision/maxInputTokens/maxOutputTokens are defined alongside serverless pricing in packages/setup-cli/lib/fireworks-model-specs.mjs (sourced from the Fireworks model library and API). Unmapped models default to toolCalling: true and vision: false; token limits are omitted until the model is added to the specs registry. VS Code sends maxOutputTokens as max_output_tokens, so mapped values must not exceed the model limit.
On macOS the first time VS Code reads the fireconnect-written keychain entry it may show a Keychain access prompt — click Always Allow once. On Linux, libsecret (secret-tool) must be installed. --keychain-service overrides the service name for custom installs / Insiders. off restores your original chatLanguageModels.json and deletes the chat.lm.secret.fw-* keychain entry; any providers you configured manually are preserved.
After fireconnect claude on, FireConnect prints hints for browsing the Fireworks catalog and
picking a model interactively.
fireconnect claude model list # browse callable serverless endpoints
fireconnect claude model select # pick a model for Claude Code
fireconnect claude model select --slot sonnet # update one Claude Code alias
fireconnect opencode model select # pick OpenCode's default model
fireconnect codex model select # pick Codex's default model
fireconnect cursor model select --mode composer # pick Cursor's composer modelHarness-scoped: lists the Fireworks serverless catalog using the API key resolved from that
harness. Fetches serverless models from the Fireworks API (supports_serverless=true) and merges
the known public platform routers (glm-latest, glm-fast-latest, glm-5p2-fast, kimi-fast-latest, kimi-latest, kimi-k2p6-turbo, and kimi-k2p7-code-fast). Every row is
tagged serverless (on-demand endpoints will be added later).
fireconnect claude model list
fireconnect claude model list --search glm
fireconnect opencode model list --jsonResolves the key in documented order: --api-key, then the harness's stored key, then
~/.fireconnect/config.json, then FIREWORKS_API_KEY. Non-Fireworks-shaped keys (for example
Anthropic sk-ant-...) are skipped when resolving harness-local keys.
Fire Pass keys (fpk_...) show Fire Pass-supported routers: glm-latest, glm-fast-latest, glm-5p2-fast, kimi-fast-latest, and kimi-k2p7-code-fast.
Interactive picker. Requires a terminal and Fireworks to be enabled for that harness.
Claude Code — pick one of five aliases (main, opus, sonnet, haiku, subagent):
fireconnect claude model select
fireconnect claude model select --slot sonnet
fireconnect claude model select --slot sonnet --search glmOpenCode — single default model (no --slot):
fireconnect opencode model select
fireconnect opencode model select --search glmCursor — pick a Cursor mode (composer, cmd-k, background-composer, …); defaults to composer (no --slot):
fireconnect cursor model select
fireconnect cursor model select --mode composer --search glmVS Code — pick a model to add to the Fireworks provider (no --slot/--mode):
fireconnect vscode model select
fireconnect vscode model select --search glm| Command | Shows |
|---|---|
fireconnect claude status |
Your current provider, auth, configured alias mapping, and Fireworks serverless rates per slot |
fireconnect claude model list |
Available serverless endpoints from the Fireworks API, with IN / OUT pricing where known |
Claude Code’s /model picker and session cost estimates use Anthropic list prices (for
example $5 / $25 per Mtok on the default Opus-style mapping for glm-latest). Fireworks
bills at serverless model rates (for example about $1.40 / $4.40 per Mtok for GLM 5.2 on
standard serverless) — the UI estimate can be much higher than your real bill.
FireConnect cannot override Claude Code’s price column. Use fireconnect claude status and
fireconnect claude model list for Fireworks rates, and check the
Fireworks billing dashboard for actual spend.
Full explanation: docs/claude-code-pricing.md.
After fireconnect claude on, model select, or model reset, settings.json is updated
immediately. To use the new model in Claude Code, run /model to activate it in the same
session, start a new session, or /exit and resume the conversation with claude --resume <id>.
Short IDs are accepted everywhere and are normalized to their full paths automatically.
| Short ID | Best for | Notes |
|---|---|---|
glm-latest |
All-around use, agentic tasks | Default for main and opus slots. Version-tracking router; strong reasoning, 1M context. |
glm-fast-latest |
Latency-sensitive agentic use | Version-tracking router on the high-speed Fast serving path (100+ tok/s), at a higher per-token price. 1M context. |
glm-5p2-fast |
Latency-sensitive agentic use | Same as glm-fast-latest but pinned to GLM 5.2 rather than version-tracking. 1M context. |
glm-5p1 |
General use (lighter) | Default sonnet slot. Good balance of speed and quality. |
deepseek-v4-flash |
Background / fast tasks | Default haiku and subagent slots. Lowest latency. |
Fire Pass keys (fpk_...): all slots default to glm-latest.
Switching a single slot (Claude Code only):
fireconnect claude model select --slot opus # pick a model interactively
fireconnect claude model select --slot sonnet # pick general model interactively
fireconnect claude on --sonnet glm-5p1 --haiku deepseek-v4-flash # set non-interactivelyOpenCode and Pi use a single default model; use fireconnect <harness> model select or pass --main <slug> to on.
The CLI is harness-first: fireconnect <harness> <command>. A handful of commands are
global (no harness). Commands below are listed in the same order as fireconnect help.
Global
fireconnect configure Register harnesses and API key preferences.
fireconnect uninstall Disable + restore all harnesses, then remove FireConnect.
fireconnect help Show help.
fireconnect configure stores a provided API key in ~/.fireconnect/config.json.
<harness> on reads that global key (or FIREWORKS_API_KEY) and writes harness-local auth.
Per harness (claude, opencode, codex, pi)
fireconnect <harness> on Route the harness through Fireworks (default if no command).
fireconnect <harness> off Restore your previous provider/config.
fireconnect <harness> status Show the provider, auth, and model mapping.
fireconnect <harness> model list Browse serverless Fireworks models.
fireconnect <harness> model select Interactive model picker.
fireconnect <harness> model reset Reset models to defaults.
fireconnect <harness> help Show help for that harness.
Run fireconnect help for the overview, or fireconnect claude help / fireconnect opencode help /
fireconnect codex help / fireconnect pi help for everything available at the harness level.
FireConnect routes OpenAI Codex CLI through Fireworks via the Responses API:
export FIREWORKS_API_KEY=fw_...
fireconnect codex on # route Codex through Fireworks (~/.codex/config.toml)
fireconnect codex on --api-key fw_... # pass key once; later model commands reuse it
fireconnect codex status # check current provider and model
fireconnect codex on --main glm-5p1 # switch model (non-interactive)
fireconnect codex model select # switch model (interactive)
fireconnect codex off # restore your original configWhat it does:
- Sets root
model_provider/modelfor Codex 0.134+ and adds a[model_providers.fireworks-ai]block withwire_api = "responses". With--api-keyor a key from~/.fireconnect/config.json, FireConnect writesexperimental_bearer_tokenso latermodel list/select/resetwork without passing the key again. With onlyFIREWORKS_API_KEYin the environment, it writesenv_key = "FIREWORKS_API_KEY"instead. - Snapshots your original
~/.codex/config.tomlbefore the first change.fireconnect codex offrestores it byte-for-byte. The snapshot lives in~/.fireconnect/codex/. - Preserves unrelated Codex settings (for example
[[mcp_servers]]) via surgical TOML edits.
After fireconnect codex on, off, model select, or model reset, config.toml is updated
immediately. To use the updated routing in Codex, run /model to activate it in the same
session, start a new session, or /exit and resume the conversation with codex resume <id>.
FireConnect routes OpenCode through Fireworks with harness-first commands:
export FIREWORKS_API_KEY=fw_...
fireconnect opencode on # route OpenCode through Fireworks
fireconnect opencode status # check current provider
fireconnect opencode on --main glm-5p1 # switch model (non-interactive)
fireconnect opencode model select # switch model (interactive)
fireconnect opencode off # restore your original configWhat it does:
- Merges a
provider.fireworksblock into~/.config/opencode/opencode.json(the OpenAI-compatible adapter pointed athttps://api.fireworks.ai/inference/v1) and sets the defaultmodel. Your other providers are left untouched. - Snapshots your original
opencode.jsonbefore the first change.fireconnect opencode offrestores it byte-for-byte (formatting, key order, everything). The snapshot lives in~/.fireconnect/opencode/. - Keeps secrets out of the config file: when the key comes from the
FIREWORKS_API_KEYenvironment variable, it is written as the{env:FIREWORKS_API_KEY}reference. Passing--api-keyexplicitly writes the literal key instead. OpenCode'sauth.jsonis never touched.
Use --config-path <path> to target a non-default config file (also handy for testing
without touching your real config). Run fireconnect help for the full CLI reference.
OpenCode also supports routing through Fireworks models on Microsoft Foundry (Azure) — see Azure (Microsoft Foundry) endpoints.
FireConnect routes Pi through Fireworks with harness-first commands:
export FIREWORKS_API_KEY=fw_...
fireconnect pi on # route Pi through Fireworks
fireconnect pi status # check current provider
fireconnect pi on --main glm-5p1 # switch model (non-interactive)
fireconnect pi model select # switch model (interactive)
fireconnect pi off # restore your original settings and authWhat it does:
- Sets
defaultProvider/defaultModelin~/.pi/agent/settings.jsonand stores the API key in~/.pi/agent/auth.json($FIREWORKS_API_KEYwhen from env, literal with--api-key).onapplies the default model (glm-latest) unless you pass--main. - Snapshots both files under
~/.fireconnect/pi/before the first change.fireconnect pi offrestores them byte-for-byte.auth.jsonis written at mode0600. - Restart Pi after
on,model select,model reset, oroffwhen Pi is already running.
Use --settings-path <path> to target a non-default settings file.
Pi also supports routing through Fireworks models on Microsoft Foundry (Azure) — see Azure (Microsoft Foundry) endpoints.
Fireworks AI models are also available as first-party models inside Microsoft Foundry (formerly Azure AI Foundry), where usage is billed through Azure and counts toward your MACC. Foundry exposes an OpenAI-compatible endpoint, so the OpenAI-compatible harnesses — OpenCode, Codex, and Pi — can route through your Foundry resource instead of the Fireworks gateway.
Configure the endpoint once, then <harness> on leverages it — no per-command flags:
fireconnect configure --harnesses opencode,codex,pi --provider azure \
--base-url https://<resource>.services.ai.azure.com \
--api-key <azure-api-key>
fireconnect opencode on # routes through the configured Foundry endpoint
fireconnect codex on
fireconnect pi onconfigure stores a top-level provider and azure endpoint in
~/.fireconnect/config.json; this design extends to future providers without touching the
harnesses. To switch back, run fireconnect configure --provider fireworks ....
You can also opt in per-command (or override the configured endpoint) with --azure:
fireconnect opencode on --azure --base-url https://<resource>.services.ai.azure.com \
--api-key <azure-api-key> --main FW-GLM-5.1Common behavior across harnesses:
- Endpoint. Pass your Foundry endpoint to
--base-url. FireConnect normalizes whatever you paste — the bare resource root, the portal project endpoint (.../api/projects/<name>), or the/modelsroute — to the correct resource-root basehttps://<resource>.services.ai.azure.com/openai/v1. Find the endpoint in the Microsoft Foundry portal under Project settings. - Auth. Authenticate with your Azure API key (not a
fw_/fpk_key). Pass--api-keyto write it literally, or exportAZURE_API_KEYto have it written as an environment reference instead. - Model. The model id is your Foundry deployment name — the catalog model name
without the
fireworks-ai/publisher prefix (e.g.FW-GLM-5.1,FW-MiniMax-M2.5). Defaults toFW-GLM-5.1; pass--main <foundry-deployment-name>to select another. - Provider isolation + restore. Each harness writes a dedicated
fireworks-azureprovider distinct from the Fireworks gateway, andoffrestores your original config byte-for-byte. Switching between Fireworks and Azure modes replaces the managed provider cleanly.
Per-harness specifics:
| Harness | Writes | Provider |
|---|---|---|
| OpenCode | provider.fireworks-azure in opencode.json (@ai-sdk/openai-compatible, options.baseURL + options.apiKey) |
fireworks-azure/<deployment> |
| Codex | [model_providers.fireworks-azure] in config.toml (wire_api = "chat", bearer or env_key = "AZURE_API_KEY") |
fireworks-azure |
| Pi | custom openai-completions provider in models.json (baseUrl, authHeader, apiKey literal or $AZURE_API_KEY) + defaultProvider in settings.json |
fireworks-azure |
fireconnect <harness> status reports azure as the provider along with the endpoint and
model.
Claude Code is intentionally excluded: its harness speaks the Anthropic Messages API, which Foundry does not expose.
model list/model selectread the Fireworks catalog and are not used in Azure mode — select a Foundry deployment with--main.