Let your coding agent pick the right model — automatically.
One endpoint for every coding agent. The right model for every task.
ModelPick is a local LLM routing gateway. Point Claude Code, Codex CLI, Gemini CLI, Cursor, Cline, and OpenCode at a single endpoint — ModelPick decides which provider and model serves each request. It can pick the cheapest model that's still up to the task, or you can write your own rules to favor quality over cost (or the other way around).
Local-first · Multi-harness · Task-aware routing · Configurable rules · Explainable & replayable
🚧 Under active design & development. This is the RFC stage — the install command and config below are a preview. Feedback welcome.
# One-line install
curl -fsSL https://get.modelpick.dev | sh
# Point your coding agents at it
export ANTHROPIC_BASE_URL=http://localhost:8788
export OPENAI_BASE_URL=http://localhost:8788

From this moment, your coding agent is no longer locked to a single model.
One local endpoint → smart pick by task · context · budget · tool-call · cache · failover · eval.
For every request, the router weighs:
- Task type — coding / planning / classification / summarization / tool-call / vision
- Context length — anything over 50K tokens skips short-context models
- Tool-call compatibility — knows which models reliably do function calling and structured output
- Budget and rate limits — auto-downgrades before per-user / per-project / per-agent quota runs out
- Prompt cache hits — prefers reuse of the same provider's cache prefix
- Failover state — switches in seconds when an upstream wobbles
- Your policy + historical eval feedback — which models actually work on your real tasks
Every decision is logged, explainable, and replayable. You can ask: "Why did the last code review use Sonnet instead of Opus?" — and ModelPick answers.
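The weighing described above can be sketched roughly as a filter-then-pick step that records a reason for every model it rejects. This is only an illustration, assuming a cheapest-eligible policy; the `Model`, `Decision`, and `pick` names, the model entries, and the prices are all hypothetical and not part of ModelPick's actual design.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    name: str
    max_context: int       # largest prompt the model accepts, in tokens
    reliable_tools: bool   # whether function calling is considered dependable
    usd_per_mtok: float    # illustrative input price per million tokens


@dataclass
class Decision:
    chosen: str
    reasons: list = field(default_factory=list)


def pick(models, context_tokens, needs_tools, budget_left_usd):
    """Filter out unsuitable models, then take the cheapest survivor."""
    reasons, eligible = [], []
    for m in models:
        est_cost = context_tokens / 1_000_000 * m.usd_per_mtok
        if context_tokens > m.max_context:
            reasons.append(f"{m.name}: skipped, {context_tokens} tokens exceed its {m.max_context}-token window")
        elif needs_tools and not m.reliable_tools:
            reasons.append(f"{m.name}: skipped, tool calling not reliable")
        elif est_cost > budget_left_usd:
            reasons.append(f"{m.name}: skipped, estimated ${est_cost:.2f} exceeds remaining budget")
        else:
            eligible.append(m)
    if not eligible:
        raise LookupError("no model satisfies context, tool, and budget constraints")
    best = min(eligible, key=lambda m: m.usd_per_mtok)
    reasons.append(f"{best.name}: chosen, cheapest of {len(eligible)} eligible model(s)")
    return Decision(best.name, reasons)


# Hypothetical catalogue; names and prices are placeholders.
MODELS = [
    Model("small-local", 8_000, False, 0.0),
    Model("mid-cloud", 128_000, True, 1.0),
    Model("frontier-cloud", 200_000, True, 15.0),
]

decision = pick(MODELS, context_tokens=60_000, needs_tools=True, budget_left_usd=0.50)
print(decision.chosen)
for line in decision.reasons:
    print(" -", line)
```

Keeping the rejection reasons next to the chosen model is what would let a question like "why this model?" be answered from the log alone, without re-running the router.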
# ~/.config/modelpick/config.yaml
providers:
  - name: anthropic
    type: anthropic
    api_key: $ANTHROPIC_API_KEY
  - name: openai
    type: openai
    api_key: $OPENAI_API_KEY
  - name: deepseek
    type: openai_compatible
    base_url: https://api.deepseek.com/v1
    api_key: $DEEPSEEK_API_KEY
  - name: ollama
    type: openai_compatible
    base_url: http://localhost:11434/v1
routes:
  # Long-context coding: frontier model
  - when: { task: coding, context_tokens: ">=50000" }
    use: anthropic/claude-opus-4-7
  # Trivial classification, lint fix, commit message: cheap model
  - when: { task: [classification, lint_fix, commit_message] }
    use: deepseek/deepseek-chat
  # Tool-heavy: best tool-call compatibility
  - when: { task: tool_call, tool_count: ">=5" }
    use: openai/gpt-5
  # Default: daily coding goes to Sonnet
  - default: anthropic/claude-sonnet-4-6
fallback:
  on_rate_limit: openai/gpt-5
  on_error: anthropic/claude-sonnet-4-6
  on_quality_below: 0.7 # driven by eval feedback
budget:
  daily_usd_limit: 50
  per_project_limit:
    repo-a: 20
    repo-b: 30

All LLM requests go to localhost:8788. ModelPick rewrites each request to the target provider's format, routes, falls back, and records the trace.
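To make the `routes` section of the preview config more concrete, here is one way its rules might be evaluated. The sketch assumes first-match-wins ordering and that strings such as `">=50000"` are numeric comparisons; neither is confirmed by the RFC. The `resolve` and `condition_holds` helpers are hypothetical names, and the rules are copied from the config as plain Python data so no YAML parser is needed.

```python
import operator
import re

# Rules mirrored from the preview config above.
ROUTES = [
    {"when": {"task": "coding", "context_tokens": ">=50000"}, "use": "anthropic/claude-opus-4-7"},
    {"when": {"task": ["classification", "lint_fix", "commit_message"]}, "use": "deepseek/deepseek-chat"},
    {"when": {"task": "tool_call", "tool_count": ">=5"}, "use": "openai/gpt-5"},
]
DEFAULT = "anthropic/claude-sonnet-4-6"

_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}
_COMPARISON = re.compile(r"^(>=|<=|>|<)\s*(\d+)$")


def condition_holds(expected, actual):
    """Check one `when` entry against the matching request attribute."""
    if isinstance(expected, list):            # e.g. task: [a, b, c]
        return actual in expected
    if isinstance(expected, str):
        match = _COMPARISON.match(expected)   # e.g. ">=50000"
        if match:
            if not isinstance(actual, (int, float)):
                return False                  # attribute missing or non-numeric
            return _OPS[match.group(1)](actual, int(match.group(2)))
    return expected == actual                 # plain equality, e.g. task: coding


def resolve(request):
    """Return (model, explanation) for the first route whose conditions all hold."""
    for index, route in enumerate(ROUTES, start=1):
        if all(condition_holds(v, request.get(k)) for k, v in route["when"].items()):
            return route["use"], f"matched route #{index}"
    return DEFAULT, "no route matched, used default"


print(resolve({"task": "coding", "context_tokens": 80_000}))
print(resolve({"task": "commit_message"}))
print(resolve({"task": "coding", "context_tokens": 1_200}))
```

Returning the matched rule index together with the model keeps the decision replayable: the same request against the same rule list resolves the same way.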
- Local-first — Service runs on your machine or your team's network. Code, prompts, and traces stay local unless you explicitly opt into cloud observability.
- Decouple harness from model — Just point your agent's Base URL at the local address; ModelPick takes over.
- Explainable routing — Every decision can answer "why this model?". No opaque black-box router model.
- Eval feedback loop — Thumbs up/down on responses feeds back into routing policy. Over time the router learns your codebase.
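As a rough illustration of the eval feedback loop, the sketch below keeps a per-model approval rate from thumbs up/down votes and flags a model once the rate drops below a threshold, in the spirit of `on_quality_below: 0.7`. The `FeedbackTracker` class, the `min_votes` guard, and the model names are assumptions made for this example; the RFC does not specify how feedback is scored.

```python
from collections import defaultdict


class FeedbackTracker:
    """Tracks thumbs up/down per model and flags models that fall below a threshold."""

    def __init__(self, threshold=0.7, min_votes=5):
        self.threshold = threshold   # approval rate below this counts as degraded
        self.min_votes = min_votes   # ignore models with too little feedback
        self._up = defaultdict(int)
        self._total = defaultdict(int)

    def record(self, model, thumbs_up):
        self._total[model] += 1
        if thumbs_up:
            self._up[model] += 1

    def score(self, model):
        """Approval rate in [0, 1], or None when there is no feedback yet."""
        total = self._total.get(model, 0)
        return self._up.get(model, 0) / total if total else None

    def is_degraded(self, model):
        total = self._total.get(model, 0)
        if total < self.min_votes:
            return False
        return self._up.get(model, 0) / total < self.threshold


tracker = FeedbackTracker()
for vote in (True, True, True, False, False):   # 3 of 5 approved
    tracker.record("model-a", vote)
for vote in (True, True, True, True, False):    # 4 of 5 approved
    tracker.record("model-b", vote)

print(tracker.score("model-a"), tracker.is_degraded("model-a"))
print(tracker.score("model-b"), tracker.is_degraded("model-b"))
```

The `min_votes` guard is a design choice for the sketch: without it, a single thumbs down on a rarely used model would immediately trigger a fallback.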
License: MIT (planned)
Status: RFC · Open for design feedback