Merged
23 commits
f9715cd
docs(spike): scope Codex CLI feasibility
hannasdev May 13, 2026
9e62a93
test(spike): verify Codex CLI resume routing
hannasdev May 13, 2026
21b86df
docs(spike): clarify Codex parity bar
hannasdev May 13, 2026
89300dc
docs(spike): record Codex app-server switch evidence
hannasdev May 13, 2026
311f234
test(spike): add Codex app-server switch probe
hannasdev May 13, 2026
2457e1b
docs(spike): define Codex app-server supportability gates
hannasdev May 13, 2026
51f6411
docs(spike): check Codex app-server public surface
hannasdev May 13, 2026
e2459df
test(spike): check Codex app-server protocol shape
hannasdev May 13, 2026
c9650cc
test(spike): record Codex model evidence limits
hannasdev May 13, 2026
bfa44ed
test(spike): add Codex app-server preflight
hannasdev May 14, 2026
1fbb167
test(spike): verify Codex app-server lifecycle
hannasdev May 14, 2026
6f91841
docs(readme): clarify experimental usage paths
hannasdev May 14, 2026
8d24843
docs(readme): prioritize early adopter usage
hannasdev May 14, 2026
2a7e0d0
docs(readme): frame routing as product goal
hannasdev May 14, 2026
c8b68a8
docs(spike): close Codex app-server product gate
hannasdev May 14, 2026
a9ee1d6
fix(spike): address Codex probe review comments
hannasdev May 14, 2026
cb13d74
fix: verify lifecycle protocol errors
hannasdev May 14, 2026
550bc38
fix: tighten codex probe evidence
hannasdev May 14, 2026
7913e1f
fix: require observed codex model evidence
hannasdev May 14, 2026
61fcf38
fix: clarify codex probe verification
hannasdev May 14, 2026
4925e2c
fix: bound codex cli spike commands
hannasdev May 14, 2026
4fb8859
fix: harden codex app-server diagnostics
hannasdev May 14, 2026
a96c48b
fix: fail partial live codex cli probes
hannasdev May 14, 2026
3 changes: 2 additions & 1 deletion AGENTS.md
@@ -30,7 +30,8 @@ Read in this order:
Read in this order:
1. [docs/product/ROUTER-PHASE-PLAN.md](docs/product/ROUTER-PHASE-PLAN.md)
2. [docs/decision-log.md](docs/decision-log.md)
3. [docs/REPLAY-GUIDE.md](docs/REPLAY-GUIDE.md)
3. [docs/product/CODEX-CLI-SPIKE-SCOPE.md](docs/product/CODEX-CLI-SPIKE-SCOPE.md)
4. [docs/REPLAY-GUIDE.md](docs/REPLAY-GUIDE.md)

### I need security and risk context

153 changes: 128 additions & 25 deletions README.md
@@ -2,25 +2,17 @@

[![npm version](https://img.shields.io/npm/v/model-switchboard.svg)](https://www.npmjs.com/package/model-switchboard) [![npm downloads](https://img.shields.io/npm/dm/model-switchboard.svg)](https://www.npmjs.com/package/model-switchboard) [![CI](https://github.com/hannasdev/model-switchboard/actions/workflows/ci.yml/badge.svg)](https://github.com/hannasdev/model-switchboard/actions/workflows/ci.yml) [![Release](https://github.com/hannasdev/model-switchboard/actions/workflows/release.yml/badge.svg)](https://github.com/hannasdev/model-switchboard/actions/workflows/release.yml) [![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/hannasdev/model-switchboard/badge)](https://securityscorecards.dev/viewer/?uri=github.com/hannasdev/model-switchboard) [![OpenSSF Best Practices](https://www.bestpractices.dev/projects/12820/badge)](https://www.bestpractices.dev/projects/12820) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

Model Switchboard is a routing layer for AI-assisted software delivery.
Model Switchboard is an experimental routing layer for AI-assisted software delivery.

It keeps coding sessions moving by choosing model and effort settings before each turn, so you do not have to make that call manually every time.
Its goal is to keep coding sessions moving by choosing model and effort settings before each turn, so you do not have to make that call manually every time.

## Get, Provide Feedback, and Contribute

- Obtain the software:
  - GitHub repository: https://github.com/hannasdev/model-switchboard
  - npm package: https://www.npmjs.com/package/model-switchboard
- Provide feedback (bug reports and enhancements):
  - Issues: https://github.com/hannasdev/model-switchboard/issues
- Contribute to the project:
  - Contribution guide: [CONTRIBUTING.md](CONTRIBUTING.md)
The project is still exploring the product shape for automatic model hot-swapping. It is useful today as a Claude Code routing wrapper and as a Codex feasibility spike, but it is not yet a polished replacement for an existing AI coding workflow.

## Why It Exists

Choosing the right model repeatedly is a real cognitive tax. A single coding session can shift between quick clarifications, planning, implementation, and debugging, each with different cost and quality needs.

Model Switchboard reduces that overhead with consistent routing decisions and a short explanation of why a route was selected.
Model Switchboard explores reducing that overhead with consistent routing decisions and a short explanation of why a route was selected.

## Core Value

@@ -31,7 +23,7 @@ Model Switchboard reduces that overhead with consistent routing decisions and a

## Current Product Slice

The current MVP is a Claude Code workflow integration powered by a separable router core.
The current MVP is a Claude Code workflow integration powered by a separable router core. The Codex work is an active spike to test whether Switchboard can go beyond advisory routing and actually control per-turn model changes inside one continuous session.

High-level flow:

@@ -40,26 +32,109 @@ High-level flow:
3. Switchboard launches or resumes Claude with matching model and effort settings for that launch.
4. Route context, session state, and hook evidence are recorded for explainability, replay, and governance.
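The flow above can be sketched roughly as follows. This is a minimal illustration, not the Switchboard implementation: the type names, the `routePrompt` heuristic, the `planTurn` helper, and the model labels are all hypothetical and exist only to show a route being chosen before a launch-or-resume decision and then recorded.

```typescript
// Hypothetical types for illustration; not taken from the Switchboard source.
type Effort = "low" | "medium" | "high";

interface RouteDecision {
  model: string; // placeholder label, not a real model ID
  effort: Effort;
  reason: string; // short explanation kept for explainability
}

interface SessionState {
  sessionId: string | null; // null until the first launch
  history: RouteDecision[]; // evidence kept for replay
}

// Toy heuristic standing in for the real routing policy (an assumption).
function routePrompt(prompt: string): RouteDecision {
  const text = prompt.toLowerCase();
  if (/\b(debug|refactor|architecture|design)\b/.test(text)) {
    return { model: "large-model", effort: "high", reason: "complex engineering keywords" };
  }
  if (text.length < 80) {
    return { model: "small-model", effort: "low", reason: "short prompt" };
  }
  return { model: "medium-model", effort: "medium", reason: "default route" };
}

// Route first, then decide whether this turn launches or resumes, and record the decision.
function planTurn(
  state: SessionState,
  prompt: string
): { action: "launch" | "resume"; decision: RouteDecision } {
  const decision = routePrompt(prompt);
  const action = state.sessionId === null ? "launch" : "resume";
  state.history.push(decision);
  return { action, decision };
}
```

In this sketch the model and effort are fixed per launch or resume, which mirrors the boundary described in step 3; nothing here changes a model inside a running session.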

## What It Is Not
## Usage Paths

- Not a replacement for your coding client.
- Not a general-purpose agent runtime.
- Not a cross-vendor orchestration product in this MVP phase.
Switchboard currently has three distinct paths. They are intentionally not equivalent.

## Security & Code Quality
### Claude Code Wrapper

This project prioritizes security for AI-related software:
This is the most complete path today.

- **Vulnerability Scanning**: Automated dependency scanning via `npm audit` in CI on pull requests and pushes to `main`, plus [Snyk](https://snyk.io) scans on pushes to `main` and a daily schedule when `SNYK_TOKEN` is configured
- **Static Analysis**: ESLint with security plugin to detect common vulnerabilities
- **Responsible Disclosure**: Follow the [Security Policy](SECURITY.md) to report vulnerabilities privately
- **Test Coverage**: Comprehensive test suite validates security-relevant code paths
- **Developer Knowledge**: Core team has expertise in secure software design and threat modeling
Use:

See [SECURITY.md](SECURITY.md) for details on the vulnerability reporting process and security practices.
```bash
switchboard "your prompt"
switchboard --interactive
switchboard explain
```

What it allows:

- Routes each prompt before launching or resuming Claude.
- Applies model and effort choices at Claude launch/resume boundaries.
- Records local routing evidence for explainability and replay.

Advantages:

- Most productized workflow in this repository.
- Uses the existing Claude Code user experience.
- Good fit for prompt-by-prompt routing with auditability.

Does not yet support:

- Automatic model changes inside an already-running Claude interactive session.
- Eliminating the cognitive overhead of model choice during a long-lived stock Claude TUI session.

### Advisory Cross-Surface Routing

This path gives a recommendation without taking over execution.

Use:

```bash
switchboard advise --surface openai-codex "your prompt"
```

What it allows:

- Asks Switchboard what it would choose for a target surface.
- Lets you keep using another client manually.

Advantages:

- Low-risk way to test routing policy across vendors or clients.
- Does not require Switchboard to own the session process.

Does not yet support:

- Automatic execution.
- Automatic in-session model switching.
- Reducing all model-selection overhead, because the user still has to apply the recommendation.

### Codex App-Server Spike

This is the experimental hot-swapping path.

Use:

```bash
npm run switchboard:spike:codex-app-server:preflight
npm run switchboard:spike:codex-app-server:protocol
npm run switchboard:spike:codex-app-server:lifecycle
npm run switchboard:spike:codex-app-server
```

What it allows:

- Starts `codex app-server --listen stdio://`.
- Creates one Codex app-server thread.
- Sends multiple `turn/start` requests on that same thread.
- Requests different models on different turns without a `codex exec resume` boundary.
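As a rough sketch of what "multiple `turn/start` requests on the same thread with different models" could look like on the wire: only the `turn/start` method name comes from the text above. The request envelope, the `threadId`, `model`, and `input` field names, the newline-delimited framing, and the model labels are assumptions made for illustration and may not match the actual Codex app-server protocol.

```typescript
// Hypothetical message shape; only the `turn/start` method name comes from the README text.
interface RpcRequest {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Builds one `turn/start` request. `threadId`, `model`, and `input` are assumed field names.
function buildTurnStart(id: number, threadId: string, model: string, prompt: string): RpcRequest {
  return {
    id,
    method: "turn/start",
    params: { threadId, model, input: [{ type: "text", text: prompt }] },
  };
}

// Serializes requests as newline-delimited JSON (an assumed framing for the stdio transport).
function encode(requests: RpcRequest[]): string {
  return requests.map((r) => JSON.stringify(r)).join("\n") + "\n";
}

const threadId = "thread-123"; // placeholder thread identifier
const firstTurn = buildTurnStart(1, threadId, "model-a", "Summarize the failing test");
const secondTurn = buildTurnStart(2, threadId, "model-b", "Now implement the fix");

// Both turns target the same thread; only the requested model differs.
const wire = encode([firstTurn, secondTurn]);
```

The point of the sketch is the invariant, not the field names: the thread identifier stays constant across turns while the requested model changes, with no `codex exec resume` boundary in between.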

Advantages:

- This is the only current path that suggests Switchboard could go beyond Claude parity.
- It demonstrates a possible Switchboard-owned session surface with per-turn model override.
- It preserves one app-server thread/session while route-selected model requests change.

Does not yet support:

- A polished end-user UI.
- Hot-swapping inside the stock Codex TUI.
- Production stability guarantees, because the app-server surface is still experimental.
- Provider-side backend model attestation; current evidence proves requested model overrides and same-thread completion, not a durable backend model field.

## What It Is Not

- Not a finished replacement for your coding client.
- Not a general-purpose agent runtime.
- Not a claim that stock Claude or stock Codex TUI sessions can be hot-swapped today.
- Not a production-grade cross-vendor orchestration product in this MVP phase.

## Primary Commands

The commands below mix productized MVP commands and spike commands. Commands containing `spike` are feasibility evidence for the Codex direction, not polished product UX.

| Command | What It Does | Use It When |
| --- | --- | --- |
| `switchboard "your prompt"` | Routes a single prompt, chooses target/effort, then launches or resumes Claude for that turn. | You want normal prompt-driven usage with routing applied automatically. |
@@ -68,6 +143,12 @@ See [SECURITY.md](SECURITY.md) for details on the vulnerability reporting proces
| `switchboard advise --surface openai-codex "your prompt"` | Returns an advisory routing recommendation for a selected surface without taking over execution. | You want a cross-surface recommendation or policy check before running a turn. |
| `switchboard probe continuity` | Runs a continuity probe for prompt-driven turns and reports whether session continuity checks pass. | You want to verify non-interactive continuity behavior after changes. |
| `switchboard probe continuity-interactive` | Runs the interactive continuity probe and verifies resume/session behavior across turns. | You want to validate interactive continuity and related checks. |
| `npm run switchboard:spike:codex-cli` | Inspects the local Codex CLI command surface and maps two routed turns to Codex `exec`/`resume --model` plans without making live model calls. | You want a product-aligned feasibility signal for Codex CLI route authority before building a deeper integration. |
| `npm run switchboard:spike:codex-cli:live` | Runs the bounded two-turn Codex CLI resume probe with route-selected models and captures JSON/session evidence. | You are ready to collect live evidence for the Codex CLI feasibility spike. |
| `npm run switchboard:spike:codex-app-server:preflight` | Verifies the local Codex CLI version, app-server command availability, login status, and redacted app-server auth evidence before a routed session starts. | You want to check whether a normal user install can support the Codex app-server spike path. |
| `npm run switchboard:spike:codex-app-server:protocol` | Generates Codex app-server TypeScript bindings and verifies the minimum protocol shape Switchboard depends on. | You are checking whether Codex app-server protocol changes would break the feasibility spike. |
| `npm run switchboard:spike:codex-app-server:lifecycle` | Starts Codex app-server, verifies protocol-error handling, completes one turn, interrupts a second turn, captures stderr/malformed-output evidence, and shuts the process down. | You are checking whether Switchboard can safely own the app-server process lifecycle. |
| `npm run switchboard:spike:codex-app-server` | Runs the bounded app-server in-session switch probe with one thread and two route-selected `turn/start` model overrides. | You are evaluating whether Codex app-server can be a Switchboard-controlled session surface beyond `exec`/`resume` parity. |
| `npm test` | Runs the full automated test suite for adapters, router, workflow, and CLI behavior. | You changed routing/workflow/docs and want a full regression check. |

### Interactive Mode Clarification
@@ -88,3 +169,25 @@ If you want per-turn re-routing and potential target/model changes, run prompts
`npm test`

For detailed command documentation, environment variables, and output formats, see [CLI Reference](docs/CLI-REFERENCE.md).

## Get, Provide Feedback, and Contribute

- Obtain the software:
  - GitHub repository: https://github.com/hannasdev/model-switchboard
  - npm package: https://www.npmjs.com/package/model-switchboard
- Provide feedback (bug reports and enhancements):
  - Issues: https://github.com/hannasdev/model-switchboard/issues
- Contribute to the project:
  - Contribution guide: [CONTRIBUTING.md](CONTRIBUTING.md)

## Security & Code Quality

This project prioritizes security for AI-related software:

- **Vulnerability Scanning**: Automated dependency scanning via `npm audit` in CI on pull requests and pushes to `main`, plus [Snyk](https://snyk.io) scans on pushes to `main` and a daily schedule when `SNYK_TOKEN` is configured
- **Static Analysis**: ESLint with security plugin to detect common vulnerabilities
- **Responsible Disclosure**: Follow the [Security Policy](SECURITY.md) to report vulnerabilities privately
- **Test Coverage**: Comprehensive test suite validates security-relevant code paths
- **Developer Knowledge**: Core team has expertise in secure software design and threat modeling

See [SECURITY.md](SECURITY.md) for details on the vulnerability reporting process and security practices.
1 change: 1 addition & 0 deletions docs/PRD.md
@@ -16,6 +16,7 @@ The PRD content has been split into focused documents so each read is shorter an
- Router contracts: [contracts/router-contracts.md](contracts/router-contracts.md)
- MVP product scope: [product/MVP-PRD.md](product/MVP-PRD.md)
- Router phase execution plan: [product/ROUTER-PHASE-PLAN.md](product/ROUTER-PHASE-PLAN.md)
- Codex CLI feasibility spike scope: [product/CODEX-CLI-SPIKE-SCOPE.md](product/CODEX-CLI-SPIKE-SCOPE.md)
- Decision history: [decision-log.md](decision-log.md)
- Replay and evaluation guide: [REPLAY-GUIDE.md](REPLAY-GUIDE.md)
