OpenRouter CLI with persistent memory, streaming, and multi-model support — powered by LangChain 1.3.3 + Bun
Read this in other languages: Tiếng Việt · 中文 · 日本語
- Persistent memory — conversation history saved to SQLite, survives restarts
- Streaming — token-by-token output, first token in ~300ms
- Token optimization — `InMemoryCache` deduplicates identical prompts; `ConversationTokenBufferMemory` auto-trims history to stay within the token budget
- Multi-model — access 200+ models via OpenRouter (GPT-4o, Claude, Gemini, Llama, etc.)
- Multiple output formats — `pretty`, `json`, `ndjson`, `table`
- Secure auth — API key stored in the OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
- Config cascade — CLI flags → env vars → `.env` file → `~/.config/ormax/config.toml` → keychain → defaults
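The six-level cascade resolves first-match-wins, from highest precedence (CLI flags) to lowest (built-in defaults). A minimal sketch of the idea; the names below are illustrative, not the actual resolver in `src/lib/config/`:

```typescript
// Illustrative first-match-wins config cascade: sources are ordered
// highest to lowest precedence, and the first source that yields a
// value for a key wins.
type ConfigSource = (key: string) => string | undefined;

function resolveConfig(key: string, sources: ConfigSource[]): string | undefined {
  for (const source of sources) {
    const value = source(key);
    if (value !== undefined) return value;
  }
  return undefined;
}

// Example cascade with three of the six levels (flags, env, defaults);
// the env-var prefix here is an assumption for the sketch.
const cliFlags: Record<string, string> = { model: "anthropic/claude-3-5-sonnet" };
const defaults: Record<string, string> = { model: "openai/gpt-4o-mini", output: "pretty" };

const cascade: ConfigSource[] = [
  (k) => cliFlags[k],
  (k) => process.env[`ORMAX_${k.toUpperCase()}`],
  (k) => defaults[k],
];

console.log(resolveConfig("model", cascade));  // CLI flag shadows the default
console.log(resolveConfig("output", cascade)); // falls through to defaults
```

The point of the ordering is that more ephemeral, more intentional settings (a flag typed on this invocation) always beat more persistent ones (a config file or default).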
- Bun ≥ 1.2.0
- OpenRouter API key
```bash
# Clone and install
git clone https://github.com/your-username/openrouter-max-cli.git
cd openrouter-max-cli
bun install

# Set your API key
export OPENROUTER_API_KEY="sk-or-..."
# or save permanently:
bun run src/main.ts auth set-key
```

Build standalone binary:

```bash
bun run build
# produces ./bin/ormax
```

```bash
# Simple chat
bun run src/main.ts chat send "What is the speed of light?"

# With memory (remembers context across calls)
bun run src/main.ts chat send "My name is Alice" --session alice
bun run src/main.ts chat send "What is my name?" --session alice

# Use a specific model
bun run src/main.ts chat send "Explain quantum entanglement" --model anthropic/claude-haiku-4-5

# One-shot, no memory, pipe input
echo "Summarize this in 3 bullets" | bun run src/main.ts chat send --no-memory

# JSON output (for scripting)
bun run src/main.ts chat send "hello" --output json --no-stream

# List available models
bun run src/main.ts models list
bun run src/main.ts models list --search claude
```

| Command | Description |
|---|---|
| `chat send [message]` | Send a message. Prompts interactively if omitted |
Options:
| Flag | Alias | Description | Default |
|---|---|---|---|
| `--model` | `-m` | Model ID | `openai/gpt-4o-mini` |
| `--session` | `-s` | Session ID for memory | `default` |
| `--system` | | System prompt override | |
| `--max-tokens` | | Max tokens in response | |
| `--no-stream` | | Disable streaming | |
| `--no-memory` | | Skip conversation history | |
| `--output` | `-o` | `pretty` \| `json` \| `ndjson` \| `table` | `pretty` |
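The four `--output` modes differ mainly in framing around the same response data. A rough sketch of the distinction, using an illustrative envelope shape (the field names are assumptions, not the CLI's actual envelope from `src/lib/output/`):

```typescript
// Illustrative renderer for the four output modes. The Envelope
// shape here is a stand-in for whatever the CLI actually emits.
interface Envelope {
  model: string;
  content: string;
  tokens: number;
}

function render(e: Envelope, mode: "pretty" | "json" | "ndjson" | "table"): string {
  switch (mode) {
    case "pretty":
      // human-readable: just the assistant text
      return e.content;
    case "json":
      // pretty-printed JSON for inspection
      return JSON.stringify(e, null, 2);
    case "ndjson":
      // one compact JSON object per line, friendly to `jq` and
      // line-oriented streaming consumers
      return JSON.stringify(e);
    case "table":
      // simple key/value table
      return `model   | ${e.model}\ncontent | ${e.content}\ntokens  | ${e.tokens}`;
  }
}

const reply: Envelope = { model: "openai/gpt-4o-mini", content: "hi", tokens: 2 };
console.log(render(reply, "ndjson"));
```

For scripting, `--output json --no-stream` (as in the example above) gives a single parseable document, while `ndjson` suits pipelines that process events line by line.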
| Command | Description |
|---|---|
| `models list` | List all available models |
| `models list --search <query>` | Filter models by name or ID |
| `models get <id>` | Get details for a specific model |
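The project layout mentions a TTL cache for the `/models` API (`src/lib/cache/model-cache.ts`), so repeated `models list` calls don't refetch the catalog. A minimal sketch of such a cache; the class and method names are assumptions for illustration:

```typescript
// Illustrative TTL cache: entries expire ttlMs after being set and
// read as a miss once stale.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  constructor(private ttlMs: number) {}

  get(key: string): T | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (Date.now() > hit.expiresAt) {
      this.entries.delete(key); // stale: evict and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// Cache the model list for 10 minutes; a miss would trigger a refetch.
const modelCache = new TtlCache<string[]>(10 * 60 * 1000);
modelCache.set("models", ["openai/gpt-4o-mini", "anthropic/claude-3-5-sonnet"]);
console.log(modelCache.get("models"));
```

Expiring on read keeps the cache simple (no background timer); the trade-off is that stale entries linger in memory until the next lookup.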
| Command | Description |
|---|---|
| `auth set-key [key]` | Save API key to OS keychain |
| `auth status` | Show current authentication status |
| `auth logout` | Remove API key from keychain |
| Command | Description |
|---|---|
| `config get <key>` | Get a config value (e.g. `defaults.model`) |
| `config set <key> <value>` | Set a config value |
| `config list` | Show all config values |
| `config path` | Show config file path |
| Command | Description |
|---|---|
| `memory list` | List all conversation sessions |
| `memory clear [session]` | Clear a session (or all with `--all`) |
Config file: `~/.config/ormax/config.toml`

```toml
[auth]
# apiKey = "sk-or-..." # Not recommended — use keychain or env var

[defaults]
model = "openai/gpt-4o-mini"
temperature = 0.7
systemPrompt = "You are a helpful assistant."
session = "default"
output = "pretty"

[behavior]
stream = true
noMemory = false
maxTokenLimit = 4000   # auto-trim history beyond this token count
cacheResponses = false # cache identical prompts across sessions
```

Environment variables:

```bash
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=anthropic/claude-3-5-sonnet
```

| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Unexpected error |
| 2 | Invalid arguments |
| 64 | Missing API key |
| 65 | HTTP 4xx client error |
| 66 | HTTP 5xx server error |
| 67 | Rate limited (HTTP 429) |
| 68 | Timeout |
| 69 | Model not found |
| 70 | Stream interrupted by SIGINT |
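The project layout below mentions `CliError + exit codes` in `src/lib/errors/`. A sketch of how HTTP failures might map onto the codes in the table; the constant names are illustrative, only the numeric values come from the table:

```typescript
// Exit codes from the table above, under illustrative names.
const EXIT_CODES = {
  ok: 0,
  unexpected: 1,
  invalidArgs: 2,
  missingApiKey: 64,
  clientError: 65,
  serverError: 66,
  rateLimited: 67,
  timeout: 68,
  modelNotFound: 69,
  streamInterrupted: 70,
} as const;

// Mirror the table's HTTP grouping: 429 gets its own code, other
// 4xx map to 65, 5xx map to 66, anything else is success.
function exitCodeForHttpStatus(status: number): number {
  if (status === 429) return EXIT_CODES.rateLimited;
  if (status >= 400 && status < 500) return EXIT_CODES.clientError;
  if (status >= 500) return EXIT_CODES.serverError;
  return EXIT_CODES.ok;
}

console.log(exitCodeForHttpStatus(429)); // rate limited
console.log(exitCodeForHttpStatus(503)); // server error
```

Distinct codes above 63 let shell scripts branch on the failure class, e.g. retry on 67/68 but prompt for `auth set-key` on 64.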
```
src/
├── main.ts                   # Citty CLI entry point
├── commands/                 # chat, models, config, auth, memory
└── lib/
    ├── chain/
    │   ├── model.ts          # ChatOpenRouter + InMemoryCache
    │   ├── memory-chain.ts   # RunnableWithMessageHistory (LCEL)
    │   └── streaming.ts      # iterateStream()
    ├── memory/
    │   └── sqlite-history.ts # BaseChatMessageHistory → bun:sqlite
    ├── cache/
    │   └── model-cache.ts    # TTL cache for /models API
    ├── config/               # 6-level cascade resolver
    ├── output/               # envelope, renderer, pretty, table
    ├── errors/               # CliError + exit codes
    └── types/                # Zod schemas
```
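The history trimming behind `maxTokenLimit` works in the spirit of `ConversationTokenBufferMemory`: drop the oldest messages until the estimated token total fits the budget. A simplified sketch; the helper names and the crude 4-chars-per-token estimate are assumptions (the real memory class would use the model's tokenizer):

```typescript
// Illustrative token-budget trimming: keep the newest messages whose
// combined estimated token count fits under the budget.
interface Message { role: "user" | "assistant"; content: string }

// Rough heuristic: ~4 characters per token.
const estimateTokens = (m: Message) => Math.ceil(m.content.length / 4);

function trimToBudget(history: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let total = 0;
  // Walk newest-first so the most recent context survives trimming.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i]);
    if (total + cost > maxTokens) break;
    kept.unshift(history[i]); // restore chronological order
    total += cost;
  }
  return kept;
}

const history: Message[] = [
  { role: "user", content: "x".repeat(400) },      // ~100 tokens
  { role: "assistant", content: "y".repeat(400) }, // ~100 tokens
  { role: "user", content: "z".repeat(400) },      // ~100 tokens
];
console.log(trimToBudget(history, 250).length); // oldest message is dropped
```

Trimming newest-first is what makes long-running `--session` conversations keep working: the prompt stays under the model's context window while the most recent turns are preserved.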
- Fork the repository
- Create a branch: `git checkout -b feat/your-feature`
- Make your changes and run `bun run lint`
- Open a Pull Request
MIT © 2026