add multi-backend local engine; deprecate OLLAMA_* (breaking) by cjus · Pull Request #23 · cjus/solrac

cjus · 2026-05-15T23:36:38Z

Summary

Replaces the Ollama-specific engine path with a generic local engine fronted by a LocalDriver interface and two implementations: Ollama (NDJSON /api/chat) and LMStudio (SSE /v1/chat/completions). Hard cutover — every OLLAMA_* env var, engine: ollama / tier: ollama frontmatter value, and /clear ollama / > / o alias is hard-rejected at boot or parse time with a rename hint. Audit row tag becomes three-segment local:<backend>:<modelId>, mirroring claude:<tier>:<modelId>.
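For illustration, the three-segment tag can be sketched as a pair of pure helpers. Only the `local:<backend>:<modelId>` shape comes from this PR; the function names and the parse rules (for example, tolerating colons inside the model id) are assumptions, not the code that shipped.

```typescript
// Hypothetical helpers: the names are illustrative, only the tag shape is from the PR.
type LocalBackend = "ollama" | "lmstudio";

interface ParsedLocalTag {
  backend: LocalBackend;
  modelId: string;
}

function isLocalBackend(value: string | undefined): value is LocalBackend {
  return value === "ollama" || value === "lmstudio";
}

// Build the audit tag written for a local-engine turn.
function formatLocalAuditTag(backend: LocalBackend, modelId: string): string {
  return `local:${backend}:${modelId}`;
}

// Parse a tag back into its parts; returns null for anything that is not
// a well-formed local tag (including legacy "ollama:<m>" rows).
function parseLocalAuditTag(tag: string): ParsedLocalTag | null {
  const parts = tag.split(":");
  if (parts.length < 3 || parts[0] !== "local") return null;
  const backend = parts[1];
  if (!isLocalBackend(backend)) return null;
  // Model ids may themselves contain colons (e.g. "gemma4:e4b"), so rejoin the tail.
  const modelId = parts.slice(2).join(":");
  if (modelId.length === 0) return null;
  return { backend, modelId };
}

const tag = formatLocalAuditTag("ollama", "gemma4:e4b"); // "local:ollama:gemma4:e4b"
const parsed = parseLocalAuditTag(tag);
```

Splitting on the first two colons only is what keeps Ollama-style ids such as `gemma4:e4b` intact.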

Why. Solrac is a local-first deploy; the engine layer needs to support more than one local backend without abstraction debt. LMStudio joins as a first-class backend, the migration runway covers both directions, and operator-facing surfaces (env, frontmatter, slash commands, web UI label) all reflect the new shape consistently.

Breaking changes

Env vars. All OLLAMA_* → LOCAL_*. New required LOCAL_BACKEND (ollama | lmstudio) when LOCAL_ENABLED=true. LOCAL_URL default is backend-aware (:11434 Ollama, :1234 LMStudio). Boot fails loud on legacy OLLAMA_* keys and SOLRAC_DEFAULT_ENGINE=ollama.
Audit model format. ollama:<m> → local:<backend>:<m>. Idempotent retag migration at boot — order is load-bearing (retag before column rename). Dual-pattern reads (local:% + ollama:%) in outOfBandForEngine + hasLocalTurnsSince keep cross-engine queries correct during the rollback window. Legacy clause drop scheduled for the next release.
Schema. sessions.ollama_cutoff_ms → sessions.local_cutoff_ms via ALTER TABLE … RENAME COLUMN.
Slash commands. /clear ollama / /clear > / /clear o → /clear local (alias: l). /status line "ollama turns (24h)" → "local turns (24h)".
Operator markdown. tasks/*.md engine: ollama and skills/*.md tier: ollama hard-rejected at parse with rename hints.
Web UI label. local (<backend>) — e.g. local (ollama), local (lmstudio).
Thinking-stub emoji. 🦙 → 💻 (backend-neutral).
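The env-var rules above could be enforced at boot roughly as follows. This is a minimal sketch, not the shipped loader: `resolveLocalConfig`, the `localhost` host in the default URLs, the suffix-preserving rename hint (`OLLAMA_X` to `LOCAL_X`), and the `SOLRAC_DEFAULT_ENGINE=local` hint are all assumptions; only the variable names, the two ports, and the fail-loud behavior come from this PR.

```typescript
type LocalBackend = "ollama" | "lmstudio";
type Env = Record<string, string | undefined>;

type LocalConfig =
  | { enabled: false }
  | { enabled: true; backend: LocalBackend; url: string };

// Ports are from the PR; the localhost host is an assumption.
const DEFAULT_URLS: Record<LocalBackend, string> = {
  ollama: "http://localhost:11434",
  lmstudio: "http://localhost:1234",
};

function isLocalBackend(value: string | undefined): value is LocalBackend {
  return value === "ollama" || value === "lmstudio";
}

function resolveLocalConfig(env: Env): LocalConfig {
  // Fail loud on any legacy key, with a rename hint per key.
  const legacy = Object.keys(env).filter((key) => key.startsWith("OLLAMA_"));
  if (legacy.length > 0) {
    const hints = legacy.map((key) => `${key} -> LOCAL_${key.slice("OLLAMA_".length)}`);
    throw new Error(`legacy env vars are no longer supported: ${hints.join(", ")}`);
  }
  if (env["SOLRAC_DEFAULT_ENGINE"] === "ollama") {
    throw new Error("SOLRAC_DEFAULT_ENGINE=ollama is no longer supported; use local");
  }
  if (env["LOCAL_ENABLED"] !== "true") return { enabled: false };

  // LOCAL_BACKEND is required once the local engine is enabled.
  const backend = env["LOCAL_BACKEND"];
  if (!isLocalBackend(backend)) {
    throw new Error("LOCAL_BACKEND must be 'ollama' or 'lmstudio' when LOCAL_ENABLED=true");
  }
  // An explicit LOCAL_URL wins; otherwise fall back to the backend-aware default.
  return { enabled: true, backend, url: env["LOCAL_URL"] ?? DEFAULT_URLS[backend] };
}
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the validation easy to unit-test.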

Driver hardening (Ollama + LMStudio parity)

LMStudio: parallel_tool_calls: false + identical-(name, args) dedup (Gemma-4 lmstudio-bug-tracker #1756 workaround). Tool-call arguments delta accumulation across SSE chunks. usage capture whether inline or trailing.
LMStudio silent-substitution detection. LMStudio's OpenAI-compatible endpoint returns 200 OK with the loaded model when the requested id isn't loaded. Caught during the live smoke run; the driver now compares chunk.model (case-insensitive) against the requested model on the first chunk that carries it and throws model_missing, surfacing the served-model id plus an lms load <requested> hint. This closes the mid-session hole that the boot-time probe() doesn't cover.
Ollama driver instanceof LocalDriverError guard in stream-catch for symmetry with LMStudio — future defensive throws inside the stream loop won't get clobbered by the generic unreachable wrap.
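Two of the LMStudio checks above are small enough to sketch in isolation. The `LocalDriverError` class shape, the helper names, and the choice to compare tool-call arguments as raw accumulated strings are assumptions made for illustration; the PR only states the behavior (case-insensitive `chunk.model` comparison, `model_missing`, and identical-(name, args) dedup).

```typescript
// Stand-in for the driver error type; the real class may carry more fields.
class LocalDriverError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
    this.name = "LocalDriverError";
  }
}

// Silent-substitution check: call with chunk.model from each SSE chunk until
// one carries a model field. Throws if the server answered with another model.
function assertServedModel(requested: string, served: string | undefined): void {
  if (served === undefined) return; // this chunk has no model field; check a later one
  if (served.toLowerCase() !== requested.toLowerCase()) {
    throw new LocalDriverError(
      "model_missing",
      `requested "${requested}" but the server answered with "${served}"; try: lms load ${requested}`,
    );
  }
}

interface ToolCall {
  name: string;
  args: string; // raw JSON arguments, already accumulated across SSE deltas
}

// Drop repeated tool calls whose name and argument string are both identical,
// keeping the first occurrence and preserving order.
function dedupeToolCalls(calls: ToolCall[]): ToolCall[] {
  const seen = new Set<string>();
  const unique: ToolCall[] = [];
  for (const call of calls) {
    const key = JSON.stringify([call.name, call.args]);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(call);
  }
  return unique;
}
```

Comparing raw argument strings treats `{"a":1}` and `{ "a": 1 }` as different calls; a driver that wants semantic equality would need to parse and normalize first.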

Post-review hardening

LOCAL_* scrubbed from SDK subprocess env (agent.ts::sanitizedSubprocessEnv). LOCAL_URL in particular could leak internal network topology via auto-allowed Bash(echo $LOCAL_URL).
/clear ollama / /clear o / /clear > now returns an explicit → use /clear local hint instead of silent "Unknown command".
audit.tool_calls capped at 64KB (AUDIT_TOOL_CALLS_MAX_LEN) in db.ts::updateAuditEnd — defends against runaway local-model arg blobs (8 iterations × hallucinated 100KB args = potential MB-sized audit rows). Centralized so all audit writers (Claude SDK, local engine, skills) get the protection.
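As a rough sketch of the cap: without it, 8 iterations of 100KB arguments would put about 800KB into a single audit row. The constant name comes from this PR; measuring the limit in string length (rather than encoded bytes), the truncation marker, and the `capToolCalls` helper name are assumptions.

```typescript
// 64KB cap, measured here in UTF-16 code units (string length). Whether the
// real limit counts characters or bytes is an assumption of this sketch.
const AUDIT_TOOL_CALLS_MAX_LEN = 64 * 1024;

// Hypothetical marker so a reader of the audit row can tell it was cut.
const TRUNCATION_MARKER = "...[truncated]";

function capToolCalls(serialized: string, maxLen: number = AUDIT_TOOL_CALLS_MAX_LEN): string {
  if (serialized.length <= maxLen) return serialized;
  // Reserve room for the marker so the result is exactly maxLen long
  // (assuming maxLen is at least as long as the marker).
  const keep = Math.max(0, maxLen - TRUNCATION_MARKER.length);
  return serialized.slice(0, keep) + TRUNCATION_MARKER;
}

const oversized = JSON.stringify([{ name: "bash", args: "x".repeat(100 * 1024) }]);
const stored = capToolCalls(oversized);
```

A truncated value is no longer valid JSON, so any reader of `audit.tool_calls` would have to tolerate that or check for the marker before parsing.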

Test plan

Follow-ups (next release)

Drop the legacy ollama:% dual-pattern clause in outOfBandForEngine + hasLocalTurnsSince after one release cycle, and remove the matching dual-LIKE clauses from the operator SQL examples in docs/OPERATIONS.md, docs/SCHEMA.md, docs/RUNBOOK.md.
Add explicit sanitizedSubprocessEnv test coverage — src/agent.test.ts doesn't exist yet and none of the scrub lines (incl. NOTION_API_KEY, STATS_BEARER_TOKEN) have direct coverage today.
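To make the coverage gap concrete, here is a stand-in for the scrub with the properties a future test would pin down. The real `agent.ts::sanitizedSubprocessEnv` signature and its full key list are not shown in this PR, so the function below is a hypothetical reimplementation that covers only the keys named here (`LOCAL_*`, `NOTION_API_KEY`, `STATS_BEARER_TOKEN`).

```typescript
type Env = Record<string, string | undefined>;

// Exact-match keys named in this PR; the real list may be longer.
const SCRUBBED_KEYS = new Set<string>(["NOTION_API_KEY", "STATS_BEARER_TOKEN"]);

// Hypothetical stand-in: returns a copy of env without scrubbed keys,
// leaving the input object untouched.
function sanitizedSubprocessEnv(env: Env): Env {
  const clean: Env = {};
  for (const [key, value] of Object.entries(env)) {
    if (key.startsWith("LOCAL_") || SCRUBBED_KEYS.has(key)) continue;
    clean[key] = value;
  }
  return clean;
}

const childEnv = sanitizedSubprocessEnv({
  PATH: "/usr/bin",
  LOCAL_URL: "http://10.0.0.5:1234",
  LOCAL_BACKEND: "lmstudio",
  NOTION_API_KEY: "secret",
});
// childEnv keeps PATH and drops the LOCAL_* and NOTION_API_KEY entries.
```

A test along these lines would assert that each scrubbed key is absent from the result, that unrelated keys such as `PATH` survive, and that the caller's env object is not mutated.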

Anti-goals

No reversals. No SDK pin bump. No new runtime deps.

replace the ollama-specific engine path with a generic `local` engine fronted by a driver interface and two implementations: ollama (NDJSON /api/chat) and lmstudio (SSE /v1/chat/completions). hard cutover: every OLLAMA_* env var, `engine: ollama` / `tier: ollama` frontmatter value, and `/clear ollama`/`>`/`o` alias is rejected at boot or parse time with a rename hint.

key changes:
- audit model column: `ollama:<m>` → `local:<backend>:<m>` (idempotent retag migration at boot; load-bearing order: retag before sessions column rename)
- sessions.ollama_cutoff_ms → sessions.local_cutoff_ms via RENAME COLUMN
- dual-pattern reads (local:% + ollama:%) for one release cycle in outOfBandForEngine + hasLocalTurnsSince
- LOCAL_BACKEND required when LOCAL_ENABLED=true; URL default is backend-aware
- web UI pill label: `local (<backend>)`
- thinking-stub emoji: 🦙 → 💻 (backend-neutral)
- lmstudio driver: parallel_tool_calls=false + identical-(name,args) dedup (gemma-4 lmstudio-bug-tracker #1756 workaround), arg-delta accumulation across SSE chunks, usage chunk capture (inline or trailing)

post-review hardening:
- lmstudio silent-substitution detection: chunk.model mismatch (case-insensitive) throws model_missing with served-model id surfaced + `lms load` hint. closes mid-session hole probe() didn't cover.
- LOCAL_* scrubbed from SDK subprocess env (LOCAL_URL could leak network topology)
- /clear ollama|o|> returns explicit rename hint instead of silent "unknown"
- audit.tool_calls capped at 64KB to defend runaway local-model arg blobs
- ollama driver stream-catch gains instanceof LocalDriverError guard for symmetry

verification: typecheck clean, bun test 755/755 pass (+8 net new tests across local-driver, local-tools, local, db, commands). live smokes against ollama gemma4:e4b 21/21 and lmstudio gemma-4-31b-it-mlx 21/21 (pure + tools-on). migration snapshot verified on synthetic 250-row prod-like db with 84 legacy ollama:% rows + 2 sessions on the legacy column: first boot retags + renames, second boot is silent (idempotent). no SDK pin bump. no anti-goal reversals.

pre-deploy: cp data/solrac.db data/solrac.db.pre-local-migration before service restart.
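The "second boot is silent" property follows from the retag being a no-op on rows that already carry the new prefix. A minimal sketch of that mapping as a pure function, assuming legacy `ollama:<m>` rows are retagged to the `ollama` backend (the PR gives the target shape `local:<backend>:<m>` but does not spell out the backend used for legacy rows), with `retagLegacyModel` as a hypothetical name:

```typescript
const LEGACY_PREFIX = "ollama:";

// Map a legacy audit model tag to the three-segment form. Tags that do not
// start with the legacy prefix (including already-retagged ones) pass through.
function retagLegacyModel(model: string): string {
  if (!model.startsWith(LEGACY_PREFIX)) return model;
  return `local:ollama:${model.slice(LEGACY_PREFIX.length)}`;
}

const firstBoot = ["ollama:gemma4:e4b", "claude:tier:model", "local:lmstudio:gemma-4-31b-it-mlx"]
  .map(retagLegacyModel);
// Applying the mapping again changes nothing, which is the idempotence
// the migration relies on.
const secondBoot = firstBoot.map(retagLegacyModel);
```

Because a retagged value begins with `local:` rather than `ollama:`, the prefix test alone is enough to make repeated runs safe.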

cjus merged commit cdb7fe5 into main May 15, 2026
1 check passed

cjus merged 1 commit into main from carlos/solrac-local-llm-backend
