AI assistant: in-product agentic chat (merge-ready)#1
Conversation
Add the built-in AI assistant (the dashboard's AI section): a native Vue agentic chat that operates the instance end-to-end through the same tools the MCP server exposes, dispatched in-process. BYO provider keys via the embedded Bifrost LLM gateway, with per-conversation approval gating and a code-enforced confirm=true gate on destructive tools. Backend (backend/internal/ai/*, ai_handler.go, ai_chat.go, agent_registry.go): - Manager + ai/llm (Bifrost) + ai/agent loop; SSE chat/approval stream - Conversation CRUD, regenerate, edit/delete (truncate-from-seq), tool approve/reject, provider configs (keys encrypted at rest), settings - Shared regAddTool registry feeds both the MCP server and the agent Frontend (views/AI.vue, stores/ai.js, components/ai/*): - Streaming chat with live thinking, tool-call cards, code/markdown rendering, regenerate/edit/retry/delete, conversation rail, export - Index-based streaming reactivity fix; typing indicator; seamless composer - Design-system polish pass across views, shared components, focus/a11y Docs: document the AI subsystem across ARCHITECTURE.md (component map, schema, responsibilities), API.md (/api/v1/ai/*), SECURITY.md, README, DESIGN.md (--color-link), and backend/frontend CLAUDE.md.
The "No AI provider configured" banner (and the composer's disabled "Configure a provider" state) was gated on `!providers.length`, which is an empty array on every mount until `GET /ai/providers` resolves — so the warning flashed for a frame on each page load even when a provider was configured. Add a `providersLoaded` flag (set in loadProviders' finally, true even on error) and gate the banner + composer disabled state on it. Until the first fetch settles the chat sits in a neutral state (no banner, neutral placeholder, "No model" cue); Send stays guarded by modelReady so nothing sends prematurely. Rebuilt + re-embedded UI.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 97106ea557
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // re-run honours the model the user currently has selected. | ||
| function overrides() { | ||
| const o = { thinking: thinking.value } | ||
| if (selectedProvider.value) o.provider = selectedProvider.value |
There was a problem hiding this comment.
Use the selected provider config for chat
When more than one config exists for the same provider family (the schema supports this via UNIQUE(provider, label) and the UI lists rows by id), the chat override only sends p.provider (for example openai) and drops selectedProviderId. The backend then resolves that family with GetEnabledProviderConfig, which chooses the most recently updated enabled row, so selecting a different labelled endpoint/key can still send the prompt and model to another provider config; this especially breaks OpenAI-compatible/custom endpoints whose model list was loaded from the selected id but whose completion is routed elsewhere.
Useful? React with 👍 / 👎.
| seen := map[string]bool{} | ||
| var out []string | ||
| for _, c := range cfgs { | ||
| if c.Enabled && c.APIKeyEncrypted != "" && !seen[c.Provider] { |
There was a problem hiding this comment.
Include keyless local providers in the gateway config
For keyless providers such as Ollama, SaveProvider permits a config with only a base URL and Resolve explicitly treats an empty key as legitimate, but this filter omits those enabled configs from the LLM account's configured provider list. In that scenario the UI can save and select ollama, yet the gateway is initialized without that provider, so chats against the local endpoint fail before the base URL can be used.
Useful? React with 👍 / 👎.
| if err := db.write.QueryRow( | ||
| `SELECT MAX(seq) FROM ai_messages WHERE conversation_id = ?`, m.ConversationID, | ||
| ).Scan(&maxSeq); err != nil { | ||
| return err | ||
| } | ||
| m.Seq = int(maxSeq.Int64) + 1 |
There was a problem hiding this comment.
Make message sequence allocation atomic
The MAX(seq)+1 read and the subsequent insert are separate database operations, so a pair of concurrent turns/tool resumes on the same conversation can both read the same max before either insert completes. A single SQLite write connection serializes individual statements, not this read-then-write pair, leaving duplicate seq values that make message ordering and DeleteMessagesFromSeq truncation ambiguous for that conversation.
Useful? React with 👍 / 👎.
Independent review of the AI assistant (PR #1) flagged a short list of confirmed correctness/security issues before merge to main. This closes them and adds the missing tests. Blockers: - seq race: InsertMessage now assigns ai_messages.seq atomically inside the INSERT (MAX+1 subquery) instead of a racy SELECT-then-INSERT - per-conversation lock in ai.Manager (tryLockConv) serializes turns; overlapping turns are rejected (SSE error / 409 ErrConversationBusy) - admin-gate: all /api/v1/ai/* now require the admin permission (requiredPermission); this is also the IDOR resolution under the chosen admin-only shared-conversation model - gateway leak: Server.Shutdown now calls Manager.Close() (Bifrost pools) Hardening: - agent loop bails on client disconnect (ctx.Err() per iteration) so a dropped connection no longer keeps billing the provider / dispatching auto-approved tools - slog logging where AI errors were silently swallowed; hasPendingForMessage treats a DB error as still-pending instead of advancing past a gate - wire up tool-call timing fields (started/finished/duration_ms) - frontend: abort the in-flight stream + reset curIdx when switching/creating conversations (mid-stream race); remove dead tokenCount; index-safe keys Tests: - test_ai_edit.py: edit/delete/regenerate truncate-tail semantics (new) - test_ai_chat.py: assert provider key never leaks via chat SSE / detail - ai/manager_test.go: per-conversation lock primitive Docs: SECURITY/ARCHITECTURE/backend CLAUDE updated (admin-only, shared space, lock + atomic seq + Close, base-URL SSRF accepted as homelab).
Summary
Adds the in-product AI assistant (the dashboard's AI section): a native Vue agentic chat that operates the instance end-to-end through the same tools the MCP server exposes, dispatched in-process. BYO provider keys via the embedded Bifrost gateway, with per-conversation approval gating and a code-enforced
confirm=truegate on destructive tools. Plus the supporting MCP shared-registry refactor, a full e2e test suite, dashboard polish, and docs.Independent review → merge readiness
A multi-dimension independent review scored the initial branch 7.2/10 ("solid, merge after a short blocker list"). All blockers + high-value hardening are now fixed (commit
8f59b66):Blockers
ai_messages.seqis now assigned atomically inside the INSERT (was a racy SELECT-then-INSERT) + a per-conversation lock serializes turns (overlapping turns rejected with SSE error /409 CONVERSATION_BUSY)./api/v1/ai/*routes now require theadminpermission (verified: non-admin key → 403, admin → 200). This is also the IDOR resolution under the chosen admin-only, shared operator space model.Server.Shutdownnow closes the LLM gateway (Bifrost pools).Hardening
sloglogging where AI errors were swallowed;hasPendingForMessagefails safe.Tests added
test_ai_edit.py— edit/delete/regenerate truncate-tail semantics (previously untested).test_ai_chat.py— provider key never leaks via chat SSE / conversation detail.ai/manager_test.go— per-conversation lock primitive.Verification
go build/vetclean;go test -race ./...green (server, database, mcp, ai).Accepted by design (not a fix)
base_urlis intentionally unrestricted — local endpoints (Ollama, LM Studio) are a first-class homelab use case and provider config is admin-only.Known minor follow-ups (non-blocking)
GetMessageErrNoRowsmapping, SSE tail-buffer drain ondone, agent-registry test breadth, token-usage UI surfacing.