AI assistant: in-product agentic chat (merge-ready) #2
Merged
Add the built-in AI assistant (the dashboard's AI section): a native Vue agentic chat that operates the instance end-to-end through the same tools the MCP server exposes, dispatched in-process. BYO provider keys via the embedded Bifrost LLM gateway, with per-conversation approval gating and a code-enforced confirm=true gate on destructive tools.

Backend (backend/internal/ai/*, ai_handler.go, ai_chat.go, agent_registry.go):
- Manager + ai/llm (Bifrost) + ai/agent loop; SSE chat/approval stream
- Conversation CRUD, regenerate, edit/delete (truncate-from-seq), tool approve/reject, provider configs (keys encrypted at rest), settings
- Shared regAddTool registry feeds both the MCP server and the agent

Frontend (views/AI.vue, stores/ai.js, components/ai/*):
- Streaming chat with live thinking, tool-call cards, code/markdown rendering, regenerate/edit/retry/delete, conversation rail, export
- Index-based streaming reactivity fix; typing indicator; seamless composer
- Design-system polish pass across views, shared components, focus/a11y

Docs: document the AI subsystem across ARCHITECTURE.md (component map, schema, responsibilities), API.md (/api/v1/ai/*), SECURITY.md, README, DESIGN.md (--color-link), and backend/frontend CLAUDE.md.
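To make the "code-enforced confirm=true gate" concrete, here is a minimal sketch of one registry serving every caller, with the gate checked at dispatch time rather than left to the model's prompt. The `Tool`, `Registry`, `Add`, and `Dispatch` names are hypothetical; the real `regAddTool` signature is not shown in this PR text, and treating only a literal boolean `true` as confirmation is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Tool is a hypothetical registry entry, not the project's actual type.
type Tool struct {
	Name        string
	Destructive bool
	Run         func(args map[string]any) (string, error)
}

var (
	ErrUnknownTool     = errors.New("unknown tool")
	ErrConfirmRequired = errors.New("destructive tool requires confirm=true")
)

// Registry holds one declaration per tool; in this sketch both the
// MCP server and the in-process agent would call Dispatch on it.
type Registry struct{ tools map[string]Tool }

func NewRegistry() *Registry { return &Registry{tools: map[string]Tool{}} }

func (r *Registry) Add(t Tool) { r.tools[t.Name] = t }

// Dispatch enforces the confirm gate before the handler runs, so no
// caller can reach a destructive tool without passing it.
func (r *Registry) Dispatch(name string, args map[string]any) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("%w: %s", ErrUnknownTool, name)
	}
	if t.Destructive {
		// Assumption: a missing key or the string "true" does not count.
		if confirmed, _ := args["confirm"].(bool); !confirmed {
			return "", ErrConfirmRequired
		}
	}
	return t.Run(args)
}

func main() {
	reg := NewRegistry()
	reg.Add(Tool{
		Name:        "delete_item", // hypothetical tool name
		Destructive: true,
		Run:         func(map[string]any) (string, error) { return "deleted", nil },
	})

	_, err := reg.Dispatch("delete_item", map[string]any{"id": 1})
	fmt.Println("without confirm:", err)

	out, err := reg.Dispatch("delete_item", map[string]any{"id": 1, "confirm": true})
	fmt.Println("with confirm:", out, err)
}
```

Putting the check in the dispatcher means the approval UI and the system prompt can change without weakening the gate.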
The "No AI provider configured" banner (and the composer's disabled "Configure a provider" state) was gated on `!providers.length`, but `providers` is an empty array on every mount until `GET /ai/providers` resolves, so the warning flashed for a frame on each page load even when a provider was configured. Add a `providersLoaded` flag (set in the `finally` of `loadProviders`, so it is true even on error) and gate the banner + composer disabled state on it. Until the first fetch settles, the chat sits in a neutral state (no banner, neutral placeholder, "No model" cue); Send stays guarded by `modelReady` so nothing sends prematurely. Rebuilt + re-embedded UI.
Independent review of the AI assistant (PR #1) flagged a short list of confirmed correctness/security issues before merge to main. This closes them and adds the missing tests.

Blockers:
- seq race: InsertMessage now assigns ai_messages.seq atomically inside the INSERT (MAX+1 subquery) instead of a racy SELECT-then-INSERT
- per-conversation lock in ai.Manager (tryLockConv) serializes turns; overlapping turns are rejected (SSE error / 409 ErrConversationBusy)
- admin-gate: all /api/v1/ai/* now require the admin permission (requiredPermission); this is also the IDOR resolution under the chosen admin-only shared-conversation model
- gateway leak: Server.Shutdown now calls Manager.Close() (Bifrost pools)

Hardening:
- agent loop bails on client disconnect (ctx.Err() per iteration) so a dropped connection no longer keeps billing the provider / dispatching auto-approved tools
- slog logging where AI errors were silently swallowed; hasPendingForMessage treats a DB error as still-pending instead of advancing past a gate
- wire up tool-call timing fields (started/finished/duration_ms)
- frontend: abort the in-flight stream + reset curIdx when switching/creating conversations (mid-stream race); remove dead tokenCount; index-safe keys

Tests:
- test_ai_edit.py: edit/delete/regenerate truncate-tail semantics (new)
- test_ai_chat.py: assert provider key never leaks via chat SSE / detail
- ai/manager_test.go: per-conversation lock primitive

Docs: SECURITY/ARCHITECTURE/backend CLAUDE updated (admin-only, shared space, lock + atomic seq + Close, base-URL SSRF accepted as homelab).
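The per-conversation lock can be pictured as a non-blocking try-lock keyed by conversation ID: a second turn on a busy conversation is rejected immediately instead of queuing. The sketch below is one plausible shape, not the code in `ai.Manager`; the `int64` ID type, the `busy` map, the returned release function, and `RunTurn` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConversationBusy is what an overlapping turn gets back; an HTTP
// handler would map it to 409 and the SSE path to an error event.
var ErrConversationBusy = errors.New("conversation busy")

// Manager tracks which conversations currently have a turn in flight.
type Manager struct {
	mu   sync.Mutex
	busy map[int64]struct{} // assumption: conversations keyed by int64 ID
}

func NewManager() *Manager { return &Manager{busy: make(map[int64]struct{})} }

// tryLockConv claims the conversation without blocking. On success it
// returns an idempotent release function and true.
func (m *Manager) tryLockConv(id int64) (func(), bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, held := m.busy[id]; held {
		return nil, false
	}
	m.busy[id] = struct{}{}

	var once sync.Once
	return func() {
		once.Do(func() {
			m.mu.Lock()
			delete(m.busy, id)
			m.mu.Unlock()
		})
	}, true
}

// RunTurn (hypothetical) wraps a turn so the lock is always released.
func (m *Manager) RunTurn(id int64, turn func() error) error {
	release, ok := m.tryLockConv(id)
	if !ok {
		return ErrConversationBusy
	}
	defer release()
	return turn()
}

func main() {
	m := NewManager()
	err := m.RunTurn(42, func() error {
		// A second request for the same conversation arrives mid-turn.
		return m.RunTurn(42, func() error { return nil })
	})
	fmt.Println(err)
}
```

Rejecting rather than waiting keeps a stalled turn from piling up requests behind it. The atomic `seq` assignment remains the database-level guard against duplicate sequence numbers.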
…egression test

Two items surfaced by the independent final review:
- agent loop: recheck ctx.Err() at the top of runToolCall so a client disconnect also stops the remaining auto-approved tools within a single multi-tool turn (the per-iteration check already stopped the next LLM call).
- test_ai_perms.py: assert a read-only key gets 403 on GET /ai/* routes (locks the admin-gate-on-reads regression) + refresh the now-stale write-gate comments to reflect the admin gate.
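A small sketch of why the check belongs at the top of `runToolCall`: if the request context is cancelled while one tool is running, the remaining tools in the same turn are skipped rather than dispatched. Only the `ctx.Err()` check and the `runToolCall` name come from the commit message; the `ToolCall` type, `runTools`, and `simulateDisconnect` are hypothetical scaffolding.

```go
package main

import (
	"context"
	"fmt"
)

// ToolCall is a hypothetical stand-in for one tool call requested by the model.
type ToolCall struct{ Name string }

// runToolCall rechecks the context before doing any work, so a client
// disconnect stops tools that have not started yet.
func runToolCall(ctx context.Context, call ToolCall, dispatch func(ToolCall) string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	return dispatch(call), nil
}

// runTools dispatches the calls of a single turn in order and stops at
// the first one that finds the context cancelled.
func runTools(ctx context.Context, calls []ToolCall, dispatch func(ToolCall) string) ([]string, error) {
	var results []string
	for _, c := range calls {
		out, err := runToolCall(ctx, c, dispatch)
		if err != nil {
			return results, err
		}
		results = append(results, out)
	}
	return results, nil
}

// simulateDisconnect cancels the context while tool number cancelAfter
// is running and reports how many tools were actually dispatched.
func simulateDisconnect(total, cancelAfter int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	ran := 0
	dispatch := func(ToolCall) string {
		ran++
		if ran == cancelAfter {
			cancel() // stands in for the client dropping the SSE connection
		}
		return "ok"
	}
	_, err := runTools(ctx, make([]ToolCall, total), dispatch)
	return ran, err
}

func main() {
	ran, err := simulateDisconnect(5, 2)
	fmt.Println(ran, err)
}
```

In this sketch a tool that is already running is not interrupted; stopping it mid-flight would require the tool itself to honor the context.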
Summary
Adds the in-product AI assistant (the dashboard's AI section): a native Vue agentic chat that operates the instance end-to-end through the same tools the MCP server exposes, dispatched in-process. BYO provider keys via the embedded Bifrost gateway, with per-conversation approval gating and a code-enforced `confirm=true` gate on destructive tools. Includes the MCP shared-registry refactor (one declaration feeds both the external MCP server and the agent), a full e2e test suite, dashboard polish, and docs.

Three commits:
- `feat(ai)`: the assistant (backend `internal/ai/*`, frontend `views/AI.vue` + `components/ai/*`), shared registry, e2e suite, docs.
- `fix(ai)`: no-provider banner flicker on load.
- `fix(ai)`: merge readiness (review blockers + hardening, below).

Merge readiness
An independent multi-dimension review scored the branch 7.2/10 ("solid, merge after a short blocker list"). All blockers + high-value hardening are resolved:
Blockers
- `ai_messages.seq` assigned atomically inside the INSERT (was a racy SELECT-then-INSERT); a per-conversation lock serializes turns (overlap → SSE error / `409 CONVERSATION_BUSY`).
- `/api/v1/ai/*` routes require the `admin` permission (non-admin key → 403, admin → 200). This is the IDOR resolution under the admin-only, shared operator-space model.
- `Server.Shutdown` closes the LLM gateway (Bifrost pools).

Hardening
- `slog` logging where AI errors were swallowed; `hasPendingForMessage` fails safe.

Tests added
- `test_ai_edit.py`: edit/delete/regenerate truncate-tail semantics.
- `test_ai_chat.py`: provider key never leaks via chat SSE / conversation detail.
- `ai/manager_test.go`: per-conversation lock primitive.

Verification
- `go build`/`vet` clean; `go test -race ./...` green (server, database, mcp, ai).

Accepted by design
- `base_url` is unrestricted: local endpoints (Ollama, LM Studio) are a first-class homelab use case and provider config is admin-only.

Known minor follow-ups (non-blocking)
- `GetMessage` `ErrNoRows` mapping, SSE tail-buffer drain on `done`, agent-registry test breadth, token-usage UI surfacing.