AI assistant: in-product agentic chat (merge-ready)#2

Merged: Harsh-2002 merged 4 commits into main from dev on Jun 2, 2026

@Harsh-2002 (Owner)

Summary

Adds the in-product AI assistant (the dashboard's AI section): a native Vue agentic chat that operates the instance end-to-end through the same tools the MCP server exposes, dispatched in-process. BYO provider keys via the embedded Bifrost gateway, with per-conversation approval gating and a code-enforced confirm=true gate on destructive tools. Includes the MCP shared-registry refactor (one declaration feeds both the external MCP server and the agent), a full e2e test suite, dashboard polish, and docs.

Three commits:

  • feat(ai) — the assistant (backend internal/ai/*, frontend views/AI.vue + components/ai/*), shared registry, e2e suite, docs.
  • fix(ai) — no-provider banner flicker on load.
  • fix(ai) — merge-readiness: review blockers + hardening (below).

Merge readiness

An independent multi-dimension review scored the branch 7.2/10 ("solid, merge after a short blocker list"). All blockers + high-value hardening are resolved:

Blockers

  • Data correctness: ai_messages.seq assigned atomically inside the INSERT (was a racy SELECT-then-INSERT); a per-conversation lock serializes turns (overlap → SSE error / 409 CONVERSATION_BUSY).
  • Security: all /api/v1/ai/* routes require the admin permission (non-admin key → 403, admin → 200). This is the IDOR resolution under the admin-only, shared operator-space model.
  • Resource leak: Server.Shutdown closes the LLM gateway (Bifrost pools).

Hardening

  • Agent loop bails on client disconnect (no runaway billed calls / auto-tool dispatch).
  • slog logging where AI errors were swallowed; hasPendingForMessage fails safe.
  • Tool-call timing fields wired up.
  • Frontend: in-flight stream aborted on conversation switch (mid-stream race); dead code removed; index-safe keys.
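
As an illustration of the fail-safe behavior of hasPendingForMessage, here is a small Go sketch. The `pendingCounter` type and `errStoreDown` value are hypothetical stand-ins; the real function's signature and query are not shown in this PR.

```go
package main

import (
	"errors"
	"log/slog"
)

// errStoreDown is a hypothetical error used to simulate a failed lookup.
var errStoreDown = errors.New("store unavailable")

// pendingCounter is a hypothetical stand-in for the database query that
// counts undecided tool calls attached to a message.
type pendingCounter func(messageID int64) (int, error)

// hasPendingForMessage reports whether approvals are still outstanding.
// On a lookup error it logs and answers true: an unknown state is
// treated as still pending so the loop cannot advance past a gate.
func hasPendingForMessage(count pendingCounter, messageID int64) bool {
	n, err := count(messageID)
	if err != nil {
		slog.Error("pending tool-call lookup failed", "message_id", messageID, "err", err)
		return true
	}
	return n > 0
}
```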

Tests added

  • test_ai_edit.py — edit/delete/regenerate truncate-tail semantics.
  • test_ai_chat.py — provider key never leaks via chat SSE / conversation detail.
  • ai/manager_test.go — per-conversation lock primitive.

Verification

  • go build/vet clean; go test -race ./... green (server, database, mcp, ai).
  • AI e2e suite: 120+ checks pass (chat, advanced, conversations, providers, perms, settings, edit).
  • Auth: read-only key → 403 on AI routes; admin → 200.
  • Frontend: lint 0 errors, build clean, no console errors; no provider-banner flicker.
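
The admin gate verified above (read-only key → 403, admin → 200) could be expressed as HTTP middleware along these lines. This is a sketch under assumptions: `needsAdmin`, `adminOnlyAI`, and the `isAdmin` callback are hypothetical names, and the project's actual `requiredPermission` mechanism is not shown in this PR.

```go
package main

import (
	"net/http"
	"strings"
)

// needsAdmin reports whether a path falls under the AI route prefix.
func needsAdmin(path string) bool {
	return strings.HasPrefix(path, "/api/v1/ai/")
}

// adminOnlyAI wraps next and answers 403 for AI routes when the caller
// is not an admin. isAdmin is a hypothetical lookup on the request's
// API key; reads and writes are gated alike.
func adminOnlyAI(isAdmin func(r *http.Request) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if needsAdmin(r.URL.Path) && !isAdmin(r) {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```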

Accepted by design

  • Provider base_url is unrestricted — local endpoints (Ollama, LM Studio) are a first-class homelab use case and provider config is admin-only.

Known minor follow-ups (non-blocking)

GetMessage ErrNoRows mapping, SSE tail-buffer drain on done, agent-registry test breadth, token-usage UI surfacing.

Add the built-in AI assistant (the dashboard's AI section): a native Vue
agentic chat that operates the instance end-to-end through the same tools
the MCP server exposes, dispatched in-process. BYO provider keys via the
embedded Bifrost LLM gateway, with per-conversation approval gating and a
code-enforced confirm=true gate on destructive tools.

Backend (backend/internal/ai/*, ai_handler.go, ai_chat.go, agent_registry.go):
- Manager + ai/llm (Bifrost) + ai/agent loop; SSE chat/approval stream
- Conversation CRUD, regenerate, edit/delete (truncate-from-seq), tool
  approve/reject, provider configs (keys encrypted at rest), settings
- Shared regAddTool registry feeds both the MCP server and the agent

Frontend (views/AI.vue, stores/ai.js, components/ai/*):
- Streaming chat with live thinking, tool-call cards, code/markdown
  rendering, regenerate/edit/retry/delete, conversation rail, export
- Index-based streaming reactivity fix; typing indicator; seamless composer
- Design-system polish pass across views, shared components, focus/a11y

Docs: document the AI subsystem across ARCHITECTURE.md (component map,
schema, responsibilities), API.md (/api/v1/ai/*), SECURITY.md, README,
DESIGN.md (--color-link), and backend/frontend CLAUDE.md.

The "No AI provider configured" banner (and the composer's disabled
"Configure a provider" state) was gated on `!providers.length`, which is
an empty array on every mount until `GET /ai/providers` resolves — so the
warning flashed for a frame on each page load even when a provider was
configured.

Add a `providersLoaded` flag (set in loadProviders' finally, true even on
error) and gate the banner + composer disabled state on it. Until the
first fetch settles the chat sits in a neutral state (no banner, neutral
placeholder, "No model" cue); Send stays guarded by modelReady so nothing
sends prematurely. Rebuilt + re-embedded UI.

Independent review of the AI assistant (PR #1) flagged a short list of
confirmed correctness/security issues before merge to main. This closes
them and adds the missing tests.

Blockers:
- seq race: InsertMessage now assigns ai_messages.seq atomically inside
  the INSERT (MAX+1 subquery) instead of a racy SELECT-then-INSERT
- per-conversation lock in ai.Manager (tryLockConv) serializes turns;
  overlapping turns are rejected (SSE error / 409 ErrConversationBusy)
- admin-gate: all /api/v1/ai/* now require the admin permission
  (requiredPermission); this is also the IDOR resolution under the
  chosen admin-only shared-conversation model
- gateway leak: Server.Shutdown now calls Manager.Close() (Bifrost pools)

Hardening:
- agent loop bails on client disconnect (ctx.Err() per iteration) so a
  dropped connection no longer keeps billing the provider / dispatching
  auto-approved tools
- slog logging where AI errors were silently swallowed; hasPendingForMessage
  treats a DB error as still-pending instead of advancing past a gate
- wire up tool-call timing fields (started/finished/duration_ms)
- frontend: abort the in-flight stream + reset curIdx when switching/creating
  conversations (mid-stream race); remove dead tokenCount; index-safe keys

Tests:
- test_ai_edit.py: edit/delete/regenerate truncate-tail semantics (new)
- test_ai_chat.py: assert provider key never leaks via chat SSE / detail
- ai/manager_test.go: per-conversation lock primitive

Docs: SECURITY/ARCHITECTURE/backend CLAUDE updated (admin-only, shared
space, lock + atomic seq + Close, base-URL SSRF accepted as homelab).

…egression test

Two items surfaced by the independent final review:
- agent loop: recheck ctx.Err() at the top of runToolCall so a client
  disconnect also stops the remaining auto-approved tools within a single
  multi-tool turn (the per-iteration check already stopped the next LLM call).
- test_ai_perms.py: assert a read-only key gets 403 on GET /ai/* routes
  (locks the admin-gate-on-reads regression) + refresh the now-stale
  write-gate comments to reflect the admin gate.
@Harsh-2002 Harsh-2002 merged commit 5d917db into main Jun 2, 2026
13 checks passed