Skip to content

AI assistant: in-product agentic chat (merge-ready)#1

Closed
Harsh-2002 wants to merge 3 commits into
mainfrom
dev
Closed

AI assistant: in-product agentic chat (merge-ready)#1
Harsh-2002 wants to merge 3 commits into
mainfrom
dev

Conversation

@Harsh-2002
Copy link
Copy Markdown
Owner

@Harsh-2002 Harsh-2002 commented Jun 2, 2026

Summary

Adds the in-product AI assistant (the dashboard's AI section): a native Vue agentic chat that operates the instance end-to-end through the same tools the MCP server exposes, dispatched in-process. BYO provider keys via the embedded Bifrost gateway, with per-conversation approval gating and a code-enforced confirm=true gate on destructive tools. Plus the supporting MCP shared-registry refactor, a full e2e test suite, dashboard polish, and docs.

Independent review → merge readiness

A multi-dimension independent review scored the initial branch 7.2/10 ("solid, merge after a short blocker list"). All blockers + high-value hardening are now fixed (commit 8f59b66):

Blockers

  • Data correctness: ai_messages.seq is now assigned atomically inside the INSERT (was a racy SELECT-then-INSERT) + a per-conversation lock serializes turns (overlapping turns rejected with SSE error / 409 CONVERSATION_BUSY).
  • Security: all /api/v1/ai/* routes now require the admin permission (verified: non-admin key → 403, admin → 200). This is also the IDOR resolution under the chosen admin-only, shared operator space model.
  • Resource leak: Server.Shutdown now closes the LLM gateway (Bifrost pools).

Hardening

  • Agent loop bails on client disconnect (no runaway billed calls / auto-tool dispatch).
  • slog logging where AI errors were swallowed; hasPendingForMessage fails safe.
  • Tool-call timing fields wired up.
  • Frontend: in-flight stream aborted on conversation switch (mid-stream race); dead code removed; index-safe keys.

Tests added

  • test_ai_edit.py — edit/delete/regenerate truncate-tail semantics (previously untested).
  • test_ai_chat.py — provider key never leaks via chat SSE / conversation detail.
  • ai/manager_test.go — per-conversation lock primitive.

Verification

  • go build/vet clean; go test -race ./... green (server, database, mcp, ai).
  • AI e2e suite: 120+ checks pass (chat, advanced, conversations, providers, perms, settings, edit).
  • Auth: read-only key → 403 on AI routes; admin → 200.
  • Frontend: lint 0 errors, build clean, no console errors; no provider-banner flicker.

Accepted by design (not a fix)

  • Provider base_url is intentionally unrestricted — local endpoints (Ollama, LM Studio) are a first-class homelab use case and provider config is admin-only.

Known minor follow-ups (non-blocking)

GetMessage ErrNoRows mapping, SSE tail-buffer drain on done, agent-registry test breadth, token-usage UI surfacing.

Add the built-in AI assistant (the dashboard's AI section): a native Vue
agentic chat that operates the instance end-to-end through the same tools
the MCP server exposes, dispatched in-process. BYO provider keys via the
embedded Bifrost LLM gateway, with per-conversation approval gating and a
code-enforced confirm=true gate on destructive tools.

Backend (backend/internal/ai/*, ai_handler.go, ai_chat.go, agent_registry.go):
- Manager + ai/llm (Bifrost) + ai/agent loop; SSE chat/approval stream
- Conversation CRUD, regenerate, edit/delete (truncate-from-seq), tool
  approve/reject, provider configs (keys encrypted at rest), settings
- Shared regAddTool registry feeds both the MCP server and the agent

Frontend (views/AI.vue, stores/ai.js, components/ai/*):
- Streaming chat with live thinking, tool-call cards, code/markdown
  rendering, regenerate/edit/retry/delete, conversation rail, export
- Index-based streaming reactivity fix; typing indicator; seamless composer
- Design-system polish pass across views, shared components, focus/a11y

Docs: document the AI subsystem across ARCHITECTURE.md (component map,
schema, responsibilities), API.md (/api/v1/ai/*), SECURITY.md, README,
DESIGN.md (--color-link), and backend/frontend CLAUDE.md.
The "No AI provider configured" banner (and the composer's disabled
"Configure a provider" state) was gated on `!providers.length`, which is
an empty array on every mount until `GET /ai/providers` resolves — so the
warning flashed for a frame on each page load even when a provider was
configured.

Add a `providersLoaded` flag (set in loadProviders' finally, true even on
error) and gate the banner + composer disabled state on it. Until the
first fetch settles the chat sits in a neutral state (no banner, neutral
placeholder, "No model" cue); Send stays guarded by modelReady so nothing
sends prematurely. Rebuilt + re-embedded UI.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 97106ea557

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread frontend/src/stores/ai.js
// re-run honours the model the user currently has selected.
function overrides() {
const o = { thinking: thinking.value }
if (selectedProvider.value) o.provider = selectedProvider.value
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use the selected provider config for chat

When more than one config exists for the same provider family (the schema supports this via UNIQUE(provider, label) and the UI lists rows by id), the chat override only sends p.provider (for example openai) and drops selectedProviderId. The backend then resolves that family with GetEnabledProviderConfig, which chooses the most recently updated enabled row, so selecting a different labelled endpoint/key can still send the prompt and model to another provider config; this especially breaks OpenAI-compatible/custom endpoints whose model list was loaded from the selected id but whose completion is routed elsewhere.

Useful? React with 👍 / 👎.

seen := map[string]bool{}
var out []string
for _, c := range cfgs {
if c.Enabled && c.APIKeyEncrypted != "" && !seen[c.Provider] {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Include keyless local providers in the gateway config

For keyless providers such as Ollama, SaveProvider permits a config with only a base URL and Resolve explicitly treats an empty key as legitimate, but this filter omits those enabled configs from the LLM account's configured provider list. In that scenario the UI can save and select ollama, yet the gateway is initialized without that provider, so chats against the local endpoint fail before the base URL can be used.

Useful? React with 👍 / 👎.

Comment thread backend/internal/database/ai_chat.go Outdated
Comment on lines +239 to +244
if err := db.write.QueryRow(
`SELECT MAX(seq) FROM ai_messages WHERE conversation_id = ?`, m.ConversationID,
).Scan(&maxSeq); err != nil {
return err
}
m.Seq = int(maxSeq.Int64) + 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Make message sequence allocation atomic

The MAX(seq)+1 read and the subsequent insert are separate database operations, so a pair of concurrent turns/tool resumes on the same conversation can both read the same max before either insert completes. A single SQLite write connection serializes individual statements, not this read-then-write pair, leaving duplicate seq values that make message ordering and DeleteMessagesFromSeq truncation ambiguous for that conversation.

Useful? React with 👍 / 👎.

Independent review of the AI assistant (PR #1) flagged a short list of
confirmed correctness/security issues before merge to main. This closes
them and adds the missing tests.

Blockers:
- seq race: InsertMessage now assigns ai_messages.seq atomically inside
  the INSERT (MAX+1 subquery) instead of a racy SELECT-then-INSERT
- per-conversation lock in ai.Manager (tryLockConv) serializes turns;
  overlapping turns are rejected (SSE error / 409 ErrConversationBusy)
- admin-gate: all /api/v1/ai/* now require the admin permission
  (requiredPermission); this is also the IDOR resolution under the
  chosen admin-only shared-conversation model
- gateway leak: Server.Shutdown now calls Manager.Close() (Bifrost pools)

Hardening:
- agent loop bails on client disconnect (ctx.Err() per iteration) so a
  dropped connection no longer keeps billing the provider / dispatching
  auto-approved tools
- slog logging where AI errors were silently swallowed; hasPendingForMessage
  treats a DB error as still-pending instead of advancing past a gate
- wire up tool-call timing fields (started/finished/duration_ms)
- frontend: abort the in-flight stream + reset curIdx when switching/creating
  conversations (mid-stream race); remove dead tokenCount; index-safe keys

Tests:
- test_ai_edit.py: edit/delete/regenerate truncate-tail semantics (new)
- test_ai_chat.py: assert provider key never leaks via chat SSE / detail
- ai/manager_test.go: per-conversation lock primitive

Docs: SECURITY/ARCHITECTURE/backend CLAUDE updated (admin-only, shared
space, lock + atomic seq + Close, base-URL SSRF accepted as homelab).
@Harsh-2002 Harsh-2002 changed the title Dev AI assistant: in-product agentic chat (merge-ready) Jun 2, 2026
@Harsh-2002 Harsh-2002 closed this Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant