Skip to content

feat: make GLM 5.2 the default Ollama Cloud model#175

Merged
ankitvgupta merged 2 commits into
mainfrom
ankitvgupta/kimi-k2-ollama-default
Jun 16, 2026
Merged

feat: make GLM 5.2 the default Ollama Cloud model#175
ankitvgupta merged 2 commits into
mainfrom
ankitvgupta/kimi-k2-ollama-default

Conversation

@ankitvgupta

@ankitvgupta ankitvgupta commented Jun 16, 2026

Copy link
Copy Markdown
Owner

Summary

Switches the default Ollama Cloud model (DEFAULT_OLLAMA_MODEL) from kimi-k2.6:cloud to glm-5.2:cloud (z.ai, 744B MoE, 1M context — launched June 2026, live on Ollama Cloud).

This is the single source of truth for the Ollama default: it flows to the agent worker (resolveAgentOllamaConfig), per-feature resolution (getFeatureModelConfig), the OllamaCloudConfigSchema default, the Setup Wizard, and the Extensions/Settings model-selector placeholders. No other code changes are required — the UI selectors are free-text inputs that reference the constant.

Why

Evaluated against the prior candidates with a 16-task agent-sidebar benchmark run through the real app (test inbox, real LLM calls), interleaved per-task (both models back-to-back, alternating order), warm, with auto-draft disabled so the single-threaded agent worker was dedicated.

Speed (within-run; absolute times vary across sessions from cloud load, so only within-run comparisons are valid):

  • GLM 5.2 vs kimi-k2.7-code: GLM faster on 10/16 then 13/16 across two runs; median 23.4s vs 36.0s in the corrected run. Both 16/16 completed.
  • (For reference, an earlier run established kimi-k2.7-code ≈2× faster than the old kimi-k2.6 default, so GLM 5.2 ≥ k2.7 ≫ k2.6 on speed.)

Quality: parity on final-turn answers (GLM slightly more thorough on extraction). Both produce natural, grounded drafts.

Integration correctness: verified via a direct compat-endpoint probe that both models emit reasoning in proper thinking blocks on Ollama's /v1/messages — chain-of-thought is not leaking into drafts. (An initial benchmark wrongly penalized GLM because the harness concatenated assistant text across all agent turns, counting GLM's brief inter-turn progress lines as "verbosity"; fixed to score the final answer only.)

Changes

  • src/shared/types.ts: DEFAULT_OLLAMA_MODEL = "glm-5.2:cloud" + updated rationale comment.

Scope & caveats

  • The rigorous benchmarking was on the agent sidebar (chat + drafter). Non-agent Ollama features (analysis, draft-generator, calendaring, archive-ready, sender-lookup) inherit this default; they run via the native /api/chat path with think:true, which keeps CoT out of parsed JSON the same way kimi-k2.6 did — but they were not separately benchmarked. A follow-up eval of those features on GLM 5.2 is worth doing.
  • GLM emits brief inter-turn progress text in agent loops (shows as status lines in the sidebar). It's harmless and suppressible via the agent system prompt if we decide it's noise.
  • Ollama Cloud usage is opt-in and $0 to the user (subscription); this default only applies once a user routes a feature to Ollama.

🤖 Generated with Claude Code


Open in Devin Review

Pre-PR verdict: PASS

  • mode: full
  • sha: 226806d
  • generated: 2026-06-16T21:21:58.464Z
Phase Status Duration
eval:analyzer ✅ exit 0 13.8s
eval:features ✅ exit 0 31.5s
agentic-verify ✅ exit 0 180.2s
real-gmail:cached ✅ exit 0 8.5s

Switch DEFAULT_OLLAMA_MODEL from kimi-k2.6:cloud to glm-5.2:cloud.

Validated with a 16-task agent-sidebar benchmark (real test inbox, interleaved,
warm, auto-draft disabled) against kimi-k2.7-code:cloud. GLM 5.2 was the faster
model in both runs (faster on 10/16 then 13/16) at parity final-answer quality.
Both models clearly beat the prior kimi-k2.6 default. Reasoning is returned in
proper `thinking` blocks on Ollama's Anthropic-compat endpoint, so chain-of-
thought does not leak into drafts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

Open in Devin Review

@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Copy link
Copy Markdown

Greptile Summary

This PR changes the default Ollama Cloud model from kimi-k2.6:cloud to glm-5.2:cloud (z.ai, 744B MoE, 1M context), backed by a 16-task agent-sidebar benchmark showing speed improvements at equivalent quality. It also repairs the eval harness in four test files, which were constructing service instances without a model argument and therefore falling back to the hardcoded legacy default claude-sonnet-4-20250514 — a retired model that now returns 404.

  • src/shared/types.ts: Single-constant change updating DEFAULT_OLLAMA_MODEL and expanding the rationale comment; the new value flows automatically to OllamaCloudConfigSchema, resolveAgentOllamaConfig, getFeatureModelConfig, the Setup Wizard, and the model-selector placeholders.
  • Eval harness fixes (runner.ts, archive-ready-analyzer.ts, calendaring-agent.ts, draft-generator.ts): Services are now constructed with resolveModelId(DEFAULT_MODEL_CONFIG.<feature>) so evals always exercise the same Anthropic model tier the app ships with, preventing silent 404 failures on retired model IDs.

Confidence Score: 5/5

Safe to merge — the change is a single constant string swap in a well-isolated location, and the eval fixes are correct improvements that prevent silent failures on retired model IDs.

The core change is a one-line constant update that flows automatically to all callsites. The eval harness fixes are straightforward and correct — each service constructor is now given the live model ID rather than the retired hardcoded default. No production logic paths are altered beyond the new default model name.

No files require special attention. All changes are narrow, well-commented, and consistent with the existing pattern.

Important Files Changed

Filename Overview
src/shared/types.ts Single-constant update: DEFAULT_OLLAMA_MODEL changed from kimi-k2.6:cloud to glm-5.2:cloud with an expanded rationale comment. Change is isolated and flows to all dependent callsites automatically.
tests/evals/runner.ts Fixes a latent bug: EmailAnalyzer is now constructed with the resolved current-tier model ID instead of the retired claude-sonnet-4-20250514 default that causes 404s.
tests/evals/features/archive-ready-analyzer.ts Same eval-harness fix as runner.ts applied to ArchiveReadyAnalyzer; model now resolved from DEFAULT_MODEL_CONFIG.archiveReady.
tests/evals/features/calendaring-agent.ts Same eval-harness fix applied to CalendaringAgent; model resolved from DEFAULT_MODEL_CONFIG.calendaring.
tests/evals/features/draft-generator.ts Same eval-harness fix applied to DraftGenerator; model resolved from DEFAULT_MODEL_CONFIG.drafts.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["DEFAULT_OLLAMA_MODEL = glm-5.2:cloud"] --> B[OllamaCloudConfigSchema defaultModel]
    A --> C[resolveAgentOllamaConfig]
    A --> D[getFeatureModelConfig]
    A --> E[Setup Wizard placeholder]
    A --> F[Model-selector placeholder]

    G["DEFAULT_MODEL_CONFIG + resolveModelId"] --> H[EmailAnalyzer eval]
    G --> I[ArchiveReadyAnalyzer eval]
    G --> J[CalendaringAgent eval]
    G --> K[DraftGenerator eval]

    C --> L[Agent path: /v1/messages thinking blocks]
    D --> M[Non-agent path: /api/chat think=true]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A["DEFAULT_OLLAMA_MODEL = glm-5.2:cloud"] --> B[OllamaCloudConfigSchema defaultModel]
    A --> C[resolveAgentOllamaConfig]
    A --> D[getFeatureModelConfig]
    A --> E[Setup Wizard placeholder]
    A --> F[Model-selector placeholder]

    G["DEFAULT_MODEL_CONFIG + resolveModelId"] --> H[EmailAnalyzer eval]
    G --> I[ArchiveReadyAnalyzer eval]
    G --> J[CalendaringAgent eval]
    G --> K[DraftGenerator eval]

    C --> L[Agent path: /v1/messages thinking blocks]
    D --> M[Non-agent path: /api/chat think=true]
Loading

Reviews (2): Last reviewed commit: "fix(evals): eval harness used retired de..." | Re-trigger Greptile

…20250514)

The eval runners constructed services with no model argument, falling back to
the retired legacy constructor default `claude-sonnet-4-20250514`, which now
404s — breaking `npm run eval` (and the pre-pr eval gate) for everyone. The app
itself is unaffected because it passes the resolved live model.

Pass each eval service the app's actual default model via
resolveModelId(DEFAULT_MODEL_CONFIG[feature]) so the eval tracks real default
resolution and can't rot when a model id is retired:
- runner.ts (analyzer)
- features/draft-generator.ts
- features/calendaring-agent.ts
- features/archive-ready-analyzer.ts

Verified: analyzer eval 10/10 no regressions; feature evals pass with +0.0
deltas vs baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ankitvgupta

Copy link
Copy Markdown
Owner Author

✅ Pre-PR verification — PASS

  • mode: full
  • sha: 226806d
  • generated: 2026-06-16T21:22:00.802Z
Phase Status Duration
eval:analyzer ✅ exit 0 13.8s
eval:features ✅ exit 0 31.5s
agentic-verify ✅ exit 0 180.2s
real-gmail:cached ✅ exit 0 8.5s
Agentic verification — summary

Agentic verification — verify-diff

  • SHA: 226806d
  • Verdict: pass
  • Anomalies: 0
  • Actions: 21 (ToolSearch×3, mcp__chrome-devtools__list_pages×1, mcp__chrome-devtools__select_page×1, mcp__chrome-devtools__take_snapshot×3, mcp__chrome-devtools__take_screenshot×4, mcp__chrome-devtools__evaluate_script×4, mcp__chrome-devtools__click×4, mcp__chrome-devtools__press_key×1)
  • Cost: $0.6222
  • Turns: 22

Summary

PR makes two categories of changes: (1) switches DEFAULT_OLLAMA_MODEL from kimi-k2.6:cloud to glm-5.2:cloud in types.ts — only affects Ollama Cloud users, invisible in Anthropic/demo mode; (2) updates eval test constructors to pass explicit model IDs instead of relying on no-arg defaults — test infrastructure change with no UI impact. Verified: app loads cleanly, email detail view opens with draft content and sender profile panel, Settings panel with all tabs (General AI Models, Agents) opens correctly, back-navigation via Escape returns to inbox list with expected sticky selection state. No JS errors, no layout breaks, no stuck states observed.

Agentic verification — literal trace
[2026-06-16T21:18:49.888Z] Auto-selected CDP port: 9223
[2026-06-16T21:18:49.888Z] mode=verify-diff sha=226806d action_budget=40 budget_usd=1.5
[2026-06-16T21:18:49.962Z] data mode: demo (diff is UI/scripts/tests only)
[2026-06-16T21:18:49.962Z] diff base=44c130bc21668126fb802d7c332d38a79a239984 5 files changed, 21 insertions(+), 10 deletions(-)
[2026-06-16T21:18:49.962Z] changed files:
src/shared/types.ts
tests/evals/features/archive-ready-analyzer.ts
tests/evals/features/calendaring-agent.ts
tests/evals/features/draft-generator.ts
tests/evals/runner.ts
[2026-06-16T21:18:49.962Z] Launching Electron in demo mode with --remote-debugging-port=9223...
[2026-06-16T21:19:07.307Z] CDP ready. Spawning agent...
[2026-06-16T21:19:11.016Z] session init — chrome-devtools tools: 29
[2026-06-16T21:19:22.417Z] tool#1: ToolSearch
[2026-06-16T21:19:22.418Z]   input: {"query":"select:mcp__chrome-devtools__list_pages,mcp__chrome-devtools__select_page,mcp__chrome-devtools__take_screenshot,mcp__chrome-devtools__evaluate_script,mcp__chrome-devtools__take_snapshot","max_results":5}
[2026-06-16T21:19:22.424Z] result#1:
  {"type":"tool_reference","tool_name":"mcp__chrome-devtools__list_pages"}
  {"type":"tool_reference","tool_name":"mcp__chrome-devtools__select_page"}
  {"type":"tool_reference","tool_name":"mcp__chrome-devtools__take_screenshot"}
  {"type":"tool_reference","tool_name":"mcp__chrome-devtools__evaluate_script"}
  {"type":"tool_reference","tool_name":"mcp__chrome-devtools__take_snapshot"}
[2026-06-16T21:19:24.989Z] tool#2: mcp__chrome-devtools__list_pages
[2026-06-16T21:19:24.989Z]   input: {}
[2026-06-16T21:19:25.109Z] result#2:
  ## Pages
  1: http://localhost:5173/ [selected]
[2026-06-16T21:19:27.056Z] tool#3: mcp__chrome-devtools__select_page
[2026-06-16T21:19:27.056Z]   input: {"pageId":1}
[2026-06-16T21:19:27.075Z] result#3:
  ## Pages
  1: http://localhost:5173/ [selected]
[2026-06-16T21:19:28.838Z] tool#4: mcp__chrome-devtools__take_snapshot
[2026-06-16T21:19:28.838Z]   input: {}
[2026-06-16T21:19:28.868Z] result#4:
  ## Latest page snapshot
  uid=1_0 RootWebArea "Exo" url="http://localhost:5173/"
    uid=1_1 heading "Exo" level="1"
    uid=1_2 button "me@example.com"
    uid=1_3 button "Search" description="Search (/)"
    uid=1_4 button "Compose" description="Compose (C)"
    uid=1_5 button "Settings" description="Settings"
    uid=1_6 button "Refresh" description="Refresh"
    uid=1_7 button "Inbox"
    uid=1_8 button "Sent"
    uid=1_9 button "Density: Compact" description="Density: Compact"
    uid=1_10 button "Priority12"
    uid=1_11 button "Other7"
    uid=1_12 button "Archive Ready 6"
    uid=1_13 button "Drafts 8"
    uid=1_14 button "Snoozed 2"
    uid=1_15 button "All19"
    uid=1_16 button "On-Call URGENT: Production issue affecting checkout flow Draft Joining the incident channel now. Quick context: The payment timeout issue might be related to the 15m"
    uid=1_17 button "Diana Hu Meeting Follow-up: Q4 Planning - Action Items Draft Hi Diana, Thanks for capturing these. Quick responses: 1. Budget proposal - I'll review and send c 30m"
    uid=1_18 button "Gustaf Alströmer Can you intro me to a few folks? Draft yup i'm on it 1h"
    uid=1_19 button "Dalton Caldwell friday? yo you free friday? was thinking we grab tacos... 2h"
    uid=1_20 button "Gustaf Alströmer Quick question about API rate limits Draft Hey Gustaf, Happy to help! Here are answers to your questions: 1. **Rate limit increase**: You can 3h"
    uid=1_21 button "Surbhi Sarna Landing Page Mockups - Inline Images Here are the design mockups for the new landing page... 3h"
    uid=1_22 button "Dr. Geoff Ralston Request for Strategic Advisory Input - FY2026 Planning I am writing to request your input on several strategic matters... 3h"
    uid=1_23 button "Kevin Hale Intro: Ankit <> Tim Brady (AI infrastructure) Draft Hi Tim, Great to e-meet you! Kevin has told me great things about what you're building at AIStack. 3h"
    uid=1_24 button "HR Team Interview Scheduling - Senior Engineer Candidate Draft Hi, I can do the Wednesday 10am-11am slot. Yes, please send over Jordan's resume - I'd like to rev 5h"
    uid=1_25 button "Nicolas Dessaigne Launch Readiness Review - v2.0 Release Thanks for the updates. We need your input on tooltip copy... 6h 6"
    uid=1_26 button "David Lieb Partnership technical requirements — need your input Draft Hi David, Great questions — here's a quick rundown: 1. **API throughput**: We're currently handlin 6h"
    uid=1_27 button "David Lieb Meeting to discuss partnership — finding a time Wednesday at 10am PT works perfectly! Could we do it over Zoom? 1d 4"
    uid=1_28 StaticText "Select an email to see details"
    uid=1_29 StaticText "Use j/k to navigate, Cmd+J for agent"
    uid=1_30 StaticText "j/k"
    uid=1_31 StaticText "navigate"
    uid=1_32 StaticText "Enter"
    uid=1_33 StaticText "open"
    uid=1_34 StaticText "r"
    uid=1_35 StaticText "reply"
    uid=1_36 StaticText "e"
    uid=1_37 StaticText "archive"
    uid=1_38 StaticText "u"
    uid=1_39 StaticText "unread"
    uid=1_40 StaticText "x"
    uid=1_41 StaticText "select"
    uid=1_42 StaticText "c"
    uid=1_43 StaticText "compose"
    uid=1_44 StaticText "/"
    uid=1_45 StaticText "search"
    uid=1_46 StaticText "b"
    uid=1_47 StaticText "sidebar"
    uid=1_48 StaticText "⌘K"
    uid=1_49 StaticText "commands"
  
[2026-06-16T21:19:43.862Z] text: The app is running in demo mode with emails loaded. The PR changes are primarily in eval test files with one app-level change: `DEFAULT_OLLAMA_MODEL` in `types.ts` changing from `kimi-k2.6:cloud` to `glm-5.2:cloud`. Let me verify the settings panel shows the correct new default, then do a basic email flow check.
[2026-06-16T21:19:43.865Z] tool#5: mcp__chrome-devtools__take_screenshot
[2026-06-16T21:19:43.865Z]   input: {}
[2026-06-16T21:19:44.540Z] result#5:
  Took a screenshot of the current page's viewport.
  {"type":"image","source":{"type":"base64","media_type":"image/jpeg","data":"/9j/4AAQSkZJRgABAgAAAQABAAD/wAARCAU1B88DAREAAhEBAxEB/9sAQwADAgIDAgIDAwMDBAMDBAUIBQUEBAUKBwcGCAwKDAwLCgsLDQ4SEA0OEQ4LCxAWEBETFBUVFQwPFxgWFBgSFBUU/9sAQwEDBAQFBAUJBQUJFA0LDRQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQU/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD9U6ACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKAI5ZkhXLHHtQBWOpoDwjGnYVxP7UX/nm350WC4f2ov8Azzb86LBcP7UX/nm350WC4f2ov/PNvzosFw/tRf8Anm350WC4f2ov/PNvzosFw/tRf+ebfnRYLh/ai/8APNvzosFw/tRf+ebfnRYLh/ai/wDPNvzosFw/tRf+ebfnRYLh/ai/882/OiwXD+1F/wCebfnRYLh/ai/882/OiwXD+1F/55t+dFguH9qL/wA82/OiwXD+1F/55t+dFguH9qL/AM82/OiwXD+1F/55t+dFguH9qL/zzb86LBcP7UX/AJ5t+dFguH9qL/zzb86LBcP7UX/nm350WC4f2ov/ADzb86LBcP7UX/nm350WC4f2ov8Azzb86LBcP7UX/nm350WC4f2ov/PNvzosFw/tRf8Anm350WC4f2ov/PNvzosFw/tRf+ebfnRYLh/ai/8APNvzosFw/tRf+ebfnRYLh/ai/wDPNvzosFxy6mhPKstFguWkcSLlTkHuKQx1ABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAYfjHxrofgDQptY8QalBpenw9ZZj949lUDlmPoATXZhMHiMfVVDDQcpPt/WiMataFCPPUdkfOOtf8FBvCNneNHpvh7WdTgHHnkxQZ+isc4r9Do8BY2cb1asYvtq/wAjwp53RT92Lf3Iz/8Ah4hoX/Qm6v8A+BUH+Nb/AOoGI/6CI/dIj+3Kf8j+9B/w8Q0L/oTdX/8AAqD/ABo/1AxH/QRH7pB/blP+R/eg/wCHiGhf9Cbq/wD4FQf40f6gYj/oIj90g/tyn/I/vQf8PENC/wChN1f/AMCoP8aP9QMR/wBBEfukH9uU/wCR/eg/4eIaF/0Jur/+BUH+NH+oGI/6CI/dIP7cp/yP70H/AA8Q0L/oTdX/APAqD/Gj/UDEf9BEfukH9uU/5H96D/h4hoX/AEJur/8AgVB/jR/qBiP+giP3SD+3Kf8AI/vQf8PENC/6E3V//AqD/Gj/AFAxH/QRH7pB/blP+R/eg/4eIaF/0Jur/wDgVB/jR/qBiP8AoIj90g/tyn/I/vQf8PENC/6E3V//AAKg/wAaP9QMR/0ER+6Qf25T/kf3oP8Ah4hoX/Qm6v8A+BUH+NH+oGI/6CI/dIP7cp/yP70H/DxDQv8AoTdX/wDAqD/Gj/UDEf8AQRH7pB/blP8Akf3oP+HiGhf9Cbq//gVB/jR/qBiP+giP3SD+3Kf8j+9B/wAPENC/6E3V/wDwKg/xo/1AxH/QRH7pB/blP+R/eg/4eIaF/wBCbq//AIFQf40f6gYj/oIj90g/tyn/ACP70H/DxDQv+hN1f/wKg/xo/wBQMR/0ER+6Qf25T/kf3oP+HiGhf9Cbq/8A4FQf40f6gYj/AKCI/dIP7cp/yP70H/DxDQv+hN1f/wACoP8AGj/UDEf9BEfukH9uU/5H96JLf/god4deZRN4R1iKMn5nE8DY/DPNTLgDEpaV4/dIazyn1g/wPdfhV8dfB/xjtZH8Pajuu4Rum0+6XyrmIepQ9R7rkV8TmmSY3KJJYmGj2a1T+f6M9jDYyjil+7evbqeg14J2hQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAITgEntQBjTSmaQsfwHpVEkdMQUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFAFmxmMcwX+FuKTGjVqSgoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAo …[truncated, 428617 more chars]
[2026-06-16T21:19:50.917Z] tool#6: mcp__chrome-devtools__evaluate_script
[2026-06-16T21:19:50.927Z]   input: {"function":"() => {\n  // Check for any JS errors in the console\n  return {\n    consoleErrors: window.__exoErrors__ ?? null,\n    defaultOllamaModel: window.__DEFAULT_OLLAMA_MODEL__ ?? null\n  };\n}"}
[2026-06-16T21:19:51.162Z] result#6:
  Script ran on page and returned:
  ```json
  {"consoleErrors":null,"defaultOllamaModel":null}

[2026-06-16T21:19:54.129Z] text: Let me click on an email to test basic flow, then check Settings for the Ollama model default.
[2026-06-16T21:19:55.479Z] tool#7: mcp__chrome-devtools__evaluate_script
[2026-06-16T21:19:55.479Z] input: {"function":"() => {\n // Click first email row\n const firstEmail = document.querySelector('[data-thread-id]');\n if (firstEmail) {\n firstEmail.click();\n return { clicked: true, threadId: firstEmail.getAttribute('data-thread-id') };\n }\n return { clicked: false };\n}"}
[2026-06-16T21:19:55.692Z] result#7:
Script ran on page and returned:

{"clicked":true,"threadId":"thread-bug-report"}

[2026-06-16T21:19:58.843Z] tool#8: mcp__chrome-devtools__take_screenshot
[2026-06-16T21:19:58.843Z] input: {}
[2026-06-16T21:19:59.032Z] result#8:
Took a screenshot of the current page's viewport.
{"type":"image","source":{"type":"base64","media_type":"image/jpeg","data":"/9j/4AAQSkZJRgABAgAAAQABAAD/wAARCAU1B88DAREAAhEBAxEB/9sAQwADAgIDAgIDAwMDBAMDBAUIBQUEBAUKBwcGCAwKDAwLCgsLDQ4SEA0OEQ4LCxAWEBETFBUVFQwPFxgWFBgSFBUU/9sAQwEDBAQFBAUJBQUJFA0LDRQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQU/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD9U6ACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKAI5ZkhXLHHtQBWOpoDwjGnYVxP7UX/nm350WC4f2ov8Azzb86LBcP7UX/nm350WC4f2ov/PNvzosFw/tRf8Anm350WC4f2ov/PNvzosFw/tRf+ebfnRYLh/ai/8APNvzosFw/tRf+ebfnRYLh/ai/wDPNvzosFw/tRf+ebfnRYLh/ai/882/OiwXD+1F/wCebfnRYLh/ai/882/OiwXD+1F/55t+dFguH9qL/wA82/OiwXD+1F/55t+dFguH9qL/AM82/OiwXD+1F/55t+dFguH9qL/zzb86LBcP7UX/AJ5t+dFguH9qL/zzb86LBcP7UX/nm350WC4f2ov/ADzb86LBcP7UX/nm350WC4f2ov8Azzb86LBcP7UX/nm350WC4f2ov/PNvzosFw/tRf8Anm350WC4f2ov/PNvzosFw/tRf+ebfnRYLh/ai/8APNvzosFw/tRf+ebfnRYLh/ai/wDPNvzosFxy6mhPKstFguWkcSLlTkHuKQx1ABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAYfjHxrofgDQptY8QalBpenw9ZZj949lUDlmPoATXZhMHiMfVVDDQcpPt/WiMataFCPPUdkfOOtf8FBvCNneNHpvh7WdTgHHnkxQZ+isc4r9Do8BY2cb1asYvtq/wAjwp53RT92Lf3Iz/8Ah4hoX/Qm6v8A+BUH+Nb/AOoGI/6CI/dIj+3Kf8j+9B/w8Q0L/oTdX/8AAqD/ABo/1AxH/QRH7pB/blP+R/eg/wCHiGhf9Cbq/wD4FQf40f6gYj/oIj90g/tyn/I/vQf8PENC/wChN1f/AMCoP8aP9QMR/wBBEfukH9uU/wCR/eg/4eIaF/0Jur/+BUH+NH+oGI/6CI/dIP7cp/yP70H/AA8Q0L/oTdX/APAqD/Gj/UDEf9BEfukH9uU/5H96D/h4hoX/AEJur/8AgVB/jR/qBiP+giP3SD+3Kf8AI/vQf8PENC/6E3V//AqD/Gj/AFAxH/QRH7pB/blP+R/eg/4eIaF/0Jur/wDgVB/jR/qBiP8AoIj90g/tyn/I/vQf8PENC/6E3V//AAKg/wAaP9QMR/0ER+6Qf25T/kf3oP8Ah4hoX/Qm6v8A+BUH+NH+oGI/6CI/dIP7cp/yP70H/DxDQv8AoTdX/wDAqD/Gj/UDEf8AQRH7pB/blP8Akf3oP+HiGhf9Cbq//gVB/jR/qBiP+giP3SD+3Kf8j+9B/wAPENC/6E3V/wDwKg/xo/1AxH/QRH7pB/blP+R/eg/4eIaF/wBCbq//AIFQf40f6gYj/oIj90g/tyn/ACP70H/DxDQv+hN1f/wKg/xo/wBQMR/0ER+6Qf25T/kf3oP+HiGhf9Cbq/8A4FQf40f6gYj/AKCI/dIP7cp/yP70H/DxDQv+hN1f/wACoP8AGj/UDEf9BEfukH9uU/5H96JLf/god4deZRN4R1iKMn5nE8DY/DPNTLgDEpaV4/dIazyn1g/wPdfhV8dfB/xjtZH8Pajuu4Rum0+6XyrmIepQ9R7rkV8TmmSY3KJJYmGj2a1T+f6M9jDYyjil+7evbqeg14J2hQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAITgEntQBjTSmaQsfwHpVEkdMQUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFAFmxmMcwX+FuKTGjVqSgoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAo …[truncated, 428617 more chars]
[2026-06-16T21:20:03.682Z] tool#9: mcp__chrome-devtools__evaluate_script
[2026-06-16T21:20:03.682Z] input: {"function":"() => {\n // Try clicking with a proper mouse event \n const firstEmail = document.querySelector('[data-thread-id]');\n if (firstEmail) {\n const rect = firstEmail.getBoundingClientRect();\n return { top: rect.top, left: rect.left, width: rect.width, height: rect.height, tag: firstEmail.tagName };\n }\n return null;\n}"}
[2026-06-16T21:20:03.896Z] result#9:
Script ran on page and returned:

{"top":128,"left":0,"width":879,"height":32,"tag":"DIV"}

[2026-06-16T21:20:06.649Z] tool#10: ToolSearch
[2026-06-16T21:20:06.649Z] input: {"query":"select:mcp__chrome-devtools__click","max_results":1}
[2026-06-16T21:20:06.650Z] result#10:
{"type":"tool_reference","tool_name":"mcp__chrome-devtools__click"}
[2026-06-16T21:20:09.271Z] tool#11: mcp__chrome-devtools__click
[2026-06-16T21:20:09.272Z] input: {"uid":"1_16","includeSnapshot":true}
[2026-06-16T21:20:09.642Z] result#11:
Successfully clicked on the element

Latest page snapshot

uid=1_0 RootWebArea "Exo" url="http://localhost:5173/"
uid=1_1 heading "Exo" level="1"
uid=1_2 button "me@example.com"
uid=1_3 button "Search" description="Search (/)"
uid=1_4 button "Compose" description="Compose (C)"
uid=1_5 button "Settings" description="Settings"
uid=1_6 button "Refresh" description="Refresh"
uid=2_0 button "Back"
uid=2_1 heading "URGENT: Production issue affecting checkout flow" level="1"
uid=2_2 button "Archive"
uid=2_3 button "Delete"
uid=2_4 button "Mark as unread"
uid=2_5 button "Star"
uid=2_6 button "Snooze (h)"
uid=2_7 button "Reply All"
uid=2_8 button "Forward"
uid=2_9 button "On-Call to Me 6/16/2026 Reply Reply All Forward Block sender"
uid=2_10 button "Reply"
uid=2_11 button "Reply All"
uid=2_12 button "Forward"
uid=2_13 button "Block sender" description="Block oncall@acmecorp.com"
uid=2_14 StaticText "INCIDENT ALERT

Severity: P1
Status: Investigating
Affected: Checkout flow - payment processing

Issue:
Users are reporting failed payments with error code PAYMENT_TIMEOUT. Started approximately 20 minutes ago.

Current impact:

  • ~5% of checkout attempts failing
  • Estimated revenue impact: $2,000/hour

We need your expertise on the payment integration. Can you join the incident channel?

Slack: #incident-checkout-012
Zoom: https://zoom.us/j/123456789


On-Call Team"
uid=2_15 button "Reply to On-Call"
uid=2_16 button "Discard draft"
uid=2_17 button "Bold (Cmd+B)"
uid=2_18 button "Italic (Cmd+I)"
uid=2_19 button "Strikethrough"
uid=2_20 button "Bullet list"
uid=2_21 button "Numbered list"
uid=2_22 button "Quote"
uid=2_23 button "Insert link"
uid=2_24 button "Insert image"
uid=2_25 button "Align left"
uid=2_26 button "Align center"
uid=2_27 generic value="Joining the incident channel now.

Quick context: The payment timeout issue might be related to the third-party payment processor. I saw similar symptoms last month when they had latency issues.

First steps I'd recommend:

  1. Check our payment provider's status page
  2. Look at p99 latency on the payment service
  3. Consider enabling the fallback provider if this persists

Hopping on Zoom now."
uid=2_28 paragraph
uid=2_29 StaticText "Joining the incident channel now."
uid=2_30 paragraph
uid=2_31 StaticText "Quick context: The payment timeout issue might be related to the third-party payment processor. I saw similar symptoms last month when they had latency issues."
uid=2_32 paragraph
uid=2_33 StaticText "First steps I'd recommend:"
uid=2_34 LineBreak "
"
uid=2_35 StaticText "1. Check our payment provider's status page"
uid=2_36 LineBreak "
"
uid=2_37 StaticText "2. Look at p99 latency on the payment service"
uid=2_38 LineBreak "
"
uid=2_39 StaticText "3. Consider enabling the fallback provider if this persists"
uid=2_40 paragraph
uid=2_41 StaticText "Hopping on Zoom now."
uid=2_42 button "···" description="Show original"
uid=2_43 textbox "Refine with AI... e.g. 'make it shorter' or 'more formal'"
uid=2_44 button "Refine" disableable disabled
uid=2_45 button "Send"
uid=2_46 button "Schedule" description="Schedule send"
uid=2_47 button "Attach file" description="Attach file"
uid=2_48 StaticText "Cmd+Enter to send"
uid=2_49 StaticText "Priority"
uid=2_50 StaticText "·"
uid=2_51 StaticText "Production incident requiring immediate attention"
uid=2_52 button "Change"
uid=2_53 button "Sender" description="Sender (press b to switch)"
uid=2_54 button "Calendar" description="Calendar (press b to switch)"
uid=2_55 button "Agent" description="Agent (press b to switch)"
uid=2_56 StaticText "O"
uid=2_57 StaticText "On-Call"
uid=2_58 StaticText "oncall@acmecorp.com"
uid=2_59 heading "SENDER" level="3"
uid=2_60 StaticText "O"
uid=2_61 StaticText "On-Call"
uid=2_62 StaticText "oncall@acmecorp.com"
uid=2_63 button " …[truncated, 680 more chars]
[2026-06-16T21:20:17.321Z] text: Email detail view opened correctly with draft visible. Now let me check the Settings to verify the Ollama Cloud model default shows glm-5.2:cloud (the key change in this PR).
[2026-06-16T21:20:17.326Z] tool#12: mcp__chrome-devtools__click
[2026-06-16T21:20:17.326Z] input: {"uid":"1_5","includeSnapshot":false}
[2026-06-16T21:20:17.537Z] result#12:
Successfully clicked on the element
[2026-06-16T21:20:21.539Z] tool#13: mcp__chrome-devtools__take_snapshot
[2026-06-16T21:20:21.539Z] input: {}
[2026-06-16T21:20:21.614Z] result#13:

Latest page snapshot

uid=1_0 RootWebArea "Exo" url="http://localhost:5173/"
uid=3_0 heading "Settings" level="1"
uid=3_1 button "Close settings"
uid=3_2 button "General"
uid=3_3 button "Accounts"
uid=3_4 button "Blocked"
uid=3_5 button "Calendar"
uid=3_6 button "Splits"
uid=3_7 button "Snippets"
uid=3_8 button "Signatures"
uid=3_9 button "Prompts"
uid=3_10 button "Writing Style"
uid=3_11 button "Executive Assistant"
uid=3_12 button "AI Memories"
uid=3_13 button "Queue"
uid=3_14 button "Agents"
uid=3_15 button "Extensions"
uid=3_16 button "Analytics"
uid=3_17 heading "General Settings" level="2"
uid=3_18 StaticText "Configure how Exo generates draft replies."
uid=3_19 heading "Appearance" level="3"
uid=3_20 StaticText "Choose your preferred color theme."
uid=3_21 button "Light"
uid=3_22 button "Dark"
uid=3_23 button "System"
uid=3_24 heading "Inbox Density" level="3"
uid=3_25 StaticText "Control how much space each email takes in the inbox list."
uid=3_26 button "Default"
uid=3_27 button "Compact"
uid=3_28 heading "Keyboard Shortcuts" level="3"
uid=3_29 StaticText "Choose which keyboard shortcut preset to use. Gmail adds extra bindings like o, n/p, y, z, a, and section navigation."
uid=3_30 button "Superhuman"
uid=3_31 button "Gmail"
uid=3_32 heading "Undo Send" level="3"
uid=3_33 StaticText "Set a delay before emails are actually sent. During the delay you can click Undo to cancel."
uid=3_34 button "Off"
uid=3_35 button "5s"
uid=3_36 button "10s"
uid=3_37 button "15s"
uid=3_38 button "30s"
uid=3_39 heading "Send & Archive" level="3"
uid=3_40 StaticText "When replying, sending also archives the conversation. New emails and forwards are unaffected."
uid=3_41 button "Toggle Send and Archive"
uid=3_42 heading "Default Mail App" level="3"
uid=3_43 StaticText "Register Exo as the default handler for mailto: links. Clicking email links in other apps will open a compose window here."
uid=3_44 switch "Set as default mail app"
uid=3_45 heading "Sender Lookup Search" level="3"
uid=3_46 StaticText "Which search backend to use when looking up info about email senders. The model used to parse the results is set below under "
uid=3_47 StaticText "Sender Lookup"
uid=3_48 StaticText " in AI Models."
uid=3_49 StaticText "Backend"
uid=3_50 combobox "Backend" expandable haspopup="menu" value="Anthropic (Claude web_search — single call, no extra key)"
uid=3_51 option "Anthropic (Claude web_search — single call, no extra key)" selectable selected value="Anthropic (Claude web_search — single call, no extra key)"
uid=3_52 option "Exa (search API + configurable parsing model)" selectable value="Exa (search API + configurable parsing model)"
uid=3_53 heading "AI Models" level="3"
uid=3_54 StaticText "Choose which Claude model to use for each feature. Haiku is fastest and cheapest, Opus is most capable."
uid=3_55 StaticText "Email Analysis"
uid=3_56 StaticText "Triaging which emails need replies"
uid=3_57 combobox "Provider for Email Analysis" expandable haspopup="menu" value="Anthropic"
uid=3_58 option "Anthropic" selectable selected value="Anthropic"
uid=3_59 option "Ollama Cloud" selectable value="Ollama Cloud"
uid=3_60 combobox "Model tier for Email Analysis" expandable haspopup="menu" value="Sonnet (balanced)"
uid=3_61 option "Haiku (fast, lightweight)" selectable value="Haiku (fast, lightweight)"
uid=3_62 option "Sonnet (balanced)" selectable selected value="Sonnet (balanced)"
uid=3_63 option "Opus (most capable)" selectable value="Opus (most capable)"
uid=3_64 StaticText "Draft Generation"
uid=3_65 StaticText "Writing reply drafts"
uid=3_66 combobox "Provider for Draft Generation" expandable haspopup="menu" value="Anthropic"
uid=3_67 option "Anthropic" selectable selected value="Anthropic"
uid=3_68 option "Ollama Cloud" selectable value="Ollama Cloud"
uid=3_69 combobox "Model tier for Dra …[truncated, 10912 more chars]
[2026-06-16T21:20:38.746Z] text: Settings panel opened. I can see all the AI model configurations. Let me check the "Agents" tab for the Ollama Cloud model setting (where DEFAULT_OLLAMA_MODEL would be relevant).
[2026-06-16T21:20:38.764Z] tool#14: mcp__chrome-devtools__click
[2026-06-16T21:20:38.765Z] input: {"uid":"3_14","includeSnapshot":false}
[2026-06-16T21:20:39.334Z] result#14:
Successfully clicked on the element
[2026-06-16T21:20:42.070Z] tool#15: mcp__chrome-devtools__take_snapshot
[2026-06-16T21:20:42.071Z] input: {}
[2026-06-16T21:20:42.109Z] result#15:

Latest page snapshot

uid=1_0 RootWebArea "Exo" url="http://localhost:5173/"
uid=3_0 heading "Settings" level="1"
uid=3_1 button "Close settings"
uid=3_2 button "General"
uid=3_3 button "Accounts"
uid=3_4 button "Blocked"
uid=3_5 button "Calendar"
uid=3_6 button "Splits"
uid=3_7 button "Snippets"
uid=3_8 button "Signatures"
uid=3_9 button "Prompts"
uid=3_10 button "Writing Style"
uid=3_11 button "Executive Assistant"
uid=3_12 button "AI Memories"
uid=3_13 button "Queue"
uid=3_14 button "Agents" focusable focused
uid=3_15 button "Extensions"
uid=3_16 button "Analytics"
uid=4_0 heading "Agent Settings" level="3"
uid=4_1 StaticText "Configure AI agent capabilities including browser automation."
uid=4_2 heading "Authentication" level="4"
uid=4_3 heading "Anthropic API Key" level="5"
uid=4_4 StaticText "Required for email analysis, draft generation, and sender lookup."
uid=4_5 textbox "sk-ant-..."
uid=4_6 button "Save"
uid=4_7 heading "Claude Agent" level="5"
uid=4_8 StaticText "The agent can also authenticate via your Claude account. If you have Claude Code installed and logged in, this is detected automatically."
uid=4_9 StaticText "Logged in"
uid=4_10 button "Login with Claude Account"
uid=4_11 StaticText "An API key above also enables the agent. Claude Account login is only needed if you don't have an API key."
uid=4_12 heading "Browser Automation" level="4"
uid=4_13 StaticText "Allow agents to browse the web using Chrome DevTools Protocol. Requires Chrome to be running with remote debugging enabled."
uid=4_14 button
uid=4_15 button "Save"
uid=4_16 heading "Custom MCP Servers" level="4"
uid=4_17 StaticText "Add MCP servers to give the agent access to custom tools. Paste the JSON config from your MCP server's docs."
uid=4_18 button "+ Add MCP Server"
uid=4_19 heading "CLI Tools" level="4"
uid=4_20 StaticText "Allow the agent to run specific CLI commands. Each command becomes a dedicated tool the agent can call."
uid=4_21 button "+ Add CLI Tool"
uid=4_22 button "Save"
uid=4_23 heading "Additional Tool Directories" level="4"
uid=4_24 StaticText "Extra directories to add to the system PATH so agents can find CLI tools installed in non-standard locations."
uid=4_25 button "+ Add Directory"
uid=4_26 button "Save"
uid=4_27 StaticText "Available agent capabilities:"
uid=4_28 StaticText "Email tools:"
uid=4_29 StaticText " Read, search, archive, label, draft, and send emails"
uid=4_30 StaticText "Analysis:"
uid=4_31 StaticText " Analyze emails and look up sender profiles"
uid=4_32 StaticText "Web search:"
uid=4_33 StaticText " Search the web for context"
uid=4_34 StaticText "Browser:"
uid=4_35 StaticText " Navigate and extract from web pages (requires Chrome debugging)"
uid=4_36 StaticText "Custom MCP:"
uid=4_37 StaticText " Any tools provided by your configured MCP servers"
uid=4_38 StaticText "CLI tools:"
uid=4_39 StaticText " Run configured CLI commands"
uid=4_40 StaticText "Batch operations:"
uid=4_41 StaticText " Modify labels on multiple emails at once"
uid=1_1 heading "Exo" level="1"
uid=1_2 button "me@example.com"
uid=1_3 button "Search" description="Search (/)"
uid=1_4 button "Compose" description="Compose (C)"
uid=1_5 button "Settings" description="Settings"
uid=1_6 button "Refresh" description="Refresh"
uid=2_0 button "Back"
uid=2_1 heading "URGENT: Production issue affecting checkout flow" level="1"
uid=2_2 button "Archive"
uid=2_3 button "Delete"
uid=2_4 button "Mark as unread"
uid=2_5 button "Star"
uid=2_6 button "Snooze (h)"
uid=2_7 button "Reply All"
uid=2_8 button "Forward"
uid=2_9 button "On-Call to Me 6/16/2026 Reply Reply All Forward Block sender"
uid=2_10 button "Reply"
uid=2_11 button "Reply All"
uid=2_12 button "Forward"
uid=2_13 button "Block sender" description="Block oncall@acmecorp.com"
uid=2_14 StaticText "INCIDENT ALERT

Severity: P1
Status: …[truncated, 3834 more chars]
[2026-06-16T21:20:58.950Z] text: Agents settings loads correctly. Now let me close settings, navigate back to the inbox, and do a final check on email navigation state to confirm nothing is broken.
[2026-06-16T21:20:59.357Z] tool#16: mcp__chrome-devtools__click
[2026-06-16T21:20:59.357Z] input: {"uid":"3_1","includeSnapshot":false}
[2026-06-16T21:20:59.570Z] result#16:
Successfully clicked on the element
[2026-06-16T21:21:02.697Z] tool#17: mcp__chrome-devtools__take_screenshot
[2026-06-16T21:21:02.697Z] input: {}
[2026-06-16T21:21:03.001Z] result#17:
Took a screenshot of the current page's viewport.
{"type":"image","source":{"type":"base64","media_type":"image/png","data":"iVBORw0KGgoAAAANSUhEUgAAB88AAAU1CAIAAAAS6RJTAAbaRklEQVR4Ae3AA6AkWZbG8f937o3IzKdyS2Oubdu2bdu2bdu2bWmMnpZKr54yMyLu+Xa3anqmhztr1a/a5qqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676NwGA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKA4Kqrrrrqqquuuuqqq6666qqrrrrqqquuuuqqq676twKAylVXXXXVVVddddVVV1111VVXXXXVVVddddVVV/3nyMzWsmU6bcx/CUmhKCVKCUn8JwMA2eaqq6666qqrrrrqqquuuuqqq6666qqrrvp3s0lj2+aq/zMkJIWQuOpfZZraOE4tk/9WXa1dVyKC/zQAINtcddVVV1111VVXXXXVVVddddVVV1111VX/DjYtXYu46v+uqbmEJK76F9kehnEYm8R/O4Ng1nddV/nPAQCyzVVXXfUiMi3TdtrYXHXVVf9mkkBSRESIq6666qqrrrrqqquuuuqq/83ShABa8g9PG26/dzx/Kdejuep/v77q5E7cdE197EP6WS8gTYirXojWcr0eWlrifxBTa8xmvST+owGAbHPVVVf9SzI9TlNradu2ba666qp/H0kChULR1VJq4aqrrrrqqquuuuqqq6666n8bgwA4d6n91G8f/tHfre670O7bbfuHOTaLq/53M9Si7YVOHy/Xniwv95j5273O5g2nK2AjcdXzai1X68E2//PY1BLzeS+J/1AAINtcddVVL9QwjOPUbNuWxFVXXfUfykailJj1fYS46qqrrrrqqquuuuqqq676X8JGAviJ3zr8zp/bu/fCtFp7PlNXVIokrvo/wKalp8ZynX2n08fKu77R9nu/2TZgI3HVA9lertaZ5n8qQ1/LbNbzHwoAZJurrrrqBcj0MAxTS6666qr/fJJmfa21ctVVV1111VVXXXXVVVdd9T9emhAX9/PLf+Dir/7JUUtmVRGkAWyu+j9DgAiRZhgNvMZLLz75PU9cd6pkEsFVz7JeD+PU+B9vPutqrfzHAQDZ5qqrrnp+Mr1aDy1TXHXVVf91Zn3XdZWrrrrqqquuuuqqq6666qr/wdKEOH+pfdI3nP+Tv19vb8hgc9X/eRKCg6Uf+9D+yz/i1E3X1EwiuApoLZerNf/j2ZSixXwmif8gABBcddVVz5e9Xg8tU1x11VX/pdbDOE2Nq6666qqrrrrqqquuuuqq/6lsQgBf/L0X//jvVse2Io3NVf8f2KTZ2Yx/eOrwed95YRgdgc1VwDhO/G8gkelpavzHAYDgqquuen6GcWotxVVXXfXfYBhH21x11VVXXXXVVVddddVVV/2PJAH8yK8f/PqfLY9txdTMVf/PTM3HtuKP/379Xb+wD0hcZbtl8r+E7daS/zgAEFx11VXPo7Ucxglx1VVX/bdoLYdh5Kqrrrrqqquuuuqqq6666n+qO+6bvuknLnVVLbnq/6epeTHTd/7s3uNvHQDz/11ric3/EpIynWn+gwBAcNVVVz2PcZy46qqr/vtImlpmJlddddVVV1111VVXXXXVVf/DZAL86G8c7B9lCa76/0xinPzDv3rAVZBp879JOm3zHwQAgquuuuo52Z5a46qrrvpvlZlTa1x11VVXXXXVVVddddVVV/1PYhPBavBv/fkyzVX/z9mU0O/81fLCXhOY/9ds2+Z/D9vG/AcBgOCqq656TtOUXHXVVf/tpGy2ueqqq6666qqrrrrqqquu+p8jDfBnj1sfLrMWYa76fy6CcfIf/d0acPL/mTH/65j/KAAQXHXVVc8pM7nqqqv+uwkyE8xVV1111VVXXXXVVVdd9f+YbZvnYpv/JraBJ98+7B9lCcxV/99JrAY/4RkDYP572OZ+trnqvwMABFddddVzSidXXXXV/wBp21x11VVXXXXVVVddddVV/z9lpm1JEs9FEtBa8l/OBrjrXFsNjuCqq0JMjTvumwDb/NeyDUiyDQCSbPO/h+3M5HnYtt1aa6211lprmWmb/6kAoHLVVVc9Jxsbiauuuuq/l20wiKuuuuqqq6666qqrrrrq/5nWWikFOHvuwpOf+vTlcvXXf/sP6/Vw7TVnbrzh2pd4sUffeMN1pURrCS6l8F/rcJlTQ+KqqxCZPlgm/+VsS7L9uCc8+ad+7pe3t7ZvufmGt3mLN5JkWxL/49nuuq7rutVqZZvLbE9Tq7VExPbWBvdLMwzjer22XWvlfxgAqFx11b/eNE28KKRSirjqqquuuuqqq6666qqrrrrqqquuelG11kopd951zx/+yV/86E/8/MVLl/qub9mAEsU24q3e7A1e9zVf9ZGPeCjQWpYSXHXVfx/xX6S1TKeQREQAP/9Lv/E13/idXdcB4zhe3L30vu/xjrYzbWw7pFIK/yOVUu65596z584/7KEPns1mtltrEXHi+PZqPR4eHv3Jn/3F7qW9iMC+9tprbrz+ultuvgG4eGk/pFIK/2MAQOWqq/71aq1cddVVV1111VVXXXXVVVddddVVV/1Ha5mllN/+vT/69u/5kWfcdsfJE8dBq9XqumvPKHTu/MXDw6NjO9s/+pM//8u/9jtv8Dqv/tqv+aov9phH2JbE/yERODFXXfVsmVlKFIL7/dbv/tEP/MhPLxbzrc0NxN7e4c/8/K8+4qEPfo1Xe0UewDZI4n+UiFit1l/xNd/087/8a9/9rV/3Wq/+SucvXjp14tjewdG3fdcP/P0/POHoaPkbv/179509V0qx/ZAH3/LIRzzsIQ+65dVe5RXf6s3fMJODw8OI4H8GAKhcddW/0jCO3/adP7BcLksJm+dLUmvt5Injb/+2b7GzvWVbElf9J7PNc5LE/ycRAWQmV1111VVXXXXVVVddddVVV/0v1FqWEt/5fT/64z/1C0jb21t9373Vm73BS77EY47v7EjaOzgYh/GXfu23f+8P/rS19sM/8XO/90d/9gkf9UEv9zIvYVsS/ydkcrTyrFMp/PtJAGkwV0QA2Fz1v0tE/OKv/tYf/clf3nTDdS/+Yo+6+cYbfvYXfnVvf/9oufqYD3//G2+47gu+7Ovuuvven/zZX7rppuvPn7/4t3//+FufcceLv9ij3vFt3xywLYn/GWx3Xb3r7nt+8Vd+49y5C3/253/1aq/8Csd2dn7l137nq7/hW//u7x+3f3DY1bqxsTh96iSXnT17/t77zv7yr/7mz/z8L/34T/3ce7zrO7z+677mwcFRhPgfAAAqV131r7Rarb/xW7/7woWLXVcNIUmyQTxLSOM4PuhBN7/h67/2zvaWQVz1nygzS0TtughJsrEzM6ep2Y4I/h+wPQwjUGvhqquuuuqqq6666qqrrrrqqv9tptZqKb/+W7//wz/+s7WUrY2N13+dV3vTN369m264juf0Mi/14k9521u/9bt+8ElPftqlS/vf8K3f+6Wf/ymnT520LYn/zSTWg2++tn7kOx7/nl/c+7unDouZMvm3EShYrQ10VREAmawGS8x7ZXLV/yIf+ymf96QnP3330t6s7371N39ve2vz4PCwtXyFl33JV33ll5/P+jd7o9f9lu/8wac/447P+vyvXK+ …[truncated, 594996 more chars]
[2026-06-16T21:21:15.033Z] text: App is functioning well. Let me do one final check — press Escape to go back to inbox and verify the back-navigation state (no stale selection).
[2026-06-16T21:21:15.522Z] tool#18: ToolSearch
[2026-06-16T21:21:15.522Z] input: {"query":"select:mcp__chrome-devtools__press_key","max_results":1}
[2026-06-16T21:21:15.527Z] result#18:
{"type":"tool_reference","tool_name":"mcp__chrome-devtools__press_key"}
[2026-06-16T21:21:18.247Z] tool#19: mcp__chrome-devtools__press_key
[2026-06-16T21:21:18.247Z] input: {"key":"Escape","includeSnapshot":false}
[2026-06-16T21:21:18.472Z] result#19:
Successfully pressed key: Escape
[2026-06-16T21:21:22.372Z] tool#20: mcp__chrome-devtools__evaluate_script
[2026-06-16T21:21:22.372Z] input: {"function":"() => {\n const selectedRows = document.querySelectorAll("[data-thread-id][data-selected='true']").length;\n const senderName = document.querySelector("[data-testid='sidebar-sender-name']")?.textContent ?? null;\n const emptySidebar = document.body.innerText.includes("Select an email to see details");\n return { selectedRows, senderName, emptySidebar };\n}"}
[2026-06-16T21:21:22.585Z] result#20:
Script ran on page and returned:

{"selectedRows":1,"senderName":"On-Call","emptySidebar":false}

[2026-06-16T21:21:26.506Z] tool#21: mcp__chrome-devtools__take_screenshot
[2026-06-16T21:21:26.506Z] input: {}
[2026-06-16T21:21:26.700Z] result#21:
Took a screenshot of the current page's viewport.
{"type":"image","source":{"type":"base64","media_type":"image/jpeg","data":"/9j/4AAQSkZJRgABAgAAAQABAAD/wAARCAU1B88DAREAAhEBAxEB/9sAQwADAgIDAgIDAwMDBAMDBAUIBQUEBAUKBwcGCAwKDAwLCgsLDQ4SEA0OEQ4LCxAWEBETFBUVFQwPFxgWFBgSFBUU/9sAQwEDBAQFBAUJBQUJFA0LDRQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQU/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD9U6ACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKAI5ZkhXLHHtQBWOpoDwjGnYVxP7UX/nm350WC4f2ov8Azzb86LBcP7UX/nm350WC4f2ov/PNvzosFw/tRf8Anm350WC4f2ov/PNvzosFw/tRf+ebfnRYLh/ai/8APNvzosFw/tRf+ebfnRYLh/ai/wDPNvzosFw/tRf+ebfnRYLh/ai/882/OiwXD+1F/wCebfnRYLh/ai/882/OiwXD+1F/55t+dFguH9qL/wA82/OiwXD+1F/55t+dFguH9qL/AM82/OiwXD+1F/55t+dFguH9qL/zzb86LBcP7UX/AJ5t+dFguH9qL/zzb86LBcP7UX/nm350WC4f2ov/ADzb86LBcP7UX/nm350WC4f2ov8Azzb86LBcP7UX/nm350WC4f2ov/PNvzosFw/tRf8Anm350WC4f2ov/PNvzosFw/tRf+ebfnRYLh/ai/8APNvzosFw/tRf+ebfnRYLh/ai/wDPNvzosFxy6mhPKstFguWkcSLlTkHuKQx1ABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAYfjHxrofgDQptY8QalBpenw9ZZj949lUDlmPoATXZhMHiMfVVDDQcpPt/WiMataFCPPUdkfOOtf8FBvCNneNHpvh7WdTgHHnkxQZ+isc4r9Do8BY2cb1asYvtq/wAjwp53RT92Lf3Iz/8Ah4hoX/Qm6v8A+BUH+Nb/AOoGI/6CI/dIj+3Kf8j+9B/w8Q0L/oTdX/8AAqD/ABo/1AxH/QRH7pB/blP+R/eg/wCHiGhf9Cbq/wD4FQf40f6gYj/oIj90g/tyn/I/vQf8PENC/wChN1f/AMCoP8aP9QMR/wBBEfukH9uU/wCR/eg/4eIaF/0Jur/+BUH+NH+oGI/6CI/dIP7cp/yP70H/AA8Q0L/oTdX/APAqD/Gj/UDEf9BEfukH9uU/5H96D/h4hoX/AEJur/8AgVB/jR/qBiP+giP3SD+3Kf8AI/vQf8PENC/6E3V//AqD/Gj/AFAxH/QRH7pB/blP+R/eg/4eIaF/0Jur/wDgVB/jR/qBiP8AoIj90g/tyn/I/vQf8PENC/6E3V//AAKg/wAaP9QMR/0ER+6Qf25T/kf3oP8Ah4hoX/Qm6v8A+BUH+NH+oGI/6CI/dIP7cp/yP70H/DxDQv8AoTdX/wDAqD/Gj/UDEf8AQRH7pB/blP8Akf3oP+HiGhf9Cbq//gVB/jR/qBiP+giP3SD+3Kf8j+9B/wAPENC/6E3V/wDwKg/xo/1AxH/QRH7pB/blP+R/eg/4eIaF/wBCbq//AIFQf40f6gYj/oIj90g/tyn/ACP70H/DxDQv+hN1f/wKg/xo/wBQMR/0ER+6Qf25T/kf3oP+HiGhf9Cbq/8A4FQf40f6gYj/AKCI/dIP7cp/yP70H/DxDQv+hN1f/wACoP8AGj/UDEf9BEfukH9uU/5H96JLf/god4deZRN4R1iKMn5nE8DY/DPNTLgDEpaV4/dIazyn1g/wPdfhV8dfB/xjtZH8Pajuu4Rum0+6XyrmIepQ9R7rkV8TmmSY3KJJYmGj2a1T+f6M9jDYyjil+7evbqeg14J2hQAUAFABQAUAFABQAUAFABQBVu7zyPlXlz+lNIVyi13Kx/1jfhxTFcT7TL/z0b86AD7TL/z0b86AD7TL/wA9G/OgA+0y/wDPRvzoAkiv5Izyd49DRYLmlHIJUDL0NSUPoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKAEJwCT2oAxppTNIWP4D0qiSOmIKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgAoAKACgCzYzGOYL/AAtxSY0atSUFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAUAFABQAU …[truncated, 487957 more chars]
[2026-06-16T21:21:49.471Z] text: The app is behaving correctly. Pressing Escape returned to the inbox list with the "On-Call" row highlighted (as expected — the app keeps the last-viewed email selected in the list as a UX pattern, showing where you were). The sender panel on the right continues to show the last-viewed email's context, which is the app's intended behavior. This is not a regression — the PR doesn't touch any selection or navigation logic.

Summary of what I verified:

  • The primary user-visible change in this PR is DEFAULT_OLLAMA_MODEL switching from kimi-k2.6:cloud to glm-5.2:cloud (only affects Ollama Cloud users, not demo mode)
  • The eval file changes (runner.ts, draft-generator.ts, calendaring-agent.ts, archive-ready-analyzer.ts) are test infrastructure only — they pass explicit model IDs to service constructors to prevent 404s when old model IDs retire; no UI impact
  • App loads correctly, email navigation works, draft is visible, sender profile sidebar renders, Settings panel (including Agents tab) opens cleanly
  • No JS errors detected, no layout breaks, no stuck states

{"verdict":"pass","summary":"PR makes two categories of changes: (1) switches DEFAULT_OLLAMA_MODEL from kimi-k2.6:cloud to glm-5.2:cloud in types.ts — only affects Ollama Cloud users, invisible in Anthropic/demo mode; (2) updates eval test constructors to pass explicit model IDs instead of relying on no-arg defaults — test infrastructure change with no UI impact. Verified: app loads cleanly, email detail view opens with draft content and sender profile panel, Settings panel with all tabs (General AI Models, Agents) opens correctly, back-navigation via Escape returns to inbox list with expected sticky selection state. No JS errors, no layout breaks, no stuck states observed.","anomalies":[],"actions_taken":13}
[2026-06-16T21:21:49.518Z] result: subtype=success cost=0.6221611999999999 turns=22


</details>


<sub>This comment is upserted by `npm run pre-pr`. The CI gate reads the marker block in the PR description, not this comment.</sub>

@ankitvgupta ankitvgupta merged commit 798ef7e into main Jun 16, 2026
14 of 17 checks passed
@ankitvgupta ankitvgupta deleted the ankitvgupta/kimi-k2-ollama-default branch June 16, 2026 21:46
ankitvgupta added a commit that referenced this pull request Jun 16, 2026
## Summary

Follow-up to #175 (make GLM 5.2 the default Ollama model). That PR
changed `DEFAULT_OLLAMA_MODEL`, but the change is a **no-op for anyone
who had already enabled Ollama** — so the Settings UI and the agent both
kept using `kimi-k2.6:cloud`.

## Root cause

`DEFAULT_OLLAMA_MODEL` is only the *fallback* when
`ollamaCloud.defaultModel` is unset. But `SetupWizard` and
`ExtensionsTab` **persist** `DEFAULT_OLLAMA_MODEL` into
`ollamaCloud.defaultModel` whenever Ollama is set up/saved, and
`resolveAgentOllamaConfig` resolves the agent model as
`featureModels.agentDrafter ?? oc.defaultModel ?? DEFAULT_OLLAMA_MODEL`
— so the persisted `kimi-k2.6:cloud` wins over the new constant.
Existing installs therefore stay pinned to kimi-k2.6 (reported: the
"Default Model" field still shows `kimi-k2.6:cloud` after #175).

## Fix

A `configVersion` 3 migration in `getConfig()` that flips a persisted
`ollamaCloud.defaultModel` of **exactly** `"kimi-k2.6:cloud"` (the old
default) to the new `DEFAULT_OLLAMA_MODEL` (`glm-5.2:cloud`). Any
explicitly-chosen model is left untouched. Mirrors the existing
legacy-`model` migration pattern in the same function.

## Verification

Launched dev with a config persisted at `kimi-k2.6:cloud` → after
launch, `settings.get()` returns `configVersion: 3` and
`ollamaCloud.defaultModel: "glm-5.2:cloud"`. The Settings "Default
Model" field now shows GLM.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- devin-review-badge-begin -->

---

<a href="https://app.devin.ai/review/ankitvgupta/exo/pull/177"
target="_blank">
  <picture>
<source media="(prefers-color-scheme: dark)"
srcset="https://static.devin.ai/assets/gh-open-in-devin-review-dark.svg?v=1">
<img
src="https://static.devin.ai/assets/gh-open-in-devin-review-light.svg?v=1"
alt="Open in Devin Review">
  </picture>
</a>
<!-- devin-review-badge-end -->

<!-- PRE-PR-REPORT-START SHA=1e20bfb mode=full -->
**Pre-PR verdict**: PASS

- mode: `full`
- sha: `1e20bfb`
- generated: 2026-06-16T22:31:37.864Z

| Phase | Status | Duration |
|---|---|---|
| eval:analyzer | ✅ exit 0 | 13.4s |
| eval:features | ✅ exit 0 | 29.7s |
| agentic-verify | ✅ exit 0 | 78.3s |
| real-gmail:cached | ✅ exit 0 | 8.0s |

<!-- PRE-PR-REPORT-END -->

---------

Co-authored-by: Ankit Gupta <ankit@ycombinator.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ankitvgupta pushed a commit that referenced this pull request Jun 16, 2026
Following the merge of #175 (GLM 5.2 as DEFAULT_OLLAMA_MODEL), update the
curated dropdown list to match: GLM 5.2 is now the default-first entry, the
"(default)" label moves off Kimi, and glm-5.2:cloud is added so the active
default actually renders as a dropdown option (it was absent, which would
have forced the field into "Custom…" mode). MiniMax M3 / M2.7 remain in the
picker. Dropped the now-superseded glm-4.6 entry.

Verified live: the Default Model field selects "GLM 5.2 — z.ai (default)" and
MiniMax remains selectable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant