feat(agent): let the Cate agent control panels (cate-control)#211
architawr wants to merge 26 commits into
Replaces the spike sentinel interception in agentStore.handleEvent with a real dispatchCateRequest round-trip, adds the side-effect cateExecutors import, registers each chat's CateControlContext from AgentPanel (under the top-level CanvasStoreProvider, whose store is the active workspace canvas), and adds requestCateApproval/resolveCateApproval plus the cate:-prefix branch in handleApproval. Also makes the cateControl executor holder hoisted-function-based so the import-cycle (cateControl -> agentStore -> cateExecutors -> cateControl) registration is TDZ-safe regardless of module entry order, and polyfills `self` in the node test env so .test.ts suites that transitively import terminalRegistry (xterm) can load.
- open_panel/run_in_terminal: send the command to the PTY via condition-based waiting (poll until node-pty registers) instead of a 250ms guess; stop relying on createTerminal's initialInput, which the store never forwards.
- add cate_set_markdown_preview tool + preview option on cate_reveal_in_editor, wired to appStore.setPanelMarkdownPreview (the app already supports preview; the agent just had no way to trigger it).
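The condition-based waiting described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `writeToTerminalWhenReady`: the helper name `writeWhenReady`, the `lookup` callback (standing in for a terminal-registry query), and the interval/timeout defaults are all assumptions made for the example.

```typescript
// Hypothetical writer handle; in the app this would come from the terminal registry.
type Writer = (data: string) => void;

// Poll `lookup` until it yields a writer, then send `data` once.
// Resolves to false if the timeout elapses before the PTY registers.
async function writeWhenReady(
  lookup: () => Writer | undefined,
  data: string,
  opts: { intervalMs?: number; timeoutMs?: number } = {},
): Promise<boolean> {
  const intervalMs = opts.intervalMs ?? 25; // assumed default
  const timeoutMs = opts.timeoutMs ?? 5000; // assumed default
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const write = lookup();
    if (write) {
      write(data);
      return true;
    }
    if (Date.now() >= deadline) return false;
    // Sleep for one polling interval before checking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Compared with a single fixed delay, the loop succeeds as soon as the PTY appears and reports failure explicitly when it never does.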
…review
Drives the real renderer dispatcher via window.__cateE2E.cateControl (as an agent tool would) and observes the live app: run_in_terminal and open_panel(terminal,command) actually execute in a spawned PTY (asserted via command output in the xterm buffer), and set_markdown_preview flips the editor into preview. Adds terminalText + cateControl e2e harness hooks.
We already have similar behaviour implicitly, since any agent can write to the workspace.json file. It's not quite the same, but that approach fits into the overall application better than placing the cateControl system on top. I do see the advantages, and I like it being a pi extension. What I don't like so much about it: I don't think the agent should control zoom/camera position. Additionally, I noticed that when creating a new panel, instead of focusing that one it just pans to a random location. Two things I would like to see here:
Removed the skill.md for workspaces in #214, so this one is supposed to be the new way for agents to control the workspace/do orchestration. Will be merged once the changes are in and the feature is polished. Thanks!
…en-focus
Addresses review on 0-AI-UG#211 (keep the agent toolset lean + focused).
- Drop camera-control tools: remove `pan_to` (was identical to `focus_panel`; both just focusAndCenter) and `zoom` (the agent shouldn't drive zoom/viewport).
- Fold `reveal_in_editor` into `open_panel`: open_panel now focuses+centers what it opens and accepts target.preview for markdown, so the dedicated reveal tool was redundant.
- Add `read_terminal`: read a terminal panel's recent buffer (visible screen + scrollback) as text, so an agent can inspect output it ran via run_in_terminal (the other half of terminal orchestration).
Net 13 → 11 tools.
Fix the "new panel pans to a random location" bug: execOpenPanel never focused the panel it created and estimated the viewport center as the centroid of all nodes (could be far off-screen). Now it centers on the real viewport (via viewToCanvas + containerSize) and focusAndCenters the opened panel so it lands in view.
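To illustrate why the centroid estimate could land off-screen, here is a small geometry sketch. It assumes a simple viewport model (view point = canvas point × zoom + offset); the `Viewport` shape and the helpers `viewportCenter`, `nodeCentroid`, and `rectCenteredAt` are hypothetical and only mirror the idea behind `viewToCanvas` + `containerSize`, not the app's actual signatures.

```typescript
interface Point { x: number; y: number }
interface Size { width: number; height: number }
interface Rect extends Point, Size {}

// Assumed viewport model: viewPoint = canvasPoint * zoom + offset.
interface Viewport { offset: Point; zoom: number }

// Invert the assumed model to map a view-space point into canvas space.
function viewToCanvas(p: Point, vp: Viewport): Point {
  return { x: (p.x - vp.offset.x) / vp.zoom, y: (p.y - vp.offset.y) / vp.zoom };
}

// Center of what the user actually sees, in canvas coordinates.
function viewportCenter(containerSize: Size, vp: Viewport): Point {
  return viewToCanvas({ x: containerSize.width / 2, y: containerSize.height / 2 }, vp);
}

// The old estimate: average of node centers, independent of the camera.
function nodeCentroid(nodes: Rect[]): Point {
  if (nodes.length === 0) return { x: 0, y: 0 };
  const sum = nodes.reduce(
    (acc, n) => ({ x: acc.x + n.x + n.width / 2, y: acc.y + n.y + n.height / 2 }),
    { x: 0, y: 0 },
  );
  return { x: sum.x / nodes.length, y: sum.y / nodes.length };
}

// Place a new panel of the given size so its center sits at `center`.
function rectCenteredAt(center: Point, size: Size): Rect {
  return { x: center.x - size.width / 2, y: center.y - size.height / 2, ...size };
}

const vp: Viewport = { offset: { x: -200, y: 100 }, zoom: 2 };
const visibleCenter = viewportCenter({ width: 1000, height: 800 }, vp); // (350, 150)
const staleCenter = nodeCentroid([{ x: 5000, y: 5000, width: 400, height: 300 }]); // (5200, 5150)
const newPanel = rectCenteredAt(visibleCenter, { width: 400, height: 300 });
```

With one distant node on the canvas, the centroid sits thousands of units away from the visible area, while the viewport-based center keeps the new panel in view.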
Addresses review on 0-AI-UG#211 (render tool calls as custom UI, not raw JSON; make the approval workflow fit the agent panel).
- cate-control calls are now surfaced in the chat thread as compact, accent-tinted CateToolCards (icon + verb + summary, expandable to params/result) instead of being silent round-trips. Status tracks running → success / denied / error.
- The guarded-mode ApprovalCard renders cate actions with the same icon + a human-readable request ("Let Cate run `npm test`?") rather than a raw `cate:<action>` name + JSON dump.
- New cateToolDisplay maps (action, params) → { icon, verb, summary } and is shared by both the thread card and the approval card so they stay consistent.
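A shared display mapping of this kind might look like the sketch below. The icon names, the parameter keys (`command`, `type`, `panelId`), and the `approvalPrompt` helper are assumptions for illustration; only the `(action, params) → { icon, verb, summary }` shape and the "Let Cate run `npm test`?" wording come from the commit message.

```typescript
interface ToolDisplay { icon: string; verb: string; summary: string }
type Params = Record<string, unknown>;

// Read a string parameter defensively; params arrive as untyped JSON.
const str = (v: unknown, fallback = ""): string => (typeof v === "string" ? v : fallback);

// Single source of truth for how an action is presented (icon names are hypothetical).
function cateToolDisplay(action: string, params: Params): ToolDisplay {
  switch (action) {
    case "run_in_terminal":
      return { icon: "terminal", verb: "run", summary: str(params["command"]) };
    case "read_terminal":
      return { icon: "terminal", verb: "read", summary: str(params["panelId"], "terminal") };
    case "open_panel":
      return { icon: "plus-square", verb: "open", summary: str(params["type"], "panel") };
    case "close_panel":
      return { icon: "x-square", verb: "close", summary: str(params["panelId"], "panel") };
    default:
      // Unknown actions still render, just without a tailored summary.
      return { icon: "sparkle", verb: action, summary: "" };
  }
}

// Approval text derived from the same mapping, so both cards stay consistent.
function approvalPrompt(action: string, params: Params): string {
  const { verb, summary } = cateToolDisplay(action, params);
  return summary ? `Let Cate ${verb} \`${summary}\`?` : `Let Cate ${verb}?`;
}
```

For example, `approvalPrompt("run_in_terminal", { command: "npm test" })` yields the request text quoted in the commit message.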
Addressed both points: 1. Leaner, more agent-focused toolset (13 → 11).
2. Custom tool renderings + approval workflow.
Merged latest
@PaulHorn — would appreciate your eyes on this one too when you have a moment 🙏
I'll check it later. 11 tools still seem like a lot of complexity / token usage. Will give you more detailed feedback then. Thanks for the update.
Follow-up to review feedback on 0-AI-UG#211 (Anton-Horn: "11 tools still seem like a lot of complexity / token usage").
Collapses the surface the agent sees from 11 tools to 4, grouped by concept rather than per-verb:
- cate_layout {op: get|arrange}: read the canvas / rearrange panels
- cate_panel {op: open|focus|move|resize|close|preview}: single-panel lifecycle
- cate_browser {panelId?, url}: navigate a browser panel (room to grow)
- cate_terminal {op: run|read}: run a command / read output
Implementation: thin op-routers (execLayout/execPanel/execBrowser/execTerminal) delegate to the same focused executors as before, so per-op behavior and the self-protection guards (won't close/move the host agent panel) are unchanged. classifyCateAction still escalates only destructive (close) and outbound (run a command, navigate/open a remote url) ops to guarded-mode approval.
`arrange` moved out of panel into `layout` (it's a canvas-wide op, not per-panel); `navigate` moved into the new `browser` tool; `editor` preview stays a panel op (thin; can be split out symmetrically with browser later if it grows). cateToolDisplay + the thread/approval cards updated for the new actions. Shared protocol, extension tool defs, executors, e2e spec, and all unit tests migrated.
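The consolidated surface can be modelled as a discriminated union with a thin router on top. The sketch below is an assumption-laden illustration: the `CateCall` type, the `isRemoteUrl` heuristic, and the string results standing in for real executors are all hypothetical; only the four tool names, their ops, and the escalation rule (close, run a command, remote url) come from the commit message.

```typescript
type PanelOp = "open" | "focus" | "move" | "resize" | "close" | "preview";

type CateCall =
  | { tool: "cate_layout"; op: "get" | "arrange" }
  | { tool: "cate_panel"; op: PanelOp; panelId?: string; url?: string }
  | { tool: "cate_browser"; panelId?: string; url: string }
  | { tool: "cate_terminal"; op: "run" | "read"; panelId?: string; command?: string };

type PanelCall = Extract<CateCall, { tool: "cate_panel" }>;
type Risk = "safe" | "side-effect";

// Assumed heuristic for "remote": an http(s) URL.
const isRemoteUrl = (url?: string): boolean => !!url && /^https?:\/\//i.test(url);

// Escalate only destructive (close) and outbound (run, remote url) ops.
function classifyCateAction(call: CateCall): Risk {
  switch (call.tool) {
    case "cate_layout":
      return "safe";
    case "cate_panel":
      if (call.op === "close") return "side-effect";
      return call.op === "open" && isRemoteUrl(call.url) ? "side-effect" : "safe";
    case "cate_browser":
      return isRemoteUrl(call.url) ? "side-effect" : "safe";
    case "cate_terminal":
      return call.op === "run" ? "side-effect" : "safe";
  }
}

// Thin op-router for panels; the self-protection guard lives next to the op.
function execPanel(call: PanelCall, selfPanelId: string): string {
  if ((call.op === "close" || call.op === "move") && call.panelId === selfPanelId) {
    return "refused: host agent panel";
  }
  return `panel.${call.op}`; // stand-in for delegating to the focused executor
}

// Top-level router: one branch per tool, each delegating by op.
function route(call: CateCall, selfPanelId: string): string {
  switch (call.tool) {
    case "cate_layout":
      return `layout.${call.op}`;
    case "cate_panel":
      return execPanel(call, selfPanelId);
    case "cate_browser":
      return "browser.navigate";
    case "cate_terminal":
      return `terminal.${call.op}`;
  }
}
```

Keeping classification separate from routing means the approval policy can be tested without running any executor.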
The extension is copied into each workspace's pi-agent extensions dir, where pi loads it at agent start. installCateControl used copyIfMissing (skip-if-exists), so once installed the copy never refreshed — after the toolset was consolidated the agent kept loading the OLD extension and emitted action names (open_panel, close_panel, …) the renderer dispatcher no longer handles, so every cate tool call failed with "Unknown or unimplemented action". This is also a latent prod bug: shipping a new extension version would never reach users who already had an older copy installed. Fix: copyIfChanged overwrites the installed copy whenever its bytes differ from the bundled source. The extension's action protocol is coupled to the renderer, so the bundled copy is authoritative — there's no user-customization to preserve. Adds a unit test (missing → write, differing → overwrite, identical → skip). Note: the e2e suite drives the renderer dispatcher directly (window.__cateE2E), bypassing the installed extension, which is why it didn't catch the skew.
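The refresh-on-change rule (missing → write, differing → overwrite, identical → skip) can be sketched as below. To keep the example runnable anywhere, the filesystem is abstracted behind a hypothetical `FileStore` interface with an in-memory implementation; the real `copyIfChanged` presumably reads and writes actual files, and its signature and return value are not shown in this thread.

```typescript
// Hypothetical minimal filesystem abstraction (real code would use file reads/writes).
interface FileStore {
  read(path: string): Uint8Array | undefined;
  write(path: string, bytes: Uint8Array): void;
}

type CopyResult = "written" | "overwritten" | "skipped";

// Byte-for-byte comparison of two buffers.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// The bundled source is authoritative: install it unless the copy already matches.
function copyIfChanged(store: FileStore, src: string, dest: string): CopyResult {
  const source = store.read(src);
  if (!source) throw new Error(`bundled source missing: ${src}`);
  const installed = store.read(dest);
  if (!installed) {
    store.write(dest, source);
    return "written";
  }
  if (bytesEqual(installed, source)) return "skipped";
  store.write(dest, source);
  return "overwritten";
}

// In-memory store used only to demonstrate the three cases.
function memoryStore(initial: Record<string, Uint8Array>): FileStore {
  const files: Record<string, Uint8Array> = { ...initial };
  return {
    read: (path) => files[path],
    write: (path, bytes) => { files[path] = bytes.slice(); },
  };
}
```

Comparing bytes rather than timestamps avoids rewriting the installed copy on every launch while still picking up any new bundled version.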
@Anton-Horn — update on the complexity / token-usage point. Consolidated 11 → 4 tools. The agent now sees only:
Thin op-routers delegate to the same focused executors, so per-op behavior and the self-protection guards (won't close/move the agent's own host panel) are unchanged.
Custom rendering carries over from the earlier round: cate actions render as compact accent cards in the thread (icon + verb + summary, expandable to params/result), and guarded side-effects show a human-readable “Let Cate run …” request.
Also fixed a bug found while testing this: the extension is copied into each workspace's pi-agent dir, and the installer used skip-if-exists, so after the toolset changed, the agent kept loading the stale copy and emitted action names the renderer no longer handled (“Unknown or unimplemented action”). Switched to refresh-on-change (
…t/agent-controls-cate
PR 0-AI-UG#226 removed git, fileExplorer, projectList from PanelType and deleted createGit/createFileExplorer from AppStore. Remove them from cateExecutors OPENABLE list and execOpenPanel switch to fix typecheck.
…-control
PR 0-AI-UG#226 dropped git/fileExplorer/projectList panels. Beyond the executor switch, clean up the rest of cate-control's references to them:
- tool schema description no longer advertises git|fileExplorer as openable
- cateToolDisplay drops their icon entries (+ unused GitBranch/TreeStructure imports)
- tests drop the dead createGit/createFileExplorer mocks and the git example
I'm on it now. I would like to push some code and steer this PR myself, if that's fine with you.
Ok, let's go
Summary
Adds cate-control: the in-app Cate agent (pi) can now drive the workspace itself. It can open, arrange, focus/move/resize/close panels and interact with their contents (run a terminal command, navigate a browser, reveal a file at a line, toggle markdown preview) on the canvas it lives in. The agent operates the workspace (what's open, where it sits, what's running, where the camera looks); file edits stay on its normal `Edit` tool.
12 new `cate_*` tools, a per-chat Guarded/Auto toggle, and a global on/off setting.
How it works
Transport. pi exposes no generic host RPC, only `ctx.ui.select/confirm/input`. So a bundled `cate-control` pi extension piggybacks a structured request on `ctx.ui.input()` with a sentinel-prefixed JSON payload (`@@cate-control@@{...}`). The renderer intercepts the sentinel in `agentStore.handleEvent` before the dialog queue, dispatches it, and replies over the existing `agent:uiResponse` channel. Pure-visual actions (pan/zoom) are fire-and-forget.
Dispatcher. `src/agent/renderer/cateControl.ts` resolves the calling chat's workspace/canvas via a context registry that `AgentPanel` populates (`useCanvasStoreApi()`), classifies the action (safe vs side-effect), applies the chat's mode policy, runs an executor (`cateExecutors.ts`), and returns a structured response.
Modes (guarded / auto). A per-chat toggle in the agent chat footer (next to plan-mode), mirroring plan/auto in Claude Code:
A global `cateControlEnabled` setting gates the whole feature.
Semantic placement. The agent never sends pixels. It states intent (`right of X`, `tile`, `relativeTo: 'self'`), and `cateControlLayout.ts` (pure geometry, unit-tested) computes canvas-space rects.
Self-protection. `cate_get_layout` marks the agent's own host panel `isSelf: true`; executors refuse to close/move it and exclude it from `arrange` unless targeted explicitly by id. The feature is placement-agnostic so a future "global/operator agent" can land without rework.
Tool surface (12 `cate_*` tools)
- `cate_get_layout`
- `cate_open_panel` (editor/terminal/browser/git/fileExplorer/document), `cate_close_panel`
- `cate_focus_panel`, `cate_move_panel`, `cate_resize_panel`, `cate_arrange` (tile/grid/cascade/focus-one)
- `cate_run_in_terminal`, `cate_open_url`, `cate_reveal_in_editor`, `cate_set_markdown_preview`
- `cate_pan_to`, `cate_zoom`
Two fixes from live testing
- `createTerminal`'s `initialInput` is intentionally not persisted (it would re-run on session restore), so `open_panel(terminal, command)` silently dropped the command; `run_in_terminal`'s single 250 ms retry was too short for a fresh `node-pty` to register. Both now route through `writeToTerminalWhenReady`, which polls the terminal registry until the PTY is live, then writes the command.
- …`EditorPanel` → `setPanelMarkdownPreview`) but the agent had no tool. Added `cate_set_markdown_preview` + a `preview` option on `cate_reveal_in_editor`.
Implementation notes
- …`cate-plan-mode`); shipped in prod via the existing `extraResources` glob.
- …`cateControlEnabled` in `SETTINGS_SCHEMA` (`src/main/store.ts`) so `tsc` stays clean.
Testing
- `tsc --noEmit` clean; `npm run build` green.
- …`e2e/cate-control.spec.ts`): drives the real renderer dispatcher in a live Electron window. `run_in_terminal` and `open_panel(terminal, command)` actually execute in a spawned PTY (asserted via command output in the xterm buffer), and `set_markdown_preview` flips the editor into preview.
- …`npm run dev`): toggle renders + flips, panels open/arrange, terminal commands run, preview toggles.
Future ideas (brief)
- …`ctx.ui.custom` extensions as native panels: unblocks much of the pi package catalog (currently flagged "requires terminal").
Test plan
- `npm run build` passes
- `npm run test` passes (453)
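The sentinel transport described under "How it works" could be sketched along these lines. The `@@cate-control@@` prefix is taken from the description; the request fields (`id`, `action`, `params`), the helper names, and the reply shape are assumptions made for illustration, not the extension's actual protocol.

```typescript
const SENTINEL = "@@cate-control@@";

// Hypothetical request shape carried inside the input prompt.
interface CateRequest {
  id: string;
  action: string;
  params: Record<string, unknown>;
}

// Extension side: piggyback a structured request on a plain input prompt.
function encodeCateRequest(req: CateRequest): string {
  return SENTINEL + JSON.stringify(req);
}

// Renderer side: return the request if the prompt carries the sentinel, else null.
function tryParseCateRequest(prompt: string): CateRequest | null {
  if (!prompt.startsWith(SENTINEL)) return null;
  try {
    const value: unknown = JSON.parse(prompt.slice(SENTINEL.length));
    if (typeof value !== "object" || value === null) return null;
    const record = value as Record<string, unknown>;
    const id = record["id"];
    const action = record["action"];
    const rawParams = record["params"];
    if (typeof id !== "string" || typeof action !== "string") return null;
    const params =
      typeof rawParams === "object" && rawParams !== null
        ? (rawParams as Record<string, unknown>)
        : {};
    return { id, action, params };
  } catch {
    return null; // malformed JSON is treated as an ordinary prompt
  }
}

// Intercept before the dialog queue: sentinel prompts are dispatched and answered,
// everything else falls through to the normal input dialog.
function handleInputEvent(
  prompt: string,
  dispatch: (req: CateRequest) => unknown,
  showDialog: (prompt: string) => void,
): string | undefined {
  const req = tryParseCateRequest(prompt);
  if (!req) {
    showDialog(prompt);
    return undefined;
  }
  return JSON.stringify({ id: req.id, result: dispatch(req) });
}
```

Validating the payload before dispatch keeps a user prompt that merely starts with the sentinel from being treated as a control request.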