Merged
19 commits
392afbc
feat(electron): add Electron platform as a third Argent target
latekvo May 20, 2026
5a6ab5e
feat(electron): address verify-agent findings
latekvo May 20, 2026
1084596
feat(electron-server): TypeScript abstraction layer mirroring sim-server
latekvo May 20, 2026
51b2a15
feat(electron-server): wire the per-device WebSocket upgrade handler
latekvo May 20, 2026
578f4e1
feat(electron): port debugger tools to direct CDP; gate RN-only tools
latekvo May 22, 2026
f663cca
fix(electron): address verify-swarm findings on debugger CDP port
latekvo May 22, 2026
e312a81
fix(electron): handle spawn 'error' event in boot-electron
latekvo May 22, 2026
ce51a60
fix(electron): address verify-swarm-v2 findings
latekvo May 22, 2026
232be30
fix(electron): detach boot listeners after success — child outlives t…
latekvo May 22, 2026
45e903e
test(electron): cover failure-path listener detach + document invariant
latekvo May 22, 2026
e3af485
Merge branch 'main' into worktree-electron-support
filip131311 Jun 12, 2026
d57ed69
style: run prettier
filip131311 Jun 12, 2026
82079d5
ci(wayland-e2e): read screenshot path from artifact handle
filip131311 Jun 12, 2026
717280d
fix(electron): make gesture-swipe scroll via wheel deltas by default
filip131311 Jun 12, 2026
98d3538
fix(electron): persist tracked CDP ports across tool-server restarts
filip131311 Jun 12, 2026
5285df4
feat(electron): dedicated gesture-scroll tool; gesture-swipe is touch…
filip131311 Jun 12, 2026
6da396d
feat(electron): add gesture-drag tool (electron-only)
filip131311 Jun 12, 2026
29db7fa
Merge branch 'main' into worktree-electron-support
filip131311 Jun 12, 2026
25d81ae
docs(electron): surface Electron support in agent-facing discovery la…
filip131311 Jun 12, 2026
5 changes: 4 additions & 1 deletion .github/workflows/wayland-e2e.yml
@@ -232,7 +232,10 @@ jobs:
curl -sS -m 60 -X POST http://127.0.0.1:3033/tools/screenshot \
-H 'Content-Type: application/json' \
-d '{"udid":"emulator-5554"}' > /tmp/shot.json
-PATH_PNG=$(python3 -c "import json,sys;print(json.load(open('/tmp/shot.json'))['data']['path'])")
+# The screenshot tool returns an ArtifactHandle ({ image: { hostPath, ... } })
+# since the remote-artifacts change; the job runs co-located with the
+# tool-server so hostPath is directly readable.
+PATH_PNG=$(python3 -c "import json,sys;print(json.load(open('/tmp/shot.json'))['data']['image']['hostPath'])")
cp "$PATH_PNG" /tmp/wayland-cold-boot.png
SZ=$(stat -c%s /tmp/wayland-cold-boot.png)
echo "size=${SZ}B"
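
A minimal sketch of the same lookup in TypeScript, for a caller that may still meet the pre-artifact response. The interface names and the `screenshotHostPath` helper are hypothetical; only the `data.path` and `data.image.hostPath` fields come from the hunk above, and the real handle may carry more fields.

```typescript
// Assumed response shapes, inferred from the workflow comment in the hunk above.
interface LegacyScreenshotResponse {
  data: { path: string };
}

interface ArtifactScreenshotResponse {
  data: { image: { hostPath: string } };
}

type ScreenshotResponse = LegacyScreenshotResponse | ArtifactScreenshotResponse;

// Hypothetical helper: returns the on-host PNG path from either shape.
// Prefers the ArtifactHandle field and falls back to the legacy one.
function screenshotHostPath(response: ScreenshotResponse): string {
  const data = response.data;
  if ("image" in data) {
    return data.image.hostPath;
  }
  return data.path;
}

// Parses the raw JSON body written to /tmp/shot.json in the workflow.
// No schema validation is done here; a real caller would likely add it.
function screenshotHostPathFromJson(raw: string): string {
  return screenshotHostPath(JSON.parse(raw) as ScreenshotResponse);
}
```

As in the workflow, `hostPath` is only directly readable when the caller runs on the same host as the tool-server.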
4 changes: 4 additions & 0 deletions packages/argent-mcp/src/auto-screenshot.ts
@@ -11,6 +11,8 @@ import { isFlagEnabled, type FlagsPathOptions } from "@argent/configuration-core
export const AUTO_SCREENSHOT_TOOLS = new Set([
"gesture-tap",
"gesture-swipe",
+"gesture-scroll",
+"gesture-drag",
"gesture-custom",
"gesture-pinch",
"gesture-rotate",
@@ -33,6 +35,8 @@ export const AUTO_SCREENSHOT_DELAY_MS_BY_TOOL: Record<string, number> = {
"restart-app": 3000,
"open-url": 2000,
"gesture-swipe": 1500,
+"gesture-scroll": 1500,
+"gesture-drag": 1500,
"gesture-custom": 1500,
"gesture-tap": 1500,
"gesture-pinch": 1500,
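
A minimal sketch of how the two tables in `auto-screenshot.ts` could be consulted together. The table contents are an excerpt of the entries visible in the hunks above; the `autoScreenshotDelayMs` helper and the fallback value are assumptions, since the diff does not show the lookup code.

```typescript
// Excerpt of the entries visible in the diff; the real set has more members.
const AUTO_SCREENSHOT_TOOLS = new Set<string>([
  "gesture-tap",
  "gesture-swipe",
  "gesture-scroll",
  "gesture-drag",
]);

const AUTO_SCREENSHOT_DELAY_MS_BY_TOOL: Record<string, number> = {
  "gesture-tap": 1500,
  "gesture-swipe": 1500,
  "gesture-scroll": 1500,
  "gesture-drag": 1500,
};

// Assumption: used only when a tool is in the set but has no per-tool delay.
const FALLBACK_DELAY_MS = 1000;

// Hypothetical helper: returns the settle delay before an automatic screenshot,
// or undefined when the tool does not trigger one.
function autoScreenshotDelayMs(toolName: string): number | undefined {
  if (!AUTO_SCREENSHOT_TOOLS.has(toolName)) {
    return undefined;
  }
  return AUTO_SCREENSHOT_DELAY_MS_BY_TOOL[toolName] ?? FALLBACK_DELAY_MS;
}
```

Under this reading, a new gesture tool has to be added to both tables, which is what the two hunks do for `gesture-scroll` and `gesture-drag`.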
2 changes: 1 addition & 1 deletion packages/argent-mcp/src/mcp-server.ts
@@ -208,7 +208,7 @@ export async function startMcpServer(options: StartMcpServerOptions): Promise<vo
{
capabilities: { tools: {} },
instructions:
-"Argent — iOS Simulator and Android Emulator control for interacting, testing, profiling and debugging mobile applications. " +
+"Argent — iOS Simulator, Android Emulator, and Electron app control for interacting, testing, profiling and debugging mobile and Electron applications. " +
"Always use discovery tools (describe / debugger-component-tree / screenshot) before tapping — never guess coordinates. " +
"On session end: call stop-all-simulator-servers and perform any necessary cleanup. " +
"Full guidance is in the argent rule loaded from .claude/rules/argent.md.",
7 changes: 5 additions & 2 deletions packages/registry/src/types.ts
@@ -86,9 +86,9 @@ export interface ToolContext extends InvokeToolOptions {

// ── Device + Capability Types ──

-export type Platform = "ios" | "android";
+export type Platform = "ios" | "android" | "electron";

-export type DeviceKind = "simulator" | "emulator" | "device" | "unknown";
+export type DeviceKind = "simulator" | "emulator" | "device" | "app" | "unknown";

/**
* Universal device handle. Platform-aware tools resolve a `udid` parameter into
@@ -119,6 +119,9 @@ export interface ToolCapability {
device?: boolean;
unknown?: boolean;
};
+electron?: {
+app?: boolean;
+};
/** Optional refiner. Returns true if this device is supported. */
supports?: (device: DeviceInfo) => boolean;
}
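
A minimal sketch of how a capability declaration of this shape could be evaluated against a device. `Platform`, `DeviceKind`, the `electron` block and the `supports` refiner mirror the hunks above; the `DeviceInfo` fields, the simplified `ios`/`android` blocks and the `isSupported` helper are assumptions, not the registry's actual code.

```typescript
type Platform = "ios" | "android" | "electron";
type DeviceKind = "simulator" | "emulator" | "device" | "app" | "unknown";

// Assumed minimal shape; the real DeviceInfo likely has more fields.
interface DeviceInfo {
  platform: Platform;
  kind: DeviceKind;
}

// Simplified: each platform block maps a device kind to a support flag.
interface ToolCapability {
  ios?: Partial<Record<DeviceKind, boolean>>;
  android?: Partial<Record<DeviceKind, boolean>>;
  electron?: { app?: boolean };
  /** Optional refiner. Returns true if this device is supported. */
  supports?: (device: DeviceInfo) => boolean;
}

// Hypothetical check: the platform block must enable the device kind,
// and the optional refiner (if present) must also accept the device.
function isSupported(capability: ToolCapability, device: DeviceInfo): boolean {
  const kinds: Partial<Record<DeviceKind, boolean>> | undefined =
    capability[device.platform];
  if (!kinds?.[device.kind]) {
    return false;
  }
  return capability.supports ? capability.supports(device) : true;
}
```

With this reading, an Electron-only tool such as `gesture-drag` would declare only `electron: { app: true }` and be rejected for simulators and emulators.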
5 changes: 3 additions & 2 deletions packages/skills/rules/argent.md
@@ -4,7 +4,7 @@ alwaysApply: true
---

<description>
-Argent MCP tools are available in this project for iOS simulator and Android emulator control. Argent MCP tools are the preferred form of interaction with the application.
+Argent MCP tools are available in this project for iOS simulator, Android emulator, and Electron desktop app control. Argent MCP tools are the preferred form of interaction with the application.
Running MCP server and managing the Argent toolkit utilises `argent` command - if asked use `argent --help` for reference.
To check current version of MCP server run `argent --version` command.

@@ -17,6 +17,7 @@ Use cases:
- Any request to execute manual QA, UI QA, or visual behavior validation for a mobile app
- Running, debugging, or testing a React Native app (iOS or Android)
- Profiling performance or diagnosing re-renders in a React Native app (iOS or Android)
+- Running, debugging, or testing an Electron desktop app (boot with `boot-device` + `electronAppPath`; on Electron scroll with `gesture-scroll` and drag with `gesture-drag` — `gesture-swipe` is touch-only)
</description>

<tapping_rule>
@@ -44,7 +45,7 @@ Before booting, running, or interacting with any app, call `list-devices` first
Decision order:

1. **Explicit user intent** - choose the user named platform or device. Look for words "simulator" and "emulator".
-2. **Prefer a running device.** iOS simulators - state `Booted` and Android devices - `state: "device"` come first in `list-devices`.
+2. **Prefer a running device.** iOS simulators - state `Booted` and Android devices - `state: "device"` come first in `list-devices`; Electron apps appear as `platform: "electron"`, `state: "Running"`.
3. **Single-platform project:** (per `argent-environment-inspector` flags `is_native_ios`/`is_native_android`, or RN with only one platform configured) → boot that platform.
</device_selection_rule>

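
A minimal sketch of the "prefer a running device" step from the decision order above. The state strings (`Booted`, `device`, `Running`) are taken from the rule text; the `ListedDevice` shape and the `pickRunningDevice` helper are hypothetical, and the actual `list-devices` output may differ.

```typescript
type Platform = "ios" | "android" | "electron";

// Assumed shape of one list-devices entry.
interface ListedDevice {
  udid: string;
  platform: Platform;
  state: string;
}

// A device counts as running when its state matches the platform's ready value.
function isRunning(device: ListedDevice): boolean {
  switch (device.platform) {
    case "ios":
      return device.state === "Booted";
    case "android":
      return device.state === "device";
    case "electron":
      return device.state === "Running";
  }
}

// Hypothetical helper: honours an explicit platform if given, then returns the
// first running match. undefined means nothing is ready and a boot is needed.
function pickRunningDevice(
  devices: ListedDevice[],
  wanted?: Platform,
): ListedDevice | undefined {
  const candidates = wanted
    ? devices.filter((device) => device.platform === wanted)
    : devices;
  return candidates.find(isRunning);
}
```

When the helper returns `undefined`, the rule text points to `boot-device` as the next step.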
58 changes: 30 additions & 28 deletions packages/skills/skills/argent-device-interact/SKILL.md
@@ -1,27 +1,27 @@
---
name: argent-device-interact
-description: Interact with an iOS simulator or Android emulator using argent MCP tools. Use when tapping UI elements, performing gestures, scrolling/swiping, typing text, pressing hardware buttons, launching apps, opening URLs, taking screenshots, or checking visible app state after interactions.
+description: Interact with an iOS simulator, Android emulator, or Electron app using argent MCP tools. Use when tapping UI elements, performing gestures, scrolling/swiping, typing text, pressing hardware buttons, launching apps, opening URLs, taking screenshots, or checking visible app state after interactions.
---

## Unified tool surface

-All interaction tools below accept a `udid` parameter and auto-dispatch iOS vs Android based on its shape (UUID → iOS simulator, anything else → Android adb serial). You use the same tool names on both platforms.
+All interaction tools below accept a `udid` parameter and auto-dispatch iOS vs Android based on its shape (UUID → iOS simulator, `electron-cdp-<port>` → Electron app, anything else → Android adb serial). You use the same tool names on both platforms.

For platform-specific caveats (Metro `adb reverse`, locked-screen describe errors, etc.), see § 9 Platform-specific notes at the bottom.

## 1. Before You Start

If you delegate simulator tasks to sub-agents, make sure they have MCP permissions.

-Use `list-devices` to get a target id. Results are tagged with `platform` (`ios` or `android`); booted/ready devices come first. Pick the first entry that matches the platform you need — if none are ready, call `boot-device` with `udid` (iOS) or `avdName` (Android). See `argent-ios-simulator-setup` / `argent-android-emulator-setup` for full setup flow.
+Use `list-devices` to get a target id. Results are tagged with `platform` (`ios`, `android`, or `electron`); booted/ready devices come first. Pick the first entry that matches the platform you need — if none are ready, call `boot-device` with `udid` (iOS), `avdName` (Android), or `electronAppPath` (Electron). See `argent-ios-simulator-setup` / `argent-android-emulator-setup` for full setup flow.

**Load tool schemas before first use.** Gesture tools (`gesture-tap`, `gesture-swipe`, `gesture-pinch`, `gesture-rotate`, `gesture-custom`) may be deferred — their parameter schemas are not loaded until fetched. Always use ToolSearch to load the schemas of all gesture tools you plan to use **before** calling any of them. If you skip this step, parameters may be coerced to strings instead of numbers, causing validation errors.

## 2. Best Practices

1. **Always refer to tapping_rule** from your argent.md rule before tapping.
2. Before performing interactions, consider whether they can be **dispatched sequentially** - more on that in `run-sequence`.
-3. **Use `gesture-swipe` for lists/scrolling**, not `gesture-custom`, unless you need non-linear movement. Consider whether you need multiple swipes, if yes - use `run-sequence`.
+3. **Use `gesture-swipe` for lists/scrolling**, not `gesture-custom`, unless you need non-linear movement. On Electron use `gesture-scroll` instead — `gesture-swipe` is touch-only. Consider whether you need multiple swipes, if yes - use `run-sequence`.
4. **Tap a text field before typing** — on iOS try `paste` first then fall back to `keyboard`; on Android use `keyboard` directly (`paste` is iOS-only).
5. **Coordinates are normalized** — always 0.0–1.0, not pixels.
6. **For app navigation, prefer `describe` first.** It works on any screen without app restart. Do not navigate from screenshots on regular in-app screens unless `describe` failed to expose a reliable target. Use `native-describe-screen` only when you need app-scoped UIKit properties.
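
A minimal sketch of the `udid` shape dispatch described under "Unified tool surface" in this hunk. The three-way rule (UUID, `electron-cdp-<port>`, anything else) comes from the skill text; the exact regular expressions and the `platformFromUdid` name are assumptions, and the real resolver may be stricter or looser.

```typescript
type Platform = "ios" | "android" | "electron";

// Assumption: iOS simulator ids are canonical 8-4-4-4-12 hex UUIDs.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Assumption: Electron ids are the literal prefix followed by a decimal port.
const ELECTRON_PATTERN = /^electron-cdp-\d+$/;

// Hypothetical classifier: checks the two specific shapes first and treats
// everything else as an Android adb serial, as the skill text describes.
function platformFromUdid(udid: string): Platform {
  if (UUID_PATTERN.test(udid)) {
    return "ios";
  }
  if (ELECTRON_PATTERN.test(udid)) {
    return "electron";
  }
  return "android";
}
```

Because Android is the catch-all branch, a malformed id would be routed to adb under this sketch rather than rejected.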
@@ -48,35 +48,37 @@ Common schemes: `messages://`, `settings://`, `maps://?q=<query>`, `tel://<numbe

## 4. Choosing the Right Tool

-| Action | Tool | Notes |
-| ---------------- | ---------------- | ---------------------------------------------------------------------- |
-| Multiple actions | `run-sequence` | Batch steps in one call (no intermediate screenshots) |
-| Open an app | `launch-app` | **Always — never tap home-screen icons** |
-| Restart an app | `restart-app` | Terminate and relaunch by bundle ID |
-| Open URL/scheme | `open-url` | Web pages, deep links, URL schemes |
-| Single tap | `gesture-tap` | Buttons, links, checkboxes |
-| Scroll/swipe | `gesture-swipe` | Straight-line scroll or swipe |
-| Long press | `gesture-custom` | Context menus, drag start |
-| Drag & drop | `gesture-custom` | Complex drag interactions |
-| Pinch/zoom | `gesture-pinch` | Two-finger pinch with auto-interpolation |
-| Rotation | `gesture-rotate` | Two-finger rotation with auto-interpolation |
-| Custom gesture | `gesture-custom` | Arbitrary touch sequences, optional interpolation |
-| Hardware key | `button` | Home, back, power, volume, appSwitch, actionButton |
-| Type text (fast) | `paste` | iOS only. Form fields — uses clipboard |
-| Type text | `keyboard` | iOS+Android. Fallback when paste fails; supports Enter, Escape, arrows |
-| Rotate device | `rotate` | Orientation changes |
+| Action | Tool | Notes |
+| ----------------- | ---------------- | ---------------------------------------------------------------------- |
+| Multiple actions | `run-sequence` | Batch steps in one call (no intermediate screenshots) |
+| Open an app | `launch-app` | **Always — never tap home-screen icons** |
+| Restart an app | `restart-app` | Terminate and relaunch by bundle ID |
+| Open URL/scheme | `open-url` | Web pages, deep links, URL schemes |
+| Single tap | `gesture-tap` | Buttons, links, checkboxes |
+| Scroll/swipe | `gesture-swipe` | Straight-line scroll or swipe |
+| Scroll (Electron) | `gesture-scroll` | Wheel-based; deltas are window fractions, positive deltaY = down |
+| Drag (Electron) | `gesture-drag` | Sliders, drag-and-drop, text selection |
+| Long press | `gesture-custom` | Context menus, drag start |
+| Drag & drop | `gesture-custom` | Complex drag interactions |
+| Pinch/zoom | `gesture-pinch` | Two-finger pinch with auto-interpolation |
+| Rotation | `gesture-rotate` | Two-finger rotation with auto-interpolation |
+| Custom gesture | `gesture-custom` | Arbitrary touch sequences, optional interpolation |
+| Hardware key | `button` | Home, back, power, volume, appSwitch, actionButton |
+| Type text (fast) | `paste` | iOS only. Form fields — uses clipboard |
+| Type text | `keyboard` | iOS+Android. Fallback when paste fails; supports Enter, Escape, arrows |
+| Rotate device | `rotate` | Orientation changes |
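
A minimal sketch of what "deltas are window fractions, positive deltaY = down" in the `gesture-scroll` row could mean in practice: the normalized request is scaled by the window size into pixel values for a wheel event. The output shape follows the Chrome DevTools Protocol `Input.dispatchMouseEvent` parameters for a `mouseWheel` event; that the tool uses this call, the `ScrollRequest` field names other than `deltaY`, and the rounding are assumptions.

```typescript
interface WindowSize {
  width: number;
  height: number;
}

// All fields are normalized: x/y in 0.0-1.0, deltas as fractions of the window.
interface ScrollRequest {
  x: number;
  y: number;
  deltaX: number;
  deltaY: number;
}

// Subset of the CDP Input.dispatchMouseEvent parameters used for wheel events.
interface MouseWheelParams {
  type: "mouseWheel";
  x: number;
  y: number;
  deltaX: number;
  deltaY: number;
}

// Hypothetical conversion: scales the pointer position and the deltas by the
// window size. A positive deltaY stays positive, which scrolls content down.
function toMouseWheelParams(
  request: ScrollRequest,
  size: WindowSize,
): MouseWheelParams {
  return {
    type: "mouseWheel",
    x: Math.round(request.x * size.width),
    y: Math.round(request.y * size.height),
    deltaX: Math.round(request.deltaX * size.width),
    deltaY: Math.round(request.deltaY * size.height),
  };
}
```

For an 800x600 window, a request at the centre with `deltaY: 0.25` would map to a wheel event at (400, 300) with a `deltaY` of 150 pixels.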

## 5. Finding Tap Targets

IMPORTANT. When moved to a different screen after an action or do not know the coordinates of component, **always** perform proper discovery first.

-| App type | Discovery tool | What it returns |
-| --------------------------------- | ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Target app discovery | `describe` | Accessibility element tree for the current device screen (iOS AX-service or Android uiautomator) with normalized frame coordinates. Works on any app, system dialogs, and Home screen — no app restart or `bundleId` required |
-| React Native | `debugger-component-tree` | React component tree with names, text, testID, and (tap: x,y) |
-| App-scoped native | `native-describe-screen` | Low-level app-scoped accessibility elements with normalized and raw coordinates; requires `bundleId` |
-| Permission / system modal overlay | `describe` | `describe` detects system dialogs automatically and returns dialog buttons with tap coordinates. Fall back to `screenshot` only if `describe` does not expose the controls |
-| Final visual fallback | `screenshot` | Use only when discovery tools cannot inspect the current UI reliably. Do not derive routine in-app navigation targets from screenshots |
+| App type | Discovery tool | What it returns |
+| --------------------------------- | ------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Target app discovery | `describe` | Accessibility element tree for the current device screen (iOS AX-service, Android uiautomator, or Electron DOM walker) with normalized frame coordinates. Works on any app, system dialogs, and Home screen — no app restart or `bundleId` required |
+| React Native | `debugger-component-tree` | React component tree with names, text, testID, and (tap: x,y) |
+| App-scoped native | `native-describe-screen` | Low-level app-scoped accessibility elements with normalized and raw coordinates; requires `bundleId` |
+| Permission / system modal overlay | `describe` | `describe` detects system dialogs automatically and returns dialog buttons with tap coordinates. Fall back to `screenshot` only if `describe` does not expose the controls |
+| Final visual fallback | `screenshot` | Use only when discovery tools cannot inspect the current UI reliably. Do not derive routine in-app navigation targets from screenshots |

Point follow-up native diagnostics after you already have a candidate point:
