75 changes: 75 additions & 0 deletions docs/PTY_AGENTS.md
@@ -0,0 +1,75 @@
# PTY agents — steering real CLIs (Stage C spike)

**Status: EXPERIMENTAL. Off by default.** This is the Stage-C spike from the agent
control-plane plan — the path toward commanding *real* `claude` / `codex` CLIs,
not just LISA's own managed agents.

## What it is

A **managed agent** (Phase 3) runs LISA's *own* agent loop — its tools, its
provider. A **PTY agent** instead spawns the **real `claude` / `codex` binary**
inside a pseudo-terminal (`node-pty`), so you get that CLI's full configuration —
its skills, MCP servers, hooks, model — while LISA owns stdin/stdout:

- types your task and any follow-ups into the CLI,
- can answer its prompts (you type into it from the roster),
- reads the terminal stream for a coarse live status + a viewable output tail.

In the GUI agents card these appear under their **real kind** (`claude-code` /
`codex`), marked controllable: a **type-into-the-CLI** box, a **▤ output** button
(shows the captured terminal tail in a modal), and **⏹ cancel**.
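The "LISA owns stdin/stdout" idea can be sketched against a minimal pty-like interface. This is an illustration only, not the `PtyAgent` in `src/agents/pty.ts`: the names `PtyLike` and `PtySessionSketch`, the 4096-character tail cap, and the choice to skip an empty task are all assumptions. The only behavior taken from the PR's tests is that each typed line is written with a trailing `\r`.

```typescript
// Minimal shape of a pty process, modeled on the onData / write / kill
// methods that node-pty exposes. Hypothetical name.
interface PtyLike {
  onData(cb: (data: string) => void): void;
  write(data: string): void;
  kill(): void;
}

// Hypothetical wrapper: types lines into the CLI and keeps an output tail.
class PtySessionSketch {
  private tail = "";
  private closed = false;
  private readonly proc: PtyLike;
  private readonly maxTail: number;

  constructor(proc: PtyLike, task: string, maxTail = 4096) {
    this.proc = proc;
    this.maxTail = maxTail;
    // Capture the terminal stream, keeping only the last maxTail characters.
    proc.onData((chunk) => {
      this.tail = (this.tail + chunk).slice(-this.maxTail);
    });
    // Type the initial task (assumption: an empty task types nothing).
    if (task !== "") this.send(task);
  }

  // Type one line into the CLI; "\r" plays the role of the Enter key.
  send(text: string): boolean {
    if (this.closed) return false;
    this.proc.write(text + "\r");
    return true;
  }

  // Kill the CLI once; later calls are no-ops.
  cancel(): void {
    if (this.closed) return;
    this.closed = true;
    this.proc.kill();
  }

  // The captured output tail (raw here; ANSI stripping would be a separate step).
  output(): string {
    return this.tail;
  }
}
```

In real use the `PtyLike` would come from `node-pty`'s `spawn`; in tests a hand-written fake can be passed instead, which is the approach the PR's own test file takes.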

## Enabling it

1. Install the optional native dep (it has zero JS deps; if your machine can't
   build it, nothing else in LISA is affected):
   ```sh
   npm i node-pty
   ```
2. Turn the spike on:
   ```sh
   LISA_PTY_AGENTS=1 lisa serve --web
   ```
3. In the agents card, pick `claude` or `codex` in the delegate picker, type a
   task, hit ▶. (Without the flag the start endpoint returns `503` and the GUI
   shows the hint.)

Binary resolution is env-overridable: `LISA_PTY_CLAUDE_CMD`, `LISA_PTY_CODEX_CMD`.
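A minimal sketch of how such an override could work, assuming the env var simply replaces the default binary name for its agent kind. The function name `resolveCliSketch`, the explicit `env` parameter, and the treatment of a blank value as unset are assumptions; the real `resolveCli` in `src/agents/pty.ts` may differ.

```typescript
type AgentKindSketch = "claude" | "claude-code" | "codex";

// Hypothetical resolver: the env override wins, else the default binary name.
function resolveCliSketch(
  agent: AgentKindSketch,
  env: Record<string, string | undefined> = {},
): string {
  const isClaude = agent === "claude" || agent === "claude-code";
  const override = isClaude ? env["LISA_PTY_CLAUDE_CMD"] : env["LISA_PTY_CODEX_CMD"];
  const fallback = isClaude ? "claude" : "codex";
  // Assumption: a blank override is treated the same as an unset one.
  return override !== undefined && override.trim() !== "" ? override : fallback;
}
```

Passing `process.env` as the second argument would apply the overrides from the current environment; taking `env` as a parameter keeps the helper pure and easy to test.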

## Honest limits (why it's a flagged spike, not a shipped feature)

- **Only CLIs LISA spawns.** It cannot adopt a `claude`/`codex` session you
already opened in your own terminal — those have no control channel and stay
**observe-only**. (Commanding *those* would need Claude Code's undocumented,
version-locked `peerProtocol` — not attempted here.)
- **Best-effort output parsing.** The CLI's TUI is ANSI / box-drawn and
version-sensitive, so `state` is inferred from output *quiescence*
(streaming → working, quiet → waiting), not from parsed intent.
- **Native dep.** `node-pty` is an `optionalDependency`; installs and CI never
fail if it can't build — PTY agents are simply unavailable then.
- **Privacy.** A PTY agent captures the full terminal, including model replies.
That content is shown to **you** on demand (`/api/agents/pty/<id>/output`) and
is **never** folded into the structural cross-agent roster, which stays
metadata-only like every observer.
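The quiescence rule from the output-parsing bullet can be sketched as a pure function. The argument order (last output timestamp, current time, quiet threshold in milliseconds) is inferred from how the PR's tests call `derivePtyState`; the name `derivePtyStateSketch` and the behavior exactly at the threshold are assumptions.

```typescript
type PtyActivitySketch = "working" | "waiting";

// Hypothetical: the state depends only on how long the terminal has been quiet.
function derivePtyStateSketch(
  lastDataAt: number, // timestamp (ms) of the most recent terminal output
  now: number, // current timestamp (ms)
  quietMs: number, // how long without output counts as "quiet"
): PtyActivitySketch {
  // Assumption: exactly quietMs of silence still counts as working.
  return now - lastDataAt > quietMs ? "waiting" : "working";
}
```

Because the function takes `now` as an argument instead of reading the clock, tests can drive it with fixed timestamps. The trade-off is the one the bullet names: a CLI that is thinking silently looks the same as one waiting for input.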

## Endpoints (all behind the standard loopback-or-token auth gate)

| Method + path | Body | Effect |
| --- | --- | --- |
| `POST /api/agents/pty/start` | `{ agent, task, cwd? }` | spawn a PTY agent (503 if flag off) |
| `POST /api/agents/pty/<id>/send` | `{ text }` | type a line into the CLI |
| `POST /api/agents/pty/<id>/cancel` | — | kill the CLI |
| `GET /api/agents/pty/<id>/output` | — | ANSI-stripped terminal tail |
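As a rough client-side illustration, the table can be mirrored by a small request builder. The paths and body fields come from the table; the names `PtyRequestSketch` and `ptyRoutes` are hypothetical, and the server's host, port, auth header, and response shapes are not specified here, so they are left out.

```typescript
// Hypothetical request description; hand it to fetch or any HTTP client.
interface PtyRequestSketch {
  method: "GET" | "POST";
  path: string;
  body?: string;
}

const ptyRoutes = {
  // POST /api/agents/pty/start with { agent, task, cwd? }
  start(agent: string, task: string, cwd?: string): PtyRequestSketch {
    const payload = cwd === undefined ? { agent, task } : { agent, task, cwd };
    return { method: "POST", path: "/api/agents/pty/start", body: JSON.stringify(payload) };
  },
  // POST /api/agents/pty/<id>/send with { text }
  send(id: string, text: string): PtyRequestSketch {
    return {
      method: "POST",
      path: `/api/agents/pty/${encodeURIComponent(id)}/send`,
      body: JSON.stringify({ text }),
    };
  },
  // POST /api/agents/pty/<id>/cancel (no body)
  cancel(id: string): PtyRequestSketch {
    return { method: "POST", path: `/api/agents/pty/${encodeURIComponent(id)}/cancel` };
  },
  // GET /api/agents/pty/<id>/output (no body)
  output(id: string): PtyRequestSketch {
    return { method: "GET", path: `/api/agents/pty/${encodeURIComponent(id)}/output` };
  },
};
```

A caller would prepend the base URL of its own `lisa serve --web` instance and attach whatever the auth gate requires; a `503` from the start route means the `LISA_PTY_AGENTS` flag is off.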

## Code

- `src/agents/pty.ts` — `PtyAgent` + `PtyRegistry` (+ pure `stripAnsi` /
`derivePtyState` / `resolveCli`), dynamic `node-pty` import with graceful
fallback, flag gate.
- `src/integrations/pty/observer.ts` — surfaces PTY agents in the hub roster
(real kind, `controllable: "pty"`).
- `src/web/server.ts` — the endpoints above.
- GUI: delegate kind picker + `controllable`-family row controls in
`lisa-html.ts` / `lisa-client.ts` / `lisa-css.ts`.
- Tests: `src/agents/pty.test.ts` (pure helpers + lifecycle via an injected fake
pty; one real-`node-pty` round-trip that skips when the dep can't spawn).
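For a sense of what a pure `stripAnsi`-style helper involves, here is a regex-based sketch. It is an assumption about the approach, not the code in `src/agents/pty.ts`: it removes OSC sequences (such as OSC-8 hyperlinks), CSI sequences (colors, erase-line), and leftover control bytes, while keeping newlines and tabs. The name `stripAnsiSketch` is hypothetical.

```typescript
// OSC: ESC ] ... terminated by BEL or by ST (ESC \).
const OSC_SEQ = /\u001b\][^\u0007\u001b]*(?:\u0007|\u001b\\)/g;
// CSI: ESC [ parameter bytes, intermediate bytes, then one final byte.
const CSI_SEQ = /\u001b\[[0-?]*[ -\/]*[@-~]/g;
// Remaining control bytes, except tab (0x09) and newline (0x0a).
const BARE_CTRL = /[\u0000-\u0008\u000b-\u001f\u007f]/g;

// Hypothetical helper: terminal output in, plain text out.
function stripAnsiSketch(s: string): string {
  return s.replace(OSC_SEQ, "").replace(CSI_SEQ, "").replace(BARE_CTRL, "");
}
```

OSC sequences are removed first so that their terminating BEL is consumed along with the sequence and not treated as a stray control byte. Escape sequences outside these two families are not handled; only their ESC byte is dropped by the final pass.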
26 changes: 22 additions & 4 deletions package-lock.json


3 changes: 3 additions & 0 deletions package.json
@@ -76,5 +76,8 @@
    "sharp": "^0.34.5",
    "tsx": "^4.21.0",
    "typescript": "^5.7.0"
  },
  "optionalDependencies": {
    "node-pty": "^1.1.0"
  }
}
195 changes: 195 additions & 0 deletions src/agents/pty.test.ts
@@ -0,0 +1,195 @@
import { test } from "node:test";
import assert from "node:assert/strict";
import {
  stripAnsi,
  derivePtyState,
  ptyEnabled,
  resolveCli,
  normalizeAgentKind,
  PtyAgent,
  PtyRegistry,
  type IPtyLike,
  type PtyModuleLike,
} from "./pty.js";

const ESC = String.fromCharCode(27);
const BEL = String.fromCharCode(7);

/** A fake node-pty: capture writes/kills, drive data/exit by hand. */
function fakePty() {
  let dataCb: ((d: string) => void) | null = null;
  let exitCb: ((e: { exitCode: number }) => void) | null = null;
  const written: string[] = [];
  let killed = false;
  const proc: IPtyLike = {
    onData(cb) {
      dataCb = cb;
    },
    onExit(cb) {
      exitCb = cb;
    },
    write(d) {
      written.push(d);
    },
    kill() {
      killed = true;
    },
  };
  const module: PtyModuleLike = { spawn: () => proc };
  return {
    module,
    written,
    emitData: (s: string) => dataCb?.(s),
    emitExit: (code: number) => exitCb?.({ exitCode: code }),
    isKilled: () => killed,
  };
}

async function withFlag<T>(fn: () => Promise<T> | T): Promise<T> {
  const prev = process.env.LISA_PTY_AGENTS;
  process.env.LISA_PTY_AGENTS = "1";
  try {
    return await fn();
  } finally {
    if (prev === undefined) delete process.env.LISA_PTY_AGENTS;
    else process.env.LISA_PTY_AGENTS = prev;
  }
}

// ── pure helpers ──

test("stripAnsi removes color, OSC-8 hyperlinks, and bare control bytes", () => {
  const s =
    ESC + "[31mred" + ESC + "[0m " + ESC + "]8;;http://example.com/x" + BEL + "link" + ESC + "]8;;" + BEL + " done" + ESC + "[2K";
  assert.equal(stripAnsi(s), "red link done");
  assert.equal(stripAnsi("a\rb\bc"), "abc");
  assert.equal(stripAnsi("plain"), "plain");
});

test("derivePtyState: streaming → working, quiet → waiting", () => {
  assert.equal(derivePtyState(1000, 2000, 4000), "working");
  assert.equal(derivePtyState(1000, 4999, 4000), "working");
  assert.equal(derivePtyState(1000, 5001, 4000), "waiting");
});

test("ptyEnabled reflects LISA_PTY_AGENTS", async () => {
  const prev = process.env.LISA_PTY_AGENTS;
  delete process.env.LISA_PTY_AGENTS;
  assert.equal(ptyEnabled(), false);
  await withFlag(() => assert.equal(ptyEnabled(), true));
  if (prev !== undefined) process.env.LISA_PTY_AGENTS = prev;
});

test("resolveCli + normalizeAgentKind map agent kinds", () => {
  assert.equal(resolveCli("claude"), "claude");
  assert.equal(resolveCli("claude-code"), "claude");
  assert.equal(resolveCli("codex"), "codex");
  assert.equal(normalizeAgentKind("claude"), "claude-code");
  assert.equal(normalizeAgentKind("claude-code"), "claude-code");
  assert.equal(normalizeAgentKind("codex"), "codex");
});

// ── lifecycle (fake pty) ──

test("start is blocked unless the spike flag is on", async () => {
  const prev = process.env.LISA_PTY_AGENTS;
  delete process.env.LISA_PTY_AGENTS;
  await assert.rejects(
    () => PtyAgent.start({ agent: "claude", task: "x", cwd: "/tmp", ptyModule: fakePty().module }),
    /disabled/,
  );
  if (prev !== undefined) process.env.LISA_PTY_AGENTS = prev;
});

test("start types the task; send appends; data drives state; cancel kills", async () => {
  await withFlag(async () => {
    const f = fakePty();
    const clock = { t: 1000 };
    const reg = new PtyRegistry();
    const v = await reg.start({
      agent: "claude",
      task: "do the thing",
      cwd: "/Users/me/myproj",
      ptyModule: f.module,
      now: () => clock.t,
    });
    // identity + initial task typed in
    assert.equal(v.agent, "claude-code");
    assert.equal(v.cli, "claude");
    assert.equal(v.project, "myproj");
    assert.equal(f.written[0], "do the thing\r");

    // follow-up
    assert.equal(reg.send(v.id, "also lint"), true);
    assert.equal(f.written[1], "also lint\r");

    // output capture (ANSI-stripped) + working while recent
    clock.t = 2000;
    f.emitData(ESC + "[32mhello" + ESC + "[0m");
    assert.match(reg.output(v.id) ?? "", /hello/);
    clock.t = 2500;
    assert.equal(reg.list()[0].state, "working");
    clock.t = 7000;
    assert.equal(reg.list()[0].state, "waiting");

    // cancel → killed + done; idempotent; no writes after
    assert.equal(reg.cancel(v.id), true);
    assert.equal(f.isKilled(), true);
    const after = reg.list()[0];
    assert.equal(after.state, "done");
    assert.equal(after.stateReason, "cancelled");
    const writes = f.written.length;
    reg.send(v.id, "ignored");
    assert.equal(f.written.length, writes);
    reg.cancel(v.id); // idempotent, no throw
    assert.equal(reg.list()[0].state, "done");
  });
});

test("process exit marks the agent done", async () => {
  await withFlag(async () => {
    const f = fakePty();
    const reg = new PtyRegistry();
    const v = await reg.start({ agent: "codex", task: "go", cwd: "/tmp/p", ptyModule: f.module });
    f.emitExit(0);
    const view = reg.list()[0];
    assert.equal(view.agent, "codex");
    assert.equal(view.state, "done");
    assert.equal(view.stateReason, "exit 0");
  });
});

test("registry actions on an unknown id are no-ops", () => {
  const reg = new PtyRegistry();
  assert.equal(reg.send("nope", "x"), false);
  assert.equal(reg.cancel("nope"), false);
  assert.equal(reg.output("nope"), null);
  assert.deepEqual(reg.list(), []);
});

// ── real node-pty round-trip (skipped if the optional dep isn't built) ──

test("real PTY round-trip via `cat` echoes input", async (t) => {
  // `cat` under a PTY echoes typed input back on stdout, which proves the real
  // spawn → write → read → kill path without depending on a heavy CLI.
  // Skips when node-pty isn't built OR its native binding can't spawn in this
  // environment (e.g. under the tsx test loader, which resolves node-pty's TS
  // source rather than its compiled native lib; that is a runner artifact, not
  // a defect, since the shipped path runs against compiled JS).
  const reg = new PtyRegistry();
  let v;
  try {
    await import("node-pty");
    v = await withFlag(() =>
      reg.start({ agent: "claude", task: "", cwd: process.cwd(), cli: "cat", args: [] }),
    );
  } catch (e) {
    t.skip("node-pty unavailable here: " + (e as Error).message);
    return;
  }
  reg.send(v.id, "ping-marker-42");
  await new Promise((r) => setTimeout(r, 300));
  assert.match(reg.output(v.id) ?? "", /ping-marker-42/);
  reg.cancel(v.id);
  assert.equal(reg.list()[0].state, "done");
});