docs/architecture/browser-broker.md
+2 -2 (2 additions & 2 deletions)
@@ -19,8 +19,8 @@ When `OPENCLAW_RUNTIME_BROWSER_BROKER=cdp` and `OPENCLAW_SPARSEKERNEL_BROWSER_CD
Set `OPENCLAW_RUNTIME_BROWSER_BROKER=native` to let SparseKernel launch and supervise a local Chromium-compatible process pool by trust zone and profile. The native pool uses a loopback-only remote debugging endpoint, a runtime-owned browser profile directory, and pooled process refcounts; the leased CDP context is released first, then the browser process is stopped after the pool idle timeout. Use `OPENCLAW_SPARSEKERNEL_BROWSER_EXECUTABLE` when Chrome/Chromium is not discoverable on `PATH` or a common platform path. Headless mode is on by default. `OPENCLAW_SPARSEKERNEL_BROWSER_NO_SANDBOX=1` is an explicit opt-out and should only be used when the host environment cannot run Chromium's sandbox.
-
Supported v0 actions (`status`, `doctor`, `profiles`, `tabs`, `open`, `navigate`, `focus`, `close`, `snapshot`, `console`, `screenshot`, `pdf`, direct file-input `upload`, `dialog`, and brokered `act`) operate against the leased CDP context. Brokered `act` covers the OpenClaw action contract for click, coordinate click, type, press, hover, scroll, drag, select, fill, resize, wait, evaluate, close, and batch using CDP input events plus bounded DOM evaluation. Selector-backed actions retry inside the leased page until their action timeout, and `wait --load networkidle` uses CDP Network events plus a quiet window rather than only checking `document.readyState`. Snapshots use a bounded CDP `Runtime.evaluate` DOM read, actions resolve refs from the latest brokered snapshot where needed, console output is captured from CDP runtime/log events, and screenshot/PDF output is captured as SparseKernel artifacts, read back through artifact access, and converted to existing tool result formats for compatibility. The context is retained for the active embedded run and released during broker cleanup, not opened and closed for every browser tool call.
+
Supported v0 actions (`status`, `doctor`, `profiles`, `tabs`, `open`, `navigate`, `focus`, `close`, `snapshot`, `console`, `screenshot`, `pdf`, direct file-input `upload`, `dialog`, and brokered `act`) operate against the leased CDP context. Brokered `act` covers the OpenClaw action contract for click, coordinate click, type, press, hover, scroll, drag, select, fill, resize, wait, evaluate, close, and batch using CDP input events plus bounded DOM evaluation. Selector-backed actions retry inside the leased page until their action timeout, and `wait --load networkidle` uses CDP Network events plus a quiet window rather than only checking `document.readyState`. Actions that can change page state are followed by a broker-side navigation check: same-target navigations are accepted only when the resulting URL stays inside the context's allowed-origin policy, while new tabs/windows are closed and rejected in v0 because they are unleased targets. Snapshots use a bounded CDP `Runtime.evaluate` DOM read, actions resolve refs from the latest brokered snapshot where needed, console output is captured from CDP runtime/log events, and screenshot/PDF output is captured as SparseKernel artifacts, read back through artifact access, and converted to existing tool result formats for compatibility. The context is retained for the active embedded run and released during broker cleanup, not opened and closed for every browser tool call.
BrowserContext isolation is session isolation, not host isolation. Playwright route blocking is useful request control, not a hard security boundary.
-
The brokered CDP action engine is intentionally implemented inside the broker boundary, but it is not a byte-for-byte Playwright clone. Behaviors that depend on Playwright's full actionability checks, locator retry model, navigation guard, or network-idle semantics should still be covered by targeted tests before relying on them for critical automations. Set `OPENCLAW_SPARSEKERNEL_BROWSER_LIVE=1` when running `src/local-kernel/browser-cdp-live.test.ts` to exercise the native pool and brokered actions against a real local Chromium-compatible browser.
+
The brokered CDP action engine is intentionally implemented inside the broker boundary, but it is not a byte-for-byte Playwright clone. Behaviors that depend on Playwright's full actionability checks, locator retry model, popup ownership, or network-idle semantics should still be covered by targeted tests before relying on them for critical automations. Set `OPENCLAW_SPARSEKERNEL_BROWSER_LIVE=1` when running `src/local-kernel/browser-cdp-live.test.ts` to exercise the native pool and brokered actions against a real local Chromium-compatible browser.
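
The post-action navigation check described in this change can be illustrated with a minimal TypeScript sketch. Every name here (`PostActionState`, `checkPostActionNavigation`, `closeTarget`) is hypothetical and chosen for illustration; the sketch also assumes allowed origins are stored as exact origin strings such as `https://example.com`, and that an absent policy means same-target navigations are accepted. It is not the broker's actual implementation.

```typescript
// Hypothetical sketch of a broker-side post-action navigation check.
// None of these names come from the SparseKernel or OpenClaw source.
type NavigationVerdict = { ok: true } | { ok: false; reason: string };

interface PostActionState {
  // URL of the leased target after the action completed.
  currentUrl: string;
  // Ids of targets (tabs/windows) that appeared during the action.
  newTargetIds: string[];
}

// Assumption: the policy lists exact origins, e.g. "https://example.com".
function isAllowedOrigin(url: string, allowedOrigins: readonly string[]): boolean {
  let origin: string;
  try {
    origin = new URL(url).origin;
  } catch {
    return false; // an unparseable URL is never inside the policy
  }
  return allowedOrigins.includes(origin);
}

function checkPostActionNavigation(
  state: PostActionState,
  allowedOrigins: readonly string[] | undefined,
  closeTarget: (targetId: string) => void,
): NavigationVerdict {
  // New tabs/windows are unleased targets: close them and reject the action.
  if (state.newTargetIds.length > 0) {
    for (const id of state.newTargetIds) {
      closeTarget(id);
    }
    return { ok: false, reason: "action opened an unleased tab or window" };
  }
  // Assumption: with no policy configured, same-target navigation is accepted.
  if (allowedOrigins === undefined) {
    return { ok: true };
  }
  // Same-target navigation must stay inside the allowed origins.
  if (!isAllowedOrigin(state.currentUrl, allowedOrigins)) {
    return { ok: false, reason: `navigation left allowed origins: ${state.currentUrl}` };
  }
  return { ok: true };
}
```

A caller would run a check like this after every action that can change page state, and surface a rejected verdict as a tool error rather than continuing on the new page.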
docs/architecture/local-agent-kernel.md
+2 -2 (2 additions & 2 deletions)
@@ -96,7 +96,7 @@ The browser broker model is:
Important boundary: BrowserContext isolation is session isolation, not host isolation. Playwright route blocking and SSRF guards are useful controls, but they are not hard security boundaries.
-
The broker applies configured trust-zone network policy to explicit allowed origins before allocating a context. This is an egress guard for brokered contexts, not a kernel or VM boundary. Set `OPENCLAW_RUNTIME_BROWSER_BROKER=cdp` and `OPENCLAW_SPARSEKERNEL_BROWSER_CDP_ENDPOINT=<loopback endpoint>` to make the OpenClaw browser tool acquire a real SparseKernel CDP context for the active run. Set `OPENCLAW_RUNTIME_BROWSER_BROKER=managed` to use the existing OpenClaw browser control service as the managed process owner and let SparseKernel lease CDP contexts from its reported endpoint. Set `OPENCLAW_RUNTIME_BROWSER_BROKER=native` to let SparseKernel launch and supervise a local Chromium-compatible process pool keyed by trust zone/profile, with process lifetime tied to brokered context leases and idle timeout. The runtime injects an internal browser proxy for supported navigation, tab, snapshot, console, screenshot, PDF, direct file-input upload, dialog, and action routes instead of exposing raw CDP to the agent. Brokered actions cover the OpenClaw action contract with CDP input events, bounded DOM evaluation, selector retry, and CDP-backed network-idle waiting; screenshot and PDF outputs go through the artifact store.
+
The broker applies configured trust-zone network policy to explicit allowed origins before allocating a context. This is an egress guard for brokered contexts, not a kernel or VM boundary. Set `OPENCLAW_RUNTIME_BROWSER_BROKER=cdp` and `OPENCLAW_SPARSEKERNEL_BROWSER_CDP_ENDPOINT=<loopback endpoint>` to make the OpenClaw browser tool acquire a real SparseKernel CDP context for the active run. Set `OPENCLAW_RUNTIME_BROWSER_BROKER=managed` to use the existing OpenClaw browser control service as the managed process owner and let SparseKernel lease CDP contexts from its reported endpoint. Set `OPENCLAW_RUNTIME_BROWSER_BROKER=native` to let SparseKernel launch and supervise a local Chromium-compatible process pool keyed by trust zone/profile, with process lifetime tied to brokered context leases and idle timeout. The runtime injects an internal browser proxy for supported navigation, tab, snapshot, console, screenshot, PDF, direct file-input upload, dialog, and action routes instead of exposing raw CDP to the agent. Brokered actions cover the OpenClaw action contract with CDP input events, bounded DOM evaluation, selector retry, CDP-backed network-idle waiting, and post-action navigation checks. Same-target action navigations must stay inside the context's allowed origins when a policy is configured; new tabs/windows are closed and rejected until the broker owns a multi-tab lease model. Screenshot and PDF outputs go through the artifact store.
## Sandbox broker
@@ -138,7 +138,7 @@ Session metadata writes are mirrored into the runtime ledger by default. Set `OP
## Current limitations
- Full transcript ownership is still staged: session metadata can run with SQLite as primary, assistant transcript mirrors and import/export are ledger-backed, and the pi session manager still appends JSONL.
-
- Native browser process pooling exists for loopback CDP contexts, but it is still a small supervisor around Chromium. Brokered actions now cover selector retry and CDP-backed network-idle waits, but the engine is still not a full Playwright-equivalent process manager and does not provide host isolation.
+
- Native browser process pooling exists for loopback CDP contexts, but it is still a small supervisor around Chromium. Brokered actions now cover selector retry, CDP-backed network-idle waits, and post-action navigation checks, but the engine is still not a full Playwright-equivalent process manager and does not provide host isolation.
- Sandbox allocation records requested bwrap/minijail/Docker backends and checks availability, but local/no-isolation still does not harden execution.
- Native plugins still execute in process; tool invocation is brokered and audited but untrusted plugin isolation is a later process-boundary change.
- Network policy enforcement currently exists where brokers call it; host-level egress proxying is still future work.
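
The native pool's lifetime rule, where a process is keyed by trust zone and profile, kept alive by context leases, and stopped only after an idle timeout, could look roughly like the following. This is a minimal sketch under stated assumptions: `BrowserProcessPool`, its `launch` callback, and the key format are all hypothetical and are not taken from the SparseKernel source.

```typescript
// Hypothetical sketch of a refcounted process pool with an idle timeout.
// None of these names come from the SparseKernel source.
interface PoolEntry {
  refs: number;
  idleTimer: ReturnType<typeof setTimeout> | undefined;
  stop: () => void;
}

type LaunchFn = (key: string) => { stop: () => void };

class BrowserProcessPool {
  private readonly entries = new Map<string, PoolEntry>();
  private readonly launch: LaunchFn;
  private readonly idleTimeoutMs: number;

  constructor(launch: LaunchFn, idleTimeoutMs: number) {
    this.launch = launch;
    this.idleTimeoutMs = idleTimeoutMs;
  }

  // Lease a context: reuse the process for this trust zone/profile or launch one.
  acquire(trustZone: string, profile: string): string {
    const key = `${trustZone}/${profile}`; // assumed key format
    let entry = this.entries.get(key);
    if (entry === undefined) {
      const proc = this.launch(key);
      entry = { refs: 0, idleTimer: undefined, stop: () => proc.stop() };
      this.entries.set(key, entry);
    }
    // A new lease cancels any pending idle shutdown.
    if (entry.idleTimer !== undefined) {
      clearTimeout(entry.idleTimer);
      entry.idleTimer = undefined;
    }
    entry.refs += 1;
    return key;
  }

  // Release a lease: the process is stopped only after the idle timeout elapses.
  release(key: string): void {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.refs === 0) {
      return;
    }
    entry.refs -= 1;
    if (entry.refs > 0) {
      return;
    }
    entry.idleTimer = setTimeout(() => {
      this.entries.delete(key);
      entry.stop();
    }, this.idleTimeoutMs);
  }

  // Number of active leases for a key (0 if no process is pooled).
  refCount(key: string): number {
    return this.entries.get(key)?.refs ?? 0;
  }
}
```

Deferring the stop to a timer means a run that releases and quickly re-acquires a context reuses the same browser process instead of paying for a relaunch. As the limitations above note, a supervisor of this kind provides no host isolation.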