Replace Playwright with Kernel native API in OpenAI CUA templates#124
Both TypeScript and Python OpenAI CUA templates now use Kernel's native computer control API (screenshot, click, type, scroll, batch, etc.) instead of Playwright over CDP. This enables the batch_computer_actions tool, which executes multiple actions in a single API call for lower latency.

Key changes:
- New KernelComputer class wrapping Kernel SDK for all computer actions
- Added batch_computer_actions function tool with system instructions
- Navigation (goto/back/forward) via Kernel's playwright.execute endpoint
- Local test scripts create remote Kernel browsers without app deployment
- Removed playwright-core, sharp (TS) and playwright (Python) dependencies
- Bumped @onkernel/sdk to ^0.38.0 and kernel to >=0.38.0

Made-with: Cursor
}

const currentUrl = await this.computer.getCurrentUrl();
utils.checkBlocklistedUrl(currentUrl);
TypeScript URL blocklist check return value silently ignored
Medium Severity
checkBlocklistedUrl returns a boolean, but agent.ts discards the return value, making the URL blocklist entirely non-functional. The Python counterpart correctly raises a ValueError to halt execution. Previously, Playwright's route-level route.abort() handler provided actual network-level blocking, but that was removed in this PR, leaving no working URL blocking in the TypeScript template.
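A minimal sketch of the raising behaviour this comment attributes to the Python template: a check that halts the caller instead of returning a boolean that can be dropped. The function name `check_blocklisted_url` and the domains in `BLOCKED_DOMAINS` are hypothetical placeholders, not the template's actual list; the host-matching rule mirrors the one in the TypeScript helper.

```python
from urllib.parse import urlparse

# Hypothetical blocklist for illustration only; the template's real list is not shown here.
BLOCKED_DOMAINS = ["blocked.example", "malware.example"]


def check_blocklisted_url(url: str) -> None:
    """Raise ValueError if the URL's host is a blocked domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    if any(host == d or host.endswith(f".{d}") for d in BLOCKED_DOMAINS):
        raise ValueError(f"Blocked URL: {url}")
```

Because the function raises rather than returns, a caller that forgets to inspect the result still stops on a blocked page.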
from .kernel_computer import KernelComputer

computers_config = {
    "local-playwright": LocalPlaywrightBrowser,
        return "left"
    if isinstance(button, int):
        return {1: "left", 2: "middle", 3: "right"}.get(button, "left")
    return str(button)
Missing handling for special click button values
Medium Severity
The CUA model can send click actions with button set to "back", "forward", or "wheel". The deleted Playwright code explicitly handled these by routing to self.back(), self.forward(), or mouse.wheel(). The new _normalize_button/normalizeButton functions pass these strings through unchanged to the Kernel click_mouse API, which only accepts "left", "right", or "middle" — causing an API error when the model uses these button types.
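One way to picture the gap described here is a classifier that separates real mouse buttons from the special values before anything reaches the click endpoint. This is a sketch under assumptions: `classify_button` and its return shape are hypothetical, and rejecting unknown strings with `ValueError` is a design choice of this example, not the template's behaviour (the template falls back to `str(button)`).

```python
from typing import Any, Tuple

# Button names the comment says the Kernel click endpoint accepts.
KERNEL_BUTTONS = {"left", "middle", "right"}
# Values the comment says the CUA model may also send.
SPECIAL_BUTTONS = {"back", "forward", "wheel"}


def classify_button(button: Any) -> Tuple[str, str]:
    """Return ("click", name) for a real mouse button or ("special", name) otherwise."""
    if button is None:
        return ("click", "left")
    if isinstance(button, int):
        # Numeric buttons use the same 1/2/3 mapping as the template's normalizer.
        return ("click", {1: "left", 2: "middle", 3: "right"}.get(button, "left"))
    name = str(button)
    if name in SPECIAL_BUTTONS:
        return ("special", name)
    if name in KERNEL_BUTTONS:
        return ("click", name)
    # Assumption of this sketch: fail loudly instead of forwarding an unknown string.
    raise ValueError(f"Unsupported button value: {button!r}")
```

A caller would send only the `"click"` results to the click endpoint and route `"special"` results to navigation or scrolling.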
Bugbot Autofix prepared fixes for 3 of the 3 bugs found in the latest run.
Or push these changes by commenting:
Preview (a9e2870223)
diff --git a/pkg/templates/python/openai-computer-use/computers/config.py b/pkg/templates/python/openai-computer-use/computers/config.py
deleted file mode 100644
--- a/pkg/templates/python/openai-computer-use/computers/config.py
+++ /dev/null
@@ -1,5 +1,0 @@
-from .kernel_computer import KernelComputer
-
-computers_config = {
- "kernel": KernelComputer,
-}
\ No newline at end of file
diff --git a/pkg/templates/python/openai-computer-use/computers/kernel_computer.py b/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
--- a/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
+++ b/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
@@ -74,12 +74,22 @@
 def _translate_cua_action(action: Dict[str, Any]) -> Dict[str, Any]:
     action_type = action.get("type", "")
     if action_type == "click":
+        button = action.get("button")
+        if button == "back":
+            return {"type": "press_key", "press_key": {"keys": ["Alt_L", "Left"]}}
+        if button == "forward":
+            return {"type": "press_key", "press_key": {"keys": ["Alt_L", "Right"]}}
+        if button == "wheel":
+            return {
+                "type": "scroll",
+                "scroll": {"x": action.get("x", 0), "y": action.get("y", 0), "delta_x": 0, "delta_y": 0},
+            }
         return {
             "type": "click_mouse",
             "click_mouse": {
                 "x": action.get("x", 0),
                 "y": action.get("y", 0),
-                "button": _normalize_button(action.get("button")),
+                "button": _normalize_button(button),
             },
         }
     elif action_type == "double_click":
@@ -134,6 +144,15 @@
         return base64.b64encode(resp.read()).decode("utf-8")

     def click(self, x: int, y: int, button="left") -> None:
+        if button == "back":
+            self.back()
+            return
+        if button == "forward":
+            self.forward()
+            return
+        if button == "wheel":
+            self.scroll(x, y, 0, 0)
+            return
         self.client.browsers.computer.click_mouse(self.session_id, x=x, y=y, button=_normalize_button(button))

     def double_click(self, x: int, y: int) -> None:
diff --git a/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts b/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts
--- a/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts
+++ b/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts
@@ -105,11 +105,18 @@
 function translateCuaAction(action: CuaAction): BatchAction {
   switch (action.type) {
-    case 'click':
+    case 'click': {
+      if (action.button === 'back')
+        return { type: 'press_key', press_key: { keys: ['Alt_L', 'Left'] } };
+      if (action.button === 'forward')
+        return { type: 'press_key', press_key: { keys: ['Alt_L', 'Right'] } };
+      if (action.button === 'wheel')
+        return { type: 'scroll', scroll: { x: action.x ?? 0, y: action.y ?? 0, delta_x: 0, delta_y: 0 } };
       return {
         type: 'click_mouse',
         click_mouse: { x: action.x ?? 0, y: action.y ?? 0, button: normalizeButton(action.button) },
       };
+    }
     case 'double_click':
       return {
         type: 'click_mouse',
@@ -168,6 +175,9 @@
   }

   async click(x: number, y: number, button: string | number = 'left'): Promise<void> {
+    if (button === 'back') { await this.back(); return; }
+    if (button === 'forward') { await this.forward(); return; }
+    if (button === 'wheel') { await this.scroll(x, y, 0, 0); return; }
     await this.client.browsers.computer.clickMouse(this.sessionId, {
       x,
       y,
diff --git a/pkg/templates/typescript/openai-computer-use/lib/utils.ts b/pkg/templates/typescript/openai-computer-use/lib/utils.ts
--- a/pkg/templates/typescript/openai-computer-use/lib/utils.ts
+++ b/pkg/templates/typescript/openai-computer-use/lib/utils.ts
@@ -40,12 +40,14 @@
   }
 }

-export function checkBlocklistedUrl(url: string): boolean {
+export function checkBlocklistedUrl(url: string): void {
   try {
     const host = new URL(url).hostname;
-    return BLOCKED_DOMAINS.some((d) => host === d || host.endsWith(`.${d}`));
-  } catch {
-    return false;
+    if (BLOCKED_DOMAINS.some((d) => host === d || host.endsWith(`.${d}`))) {
+      throw new Error(`Blocked URL: ${url}`);
+    }
+  } catch (e) {
+    if (e instanceof Error && e.message.startsWith('Blocked URL:')) throw e;
   }
 }



Summary
- batch_computer_actions function tool that executes multiple browser actions in a single API call, reducing latency
- Local test scripts (test.local.ts / test_local.py) that create remote Kernel browsers for testing without deploying a Kernel app

Details
New KernelComputer class (TS + Python) wraps the Kernel SDK for all computer actions:
- captureScreenshot, clickMouse, typeText, pressKey, scroll, moveMouse, dragMouse
- batch endpoint for batched actions
- playwright.execute for navigation (goto, back, forward, getCurrentUrl)
- [...] 1/2/3 in batch calls)

Batch tool: System instructions guide the model to prefer batch_computer_actions for predictable sequences (e.g., click + type + enter).

Removed dependencies: playwright-core, sharp (TS), playwright (Python). Bumped @onkernel/sdk to ^0.38.0 and kernel to >=0.38.0.

Test plan
- test.local.ts E2E: created remote Kernel browser, ran CUA agent (eBay search task), batch tool used successfully, browser cleaned up
- test_local.py E2E: same test, batch tool used on first action (type + enter), agent completed successfully
- [...] tsc --noEmit)

Made with Cursor
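To make the "click + type + enter" batching idea concrete, here is a hedged sketch of how such a sequence might be assembled as one list of batch actions. The `click_mouse` and `press_key` shapes follow the translated actions shown in the diff in this PR; the `type_text` shape, the `"Return"` key name, and the helper `build_search_batch` are assumptions made for illustration. No Kernel SDK call is made.

```python
from typing import Any, Dict, List


def build_search_batch(x: int, y: int, text: str) -> List[Dict[str, Any]]:
    """Assemble a click + type + enter sequence as a single list of batch actions."""
    return [
        # Focus the input field with a left click (shape as in the diff's click_mouse).
        {"type": "click_mouse", "click_mouse": {"x": x, "y": y, "button": "left"}},
        # Assumed shape, by analogy with the other actions; not confirmed by the source.
        {"type": "type_text", "type_text": {"text": text}},
        # Assumed key name for Enter; the diff only shows Alt_L, Left and Right.
        {"type": "press_key", "press_key": {"keys": ["Return"]}},
    ]


actions = build_search_batch(640, 120, "mechanical keyboard")
```

Sending the three actions together is what saves round trips: the agent needs one screenshot after the batch instead of one after each step.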
Note
Medium Risk
Replaces the core browser-automation layer and action execution path in both templates, which can change interaction timing/behavior and screenshot handling. Risk is mitigated by being template/sample code and by keeping URL blocklist/safety-check handling in place.
Overview
Updates both the Python and TypeScript OpenAI CUA templates to stop using Playwright-over-CDP and instead drive browsers via Kernel's native computer control endpoints (screenshot/click/type/scroll/batch), implemented through new KernelComputer wrappers.

Introduces a new batch_computer_actions function tool plus model instructions to prefer batching predictable action sequences, reducing round trips; agents are updated to execute batched actions and return a single post-batch screenshot.

Cleans up template dependencies and tooling (removes Playwright/Sharp/Pillow usage, bumps Kernel SDK versions, adds KERNEL_API_KEY env var), and adds local test scripts (test_local.py, test.local.ts) for running against a remote Kernel browser without deploying.

Written by Cursor Bugbot for commit 415546a. This will update automatically on new commits.