
fix(extension): activate tabs and focus window to fix background rendering failures #402

Open
ykswang wants to merge 1 commit into alibaba:main from ykswang:fix/tab-activation-background-window

ykswang commented Apr 4, 2026

Summary

When Chrome is not the focused OS window, elementFromPoint() returns null for background tabs. This causes isTopElement() to incorrectly mark all elements as non-visible. The LLM then sees an empty page and enters a loop of opening new tabs instead of operating on the current page.

Root cause: switchToTab() only updated internal state (currentTabId in storage) but never called chrome.tabs.update(tabId, { active: true }). Combined with new tabs being created via chrome.tabs.create({ active: false }), the target tab always remained a background tab with throttled rendering.
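The root cause above can be illustrated with a minimal sketch. It is not the code from this PR: `TabsApi` and `TabState` are hypothetical stand-ins (a narrow slice of `chrome.tabs` and a simplified version of the stored `currentTabId`), introduced so the example can run outside an extension.

```typescript
// Hypothetical, narrowed view of chrome.tabs: only the call this sketch needs.
interface TabsApi {
  update(tabId: number, props: { active: boolean }): Promise<unknown>;
}

// Hypothetical stand-in for the controller's stored state.
interface TabState {
  currentTabId: number | null;
}

async function switchToTab(tabs: TabsApi, state: TabState, tabId: number): Promise<void> {
  // Bookkeeping only: on its own this leaves the tab in the background.
  state.currentTabId = tabId;
  // Activation: makes the tab the selected tab in its window.
  await tabs.update(tabId, { active: true });
}
```

In the extension, the real `chrome.tabs` object would take the place of the injected `tabs` argument.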

Three-layer fix:

  • switchToTab() now activates the tab via chrome.tabs.update(tabId, { active: true }), ensuring the target tab gets full rendering priority from Chrome
  • focusWindow() added and called at task start via chrome.windows.update(windowId, { focused: true }) to bring Chrome to the foreground when automation begins (especially important for MCP-driven workflows)
  • isTopElement() degrades gracefully: when every elementFromPoint() call returns null (indicating a background window), it optimistically treats elements as top-level instead of hiding them all
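The third layer can be sketched roughly as follows. This is an illustration under assumptions, not the implementation in dom_tree/index.js: `isTopElementAt`, `probe`, and `contains` are hypothetical names, with `probe` standing in for `document.elementFromPoint` and `contains` for the check that the hit element is the target or one of its descendants.

```typescript
interface Point {
  x: number;
  y: number;
}

function isTopElementAt<E>(
  target: E,
  points: Point[],
  probe: (x: number, y: number) => E | null,
  contains: (target: E, hit: E) => boolean,
): boolean {
  let sawHit = false;
  for (const { x, y } of points) {
    const hit = probe(x, y);
    if (hit === null) continue; // no hit-test result for this point
    sawHit = true;
    if (contains(target, hit)) return true; // target is on top at this point
  }
  // If at least one probe hit something else, the target is covered.
  // If every probe returned null, we cannot tell, so keep the element.
  return !sawHit;
}
```

The trade-off is that an element may be reported as top-level when it is actually covered, which the PR accepts in exchange for not presenting an empty page to the LLM.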

Files changed

  • packages/extension/src/agent/TabsController.ts: switchToTab() activates tab, new focusWindow() method, updated TabAction type
  • packages/extension/src/agent/TabsController.background.ts: new activate_tab and focus_window message handlers
  • packages/extension/src/agent/MultiPageAgent.ts: call focusWindow() in onBeforeTask
  • packages/page-controller/src/dom/dom_tree/index.js: isTopElement() fallback when elementFromPoint returns all nulls
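As a rough picture of what the two new background handlers might look like, the sketch below dispatches on the message type and forwards to the corresponding Chrome call. The message payload shapes (`tabId`, `windowId`), the `BrowserApi` interface, and the `handleTabMessage` name are assumptions made for illustration; only the `activate_tab` and `focus_window` names and the two `update` calls come from the description above.

```typescript
// Hypothetical message shapes; the real payloads in the PR may differ.
type TabMessage =
  | { type: "activate_tab"; tabId: number }
  | { type: "focus_window"; windowId: number };

// Hypothetical, narrowed view of the chrome.tabs / chrome.windows calls used.
interface BrowserApi {
  tabs: { update(tabId: number, props: { active: boolean }): Promise<unknown> };
  windows: { update(windowId: number, info: { focused: boolean }): Promise<unknown> };
}

async function handleTabMessage(api: BrowserApi, msg: TabMessage): Promise<{ ok: true }> {
  if (msg.type === "activate_tab") {
    // Select the tab within its window.
    await api.tabs.update(msg.tabId, { active: true });
  } else {
    // Bring the browser window to the foreground.
    await api.windows.update(msg.windowId, { focused: true });
  }
  return { ok: true };
}
```

Under this sketch, focusWindow() in onBeforeTask would amount to sending a `focus_window` message before the first page inspection.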

Test plan

  • Run agent via MCP with Chrome not the focused window — verify elements are detected and actions execute on the existing tab (no spurious new tabs)
  • Run agent via MCP with Chrome focused — verify no regression
  • Run agent from sidepanel — verify no regression
  • Verify tab visually switches in Chrome when agent calls switchToTab()
  • Verify Chrome window comes to foreground when task starts via MCP

🤖 Generated with Claude Code

…ering failures

When Chrome is not the focused OS window, elementFromPoint() returns null
for background tabs, causing isTopElement() to incorrectly mark all
elements as non-visible. The LLM then sees an empty page and enters a
loop of opening new tabs.

Three-layer fix:
- switchToTab() now calls chrome.tabs.update(active: true) so the target
  tab gets full rendering priority instead of staying as a background tab
- focusWindow() added and called at task start via chrome.windows.update
  to bring Chrome to the foreground when automation begins via MCP
- isTopElement() now returns true when all elementFromPoint() calls yield
  null (background window fallback) instead of hiding every element

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CLAassistant commented Apr 4, 2026

CLA assistant check
All committers have signed the CLA.
