Skip to content

fix(correctness): tab matching, completion regex, version sync, COMET_PORT#12

Merged
RapierCraft merged 2 commits into
RapierCraft:mainfrom
sergei-aronsen:fix/correctness
May 16, 2026
Merged

fix(correctness): tab matching, completion regex, version sync, COMET_PORT#12
RapierCraft merged 2 commits into
RapierCraft:mainfrom
sergei-aronsen:fix/correctness

Conversation

@sergei-aronsen
Copy link
Copy Markdown

Summary

A bundle of small, independent correctness fixes uncovered while auditing the repo. Happy to split if reviewer prefers — each commit-level fix is self-contained.

page-scripts.ts — completion detection

Issue Symptom Fix
Russian singular Выполнен 1 шаг was not matched 1-step Russian tasks fall through to slow stability polling (~90s) Regex now matches Выполнен(?:о|ы)?
body.includes("Finished") matched the intermediate step Finished reading sources Agent flips to completed mid-task \bFinished\b(?!\s+(?:reading|analyzing|browsing|searching|loading))
indexOf returned the OLDEST steps completed / Reviewed N sources marker in multi-turn chats Extractor returned the previous-turn answer for the current question lastIndexOf

cdp-client.ts — tab matching

  • findTabByDomain now matches exact or subdomain (e.g. github.com matches gist.github.com but not notgithub.com). The old includes heuristic matched \"ai\" against every *.ai tab plus anything containing the substring.
  • listTargets on native Windows wraps CDP({...}) in try { … } finally { close() }. If Target.getTargets() threw the underlying WebSocket leaked, and `withAutoReconnect` retried -> compounding leaks.

cdp-client.ts / index.ts — COMET_PORT env

The README documents this env var, but the constant was hardcoded and call sites passed the literal `9223`. `DEFAULT_PORT` now reads from the env (validated as `[1, 65535]`, falls back to 9223 with a warning), and call sites use it.

index.ts — MCP version

The server advertised `"2.5.0"` while `package.json` shipped `2.6.2`. Now read at startup from `package.json` so the handshake stays in sync with each release.

tests/unit/page-scripts.test.ts

Four new cases:

  • singular Выполнен 1 шагcompleted
  • plural Выполнено 5 шаговcompleted
  • Finished reading sources + visible stop button → working (previous behaviour: completed)
  • multi-turn N steps completed: response comes from the LAST marker, not the first

DOM is built with `createElement` / `textContent` rather than `innerHTML` to keep the test fixtures linter-clean.

Compatibility

No public API changes. Tool argument shapes unchanged. Existing English UI behaviour unchanged.

Test plan

  • `npm run test:unit` passes (new cases included)
  • Manual: `COMET_PORT=9333 comet_connect` now actually uses 9333
  • Manual: `comet_tabs action=close domain=ai` no longer matches arbitrary tabs
  • Manual: agent completion detection on Russian Perplexity UI for 1-step tasks
  • Manual: in a multi-turn chat, second `comet_ask` returns the new answer, not the previous turn's

Related

This PR is part of a triage I did against `main` today. Companion PRs:

  • security hardening of `Runtime.evaluate` / PowerShell interpolation (separate)
  • drop unused `@anthropic-ai/sdk` dep (separate)

🤖 Generated with Claude Code

…_PORT

A bundle of small correctness fixes uncovered while auditing the repo.
Each fix is independent; happy to split if reviewer prefers.

### page-scripts.ts — completion detection

* `Выполнен(?:о|ы)?` — match Russian singular `Выполнен 1 шаг` in
  addition to the plural cases the original regex covered. Russian
  agrees the past-tense participle with grammatical number, so the
  singular form was being missed on Russian-localised Perplexity
  accounts.
* `\bFinished\b(?!\s+(?:reading|analyzing|browsing|searching|loading))`
  — word-boundary the English "Finished" marker and exclude the
  intermediate "Finished reading … " step Perplexity renders mid-task.
  Without this the agent flips to `completed` the moment the first
  source is processed.
* Strategy 1 and Strategy 2 in `extractAgentStatus` now use
  `lastIndexOf` for the `steps completed` / `Reviewed N sources`
  markers. In multi-turn chats the previous-turn markers stay in the
  scroll buffer; `indexOf` returned the OLDEST match, causing the
  extractor to return the previous turn's answer for the current
  question.

### cdp-client.ts — tab matching

* `findTabByDomain` now matches "exact or subdomain" instead of the
  ambiguous `includes` / reverse-`includes` heuristic. Search
  `"github.com"` matches `github.com` and `gist.github.com` but no
  longer matches `notgithub.com`. Search `"ai"` no longer matches
  every `*.ai` tab plus anything containing the substring `ai`.
* `listTargets` on native Windows now closes its `CDP({...})` client
  in a `finally` block. If `Target.getTargets()` threw the underlying
  WebSocket leaked; with `withAutoReconnect` retrying, leaks
  compounded.

### cdp-client.ts / index.ts — `COMET_PORT` env

The README documents `COMET_PORT` but the constant was hardcoded and
call sites passed the literal `9223` directly. `DEFAULT_PORT` now
reads from the env var (validated as `[1, 65535]`, falls back to
9223 with a warning), and call sites use `DEFAULT_PORT`.

### index.ts — MCP version

`new Server({ name: "comet-bridge", version: "2.5.0" }, ...)` advertised
2.5.0 to clients while `package.json` shipped 2.6.2. Now read at startup
via `readFileSync(packageJson)` so the handshake matches the published
version automatically.

### tests/unit/page-scripts.test.ts

Four new cases:

* singular `Выполнен 1 шаг` -> `completed`
* plural   `Выполнено 5 шагов` -> `completed`
* `Finished reading sources` with a visible stop button -> `working`
  (the previous behaviour would have flipped it to `completed`)
* multi-turn `N steps completed`: the response comes from the LAST
  marker, not the first

Tests build DOM with `createElement`/`textContent` to keep the linter
happy and to avoid any innerHTML in test fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…turn

CI failed on the previous commit's "picks the response after the LAST
'steps completed' marker" test. The bug:

  - `bodyText.match(/(\d+)\s*steps?\s*completed/i)` returns only the
    FIRST match (no /g flag).
  - In real multi-turn chats the marker text differs across turns:
    "3 steps completed" for turn 1, "5 steps completed" for turn 2.
  - `lastIndexOf(stepsMatch[0])` then looks for the LAST occurrence of
    "3 steps completed" — which appears exactly once — and anchors on
    the OLDER turn anyway.

The fix `lastIndexOf` was looking for the wrong needle. Replace with
`matchAll` + take the last result: this enumerates every marker
across all turns and picks the most recent one regardless of digit
value. Same change applied to the `Reviewed N sources` path in
Strategy 2.

Behaviour now matches the documented intent of the previous commit;
the existing unit test passes.
@madisonrickert
Copy link
Copy Markdown
Contributor

madisonrickert commented May 12, 2026

Confirming the multi-turn marker bug from an independent repro.

Saw this firsthand across many turns in a single Perplexity chat: comet_ask kept anchoring on the oldest "X steps completed" / "Reviewed N sources" marker in the scroll buffer and slicing the previous turn's answer text out as the new response.

I independently arrived at a different fix for an adjacent slice: a prose-element watermark in extractAgentStatus (madisonrickert/Perplexity-Comet-MCP@b761dc8) that captures [class*="prose"] count at sendPrompt time and only considers prose elements past the watermark. That covers Strategy 3 (prose-block extraction) but doesn't touch Strategies 1 and 2 — which is exactly the gap this PR fills via matchAll + take-last. They compose cleanly: watermark filters prose, lastIndexOf walks markers.

LGTM in spirit. The watermark commit sits on my fork at the SHA above. I was planning to submit that one as its own PR, but feel free to cherry-pick it to land alongside this PR.

Copy link
Copy Markdown
Owner

@RapierCraft RapierCraft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #12 — fix(correctness): tab matching, completion regex, version sync, COMET_PORT

Reviewed commit: 1dffc64


Overall Assessment: APPROVE with minor findings

This is a well-crafted PR with clear commit messages, good test coverage for the regex changes, and proper handling of edge cases. The fixes are correct and the code quality is high. I have one medium-severity issue and a few minor observations.


1. COMET_PORT — Incomplete Migration (MEDIUM)

Issue: src/http-bridge.ts still has two hardcoded 9223 literals (lines 100 and 148) that were NOT updated to use DEFAULT_PORT. The PR correctly exports DEFAULT_PORT from cdp-client.ts and updates index.ts, but the HTTP bridge entry point is missed.

Impact: Users setting COMET_PORT=9333 will get the correct port via index.ts (stdio MCP) but the HTTP bridge (comet-bridge binary) will still hardcode 9223. This contradicts the stated goal of honoring COMET_PORT across the codebase.

Fix: Import DEFAULT_PORT in http-bridge.ts and replace the two 9223 literals.


2. Tab Matching — Correct but Asymmetric (LOW, intentional)

The new findTabByDomain logic:

tabDomain === target || tabDomain.endsWith(`.${target}`)

This is a strict improvement over the old bidirectional includes heuristic. Two observations:

  • TLD matching: Searching for domain: "ai" will still match perplexity.ai (since "perplexity.ai".endsWith(".ai") is true). This is technically correct per DNS semantics but might surprise users. The PR description explicitly calls out the old "ai" matching problem — the new behavior is much narrower but not zero for short TLD-like queries. Consider whether a minimum domain-segment check (e.g., target must contain at least one dot) would be appropriate as a follow-up.

  • Asymmetry is intentional: Searching for gist.github.com will NOT match a tab at github.com. The old code did match (via reverse includes). This is the correct fix — a parent domain tab shouldn't satisfy a request for a specific subdomain.


3. Completion Regex — Correct (VERIFIED)

  • Russian singular: Выполнен(?:о|ы)?\s+\d+\s+шаг(?:а|ов)? correctly matches all three grammatical forms: Выполнен 1 шаг (masc sg), Выполнено 2 шага (neut), Выполнены 5 шагов (pl). Tested.

  • "Finished" negative lookahead: \bFinished\b(?!\s+(?:reading|analyzing|browsing|searching|loading)) correctly rejects intermediate steps while still matching bare "Finished". Tested.

  • Multi-turn matchAll: Using matchAll with /g flag and taking the last match is the correct fix for the multi-turn chat problem. The old match() + indexOf() approach always landed on the first (oldest) occurrence. Clean implementation.


4. Version Sync — Correct (VERIFIED)

The readPackageVersion() function:

  • Resolves relative to import.meta.url — works from both dist/ and node_modules/
  • Falls back to "0.0.0" on error — safe, won't crash startup
  • Reads synchronously at module load — appropriate since this only runs once

Note: current package.json shows version 2.7.0 while main advertises 2.5.0. This PR fixes that drift permanently.


5. WebSocket Leak Fix — Correct (VERIFIED)

The try/finally pattern in listTargets for Windows:

const tempClient = await CDP({...});
try {
  // ... use tempClient
} finally {
  await tempClient.close().catch(() => {});
}

This correctly prevents WebSocket handle leaks when Target.getTargets() throws. The .catch(() => {}) on close() handles the case where the connection is already dead. Good defensive code.


6. Test Coverage — Good

Four new test cases cover the exact regressions described:

  • Russian singular form
  • Russian plural form
  • "Finished reading" intermediate step (with visible stop button)
  • Multi-turn last-match selection

Tests use createElement/textContent instead of innerHTML — clean and safe.


Summary

Finding Severity Status
http-bridge.ts still hardcodes 9223 MEDIUM Needs fix
TLD-only domain query ("ai") still matches LOW Acceptable, note for follow-up
All regex changes Verified correct
Version sync Verified correct
WebSocket leak fix Verified correct

Recommendation: Fix the http-bridge.ts port references for completeness, then this is ready to merge. The regex and matching logic changes are all correct and well-tested.

@RapierCraft RapierCraft merged commit 51a814c into RapierCraft:main May 16, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants