Skip to content

fix(ask): require fresh response in completion branches#19

Merged
RapierCraft merged 1 commit into
RapierCraft:mainfrom
sergei-aronsen:fix/stale-answer-detection
May 16, 2026
Merged

fix(ask): require fresh response in completion branches#19
RapierCraft merged 1 commit into
RapierCraft:mainfrom
sergei-aronsen:fix/stale-answer-detection

Conversation

@sergei-aronsen
Copy link
Copy Markdown

Summary

Each completion branch in `comet_ask` returned `status.response` whenever it looked plausible:

  1. Explicit `status.status === 'completed'` + saw-new-response
  2. `status.isStable` + no stop button
  3. Idle timeout > 6s + substantial response

What's missing in all three is "this response is actually the answer to the prompt we just sent". `extractAgentStatus` can still pick up the previous turn's answer when:

  • the new prompt is sent into the same chat tab
  • Perplexity has not yet overwritten the prose / steps marker
  • the stop-button hasn't appeared yet (slow render)
  • `sawNewResponse` is already true because a prose count blipped during the "Thinking…" placeholder render

Each branch then returns the prior turn's text as the answer to the new question. Silent wrong-answer — the single most dangerous correctness defect in the polling logic.

Fix

Snapshot the full `extractAgentStatus().response` BEFORE sending the prompt, and require `status.response !== oldResponseSnapshot` in all three completion branches plus the max-timeout fallback. If everything we see is still the old answer, fall through to the "in progress" branch and let the caller poll again.

`oldResponseSnapshot` is captured in a `try { … } catch { }` so a failing pre-send status check doesn't break the prompt flow — the guard simply degrades to "any non-empty response is fresh".

Compatibility

No public API change. `comet_ask` returns the same shape; it just no longer returns previous-turn answers. Worst case (snapshot capture fails) is identical to current behaviour.

Test plan

  • `npm run test:unit` passes
  • Manual: in same tab, run `comet_ask Q1` then `comet_ask Q2` — Q2 returns Q2's answer, not Q1's
  • Manual: `comet_ask` against an idle tab returns the current answer once, never twice
  • Manual: timeout case correctly returns "in progress" when no fresh response exists

🤖 Generated with Claude Code

…n answers)

Each completion branch in `comet_ask` returned `status.response`
whenever it looked plausible:

  - explicit `status.status === 'completed'` + saw-new-response
  - `status.isStable` + no stop button
  - idle timeout > 6s + substantial response

What's missing in all three is "this response is actually the answer
to the prompt we just sent". `extractAgentStatus`, even after the
`lastIndexOf` fix in the companion correctness PR, can still pick up
the PREVIOUS turn's answer when:

  - the new prompt is sent into the same chat tab
  - Perplexity has not yet visibly overwritten the prose / steps marker
  - the stop-button hasn't appeared yet (slow render)
  - `sawNewResponse` is already true because a prose count went up
    briefly during the "Thinking…" placeholder render

Each branch therefore happily returned the prior turn's text as the
answer to the new question. Silent wrong-answer — the single most
dangerous correctness defect in the polling logic.

Fix: snapshot the full `extractAgentStatus().response` BEFORE sending
the prompt, and require `status.response !== oldResponseSnapshot` for
all three completion branches plus the max-timeout fallback. If
everything we see is still the old answer, fall through to the
"in progress" branch and let the caller poll again.

`oldResponseSnapshot` is captured in a `try { … } catch { }` so a
failing pre-send status check doesn't break the prompt flow — the
guard simply degrades to "any non-empty response is fresh".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@madisonrickert
Copy link
Copy Markdown
Contributor

madisonrickert commented May 12, 2026

Confirming the bug from an independent repro and adding context on a complementary fix.

I hit this during a long multi-turn agentic-browsing session, polling at ~30-90s cadence between batches. Symptom matched the diagnosis here exactly: comet_poll returned the previous turn's answer as the response to the current question, with sawNewResponse already true because of an intermediate prose blip during the "Thinking…" render.

The pre-send snapshot approach here is a clean defense at the return path and catches the failure regardless of which extraction strategy fires. Worth knowing about: I shipped two complementary local fixes that sit at different layers, so they compose rather than compete with this PR.

  1. comet_poll cache TTL (madisonrickert/Perplexity-Comet-MCP@992399e). Different code path: lastResponse cached on sessionState is evicted after 30s in a readCachedResponse() helper. Once expired, comet_poll falls through to IDLE rather than restating an old answer. This catches the "polling between tasks" case that fix(ask): require fresh response in completion branches #19 doesn't (the pre-send snapshot is scoped to a single comet_ask call).

  2. Prose watermark in extractAgentStatus (madisonrickert/Perplexity-Comet-MCP@b761dc8). Snapshots [class*="prose"] count at sendPrompt time and slices the prose array from that watermark forward inside the page eval. Strategy 3 literally can't see stale prose. Strategies 1 and 2 (marker-anchored) still walked full bodyText with a first-match — which is exactly what fix(correctness): tab matching, completion regex, version sync, COMET_PORT #12 fixes via lastIndexOf, so fix(correctness): tab matching, completion regex, version sync, COMET_PORT #12 + watermark together cover all three extraction strategies inside extractAgentStatus.

One edge case worth flagging on this PR's approach: if a user asks the same question twice and the page-derived answer is byte-identical to the previous turn (terse yes/no, canonical short answer), responseIsFresh flips false and we fall through to "in progress." Probably uncommon with Perplexity prose, plausible for very short answers — not a blocker, but worth a note in the body or a follow-up.

Both commits sit on my fork at the SHAs above. I was planning to open PRs for those, but you're welcome to absorb into this PR if that's easier.

@RapierCraft
Copy link
Copy Markdown
Owner

Code Review: fix(ask): require fresh response in completion branches

Reviewed commit: 0849748

Overall Assessment: APPROVE with minor observations

This is a well-reasoned fix for a real and dangerous bug class — returning the previous turn's answer as if it were the current one. The approach is sound: snapshot the full response text before sending the prompt, then gate all completion branches on response !== oldResponseSnapshot.


Stale Answer Detection Logic

Correctness: Good.

The core invariant is clear and correctly applied: a response that is byte-for-byte identical to the pre-prompt snapshot cannot be the answer to the new prompt. The responseIsFresh variable encapsulates this as:

const responseIsFresh = !!status.response && status.response !== oldResponseSnapshot;

This correctly handles:

  • Empty response (fails !!status.response) — still "not fresh"
  • Identical response (fails !== check) — stale
  • Any new/different response — fresh

One edge case to be aware of (non-blocking): If Perplexity returns the exact same text for two different questions (e.g., "What is 2+2?" asked twice), the freshness check would incorrectly classify the second answer as stale. However, this is extremely unlikely in practice since Perplexity responses include timestamps, source citations, and formatting that vary between invocations. The fix correctly trades this theoretical false-negative for eliminating a real class of silent wrong-answers. Acceptable tradeoff.


Race Conditions

No new race conditions introduced. Existing ones mitigated.

  1. Snapshot timing: oldResponseSnapshot is captured before sendPrompt(), which is correct. There's no window where the new response could arrive before the snapshot is taken.

  2. getAgentStatus() failure during snapshot: Correctly wrapped in try/catch, degrading to oldResponseSnapshot = ''. When empty, the freshness check becomes !!status.response && status.response !== '', which is equivalent to !!status.response — identical to the pre-fix behavior. So failure mode = no regression. Good.

  3. Potential concern (informational, not blocking): Between the oldResponseSnapshot capture and sendPrompt(), there's no race because both are sequential await calls on the same async flow — no concurrent mutation path exists.


Completion Branch Correctness

All three completion branches plus the max-timeout fallback consistently apply responseIsFresh:

Branch Before After Correct?
1. Explicit completion status.response (truthy) responseIsFresh Yes
2. Stable + no stop button status.response (truthy) responseIsFresh Yes
3. Idle timeout (>6s) status.response (truthy) responseIsFresh Yes
4. Max timeout fallback finalStatus.response (truthy + len>50) adds !== oldResponseSnapshot Yes

The substitution is consistent — responseIsFresh replaces the bare truthiness check in branches 1-3, and the max-timeout branch adds the comparison inline (since responseIsFresh is scoped inside the while loop). This is correct because finalStatus is fetched after the loop exits.


Minor Observations (non-blocking)

  1. oldResponseSnapshot scope: Declared with let at the outer function scope — correctly accessible both inside the while loop (branches 1-3) and after it (branch 4). No scoping issue.

  2. Diff base alignment: Verify the branch is rebased on latest main before merging to avoid conflicts with any concurrent changes to the prose-state capture logic.

  3. Comment quality: The inline comments explaining why the freshness check exists and what happens on degradation are excellent. They will prevent future contributors from removing the guard without understanding its purpose.


Verdict

APPROVED. The fix is correct, minimal, and handles failure modes gracefully. The stale-answer detection is sound, no race conditions are introduced, and all completion branches are consistently guarded. The degradation path (snapshot capture fails) preserves pre-fix behavior. Safe to merge.

@RapierCraft RapierCraft merged commit db87cd6 into RapierCraft:main May 16, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants