test(#100): fix e2e flakiness — condition-based polling instead of fixed sleeps#102
Merged
Conversation
The proxy writes logs from a spawned child process, so tests can only observe completion by polling the filesystem/API. Reading index.ndjson or log files after a fixed `setTimeout(500)` flaked under parallel-suite load: the guessed delay sometimes elapsed before the cross-process write landed, e.g. "expected OpenAI index entry" in startup.test.js. The suite failed a different e2e subtest run-to-run; each passed in isolation. Adds poll-until helpers (waitFor/readIndex in startup.test.js, waitForIndexEntries in websocket-proxy.test.js) modeled on the existing waitForPort/waitForIndexEntry, and converts the sleep-then-read sites to wait for the actual condition (entry present, file count, API/SSE state). This also speeds up the fast path — a write that lands early returns immediately instead of always waiting 500ms. Verified: CCXRAY_HOME=$(mktemp -d) node --test test/*.test.js, 6 consecutive runs, 947/0, zero flaky, ~48s each. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…iew) Codex review of #102 flagged that waitFor awaited fn() before checking the deadline, so an async predicate that never settles (e.g. httpGet to a dead proxy) could hang the test instead of timing out. Race fn() against the remaining budget and clear the loser timer so it doesn't linger on the event loop. Also give httpGet its own 5s socket timeout so a silent no-response rejects (and frees the socket) rather than blocking forever. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Owner
Author
|
Codex 二審:no blocker. One should-fix addressed in f46cbea.
Re-verified: 9 consecutive full-suite runs via |
2 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #100.
Root cause (5 Whys)
The proxy writes logs from a spawned child process. Several e2e tests read
index.ndjson/ log files after a fixedawait setTimeout(500)and then assert. Under the parallel full-suite run (node --test test/*.test.js), many e2e files each spawn a real server, saturating CPU/IO, so the child's async write sometimes lands after the guessed delay → the read finds nothing →expected OpenAI index entryand friends. Each test passes in isolation; the suite failed a different subtest run-to-run.The codebase already had the right pattern (
waitForPortpolls/_api/health;waitForIndexEntrypolls the index) — it just wasn't applied to these sleep-then-read sites.Change
waitFor+readIndex(startup.test.js),waitForIndexEntries(websocket-proxy.test.js), modeled on the existing pollers.Scope: only the proven read-gating waits. A few unrelated fixed delays remain (error-recovery health checks, inter-turn WS pacing, teardown settle) — they aren't the observed flaky class and the terminal condition-poll now absorbs upstream timing.
Verification
CCXRAY_HOME=$(mktemp -d) node --test test/*.test.js— 6 consecutive runs, 947 pass / 0 fail, zero flaky, ~48s each. Files also pass in isolation (startup 62/62, ws 17/17).Note: while verifying I hit an unrelated artifact — running the suite back-to-back through the
npm/shell-pipe wrapper occasionally failed to return even thoughnode --testitself completed and printed its summary. Runningnode --testdirectly exits cleanly every time. Not caused by this change; flagging in case it recurs in CI.🤖 Generated with Claude Code