docs(#96): test hygiene convention + empty-CCXRAY_HOME CI backstop #98
Merged
Tests must point CCXRAY_HOME at a throwaway temp dir with their own synthetic index.ndjson and never read the real ~/.ccxray — the fallback that made PR #94's usage e2e tests pass locally but fail in empty-home CI. Adds docs/testing.md (4 rules, canonical pattern from usage.test.js, the $HOME-vs-CCXRAY_HOME distinction incl. the puppeteer Chrome-cache caveat) and a short Test Hygiene section + pointer in CLAUDE.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a test step that points CCXRAY_HOME at a fresh empty dir under $RUNNER_TEMP, as a backstop against the PR #94 failure class: a test that reads the real ~/.ccxray now finds no logs and fails the build. $HOME is left untouched so puppeteer's Chrome cache stays intact. CCXRAY_HOME is set at the step (not job) level via the $RUNNER_TEMP shell var — the runner context is not available in jobs.<id>.env. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
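The environment this backstop sets up can be mirrored locally. The sketch below is a JavaScript approximation of `CCXRAY_HOME=$(mktemp -d) npm test`, not the workflow step itself: `backstopEnv` and the `ccxray-home-` prefix are hypothetical names, and `tmpRoot` stands in for `$RUNNER_TEMP`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: copy the environment and point CCXRAY_HOME at a fresh,
// empty directory under tmpRoot. HOME is copied through unchanged, so anything
// cached under it (such as puppeteer's Chrome download) stays reachable.
function backstopEnv(
  baseEnv = process.env,
  tmpRoot = process.env.RUNNER_TEMP || os.tmpdir()
) {
  const emptyHome = fs.mkdtempSync(path.join(tmpRoot, 'ccxray-home-'));
  return { ...baseEnv, CCXRAY_HOME: emptyHome };
}
```

A local run could pass the result as the `env` option to `child_process.spawnSync('npm', ['test'], { env: backstopEnv(), stdio: 'inherit' })`. A test that still reads the real `~/.ccxray` would then find no logs and fail, as it would in CI.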
Closes #96.
What
Documents the test-hygiene convention that prevents the PR #94 failure class (tests falling back to the real `~/.ccxray`, passing locally but failing on an empty-home CI runner), and adds a CI backstop.

- `docs/testing.md` (new): 4 rules (isolate `CCXRAY_HOME`, synthetic-only fixtures, CI-equivalent check, cleanup), the canonical pattern from `test/usage.test.js`, and the `$HOME` vs `CCXRAY_HOME` distinction, including the puppeteer Chrome-cache caveat.
- `CLAUDE.md`: short Test Hygiene section + pointer + maintenance rule (matches the wire-protocol doc precedent).
- `.github/workflows/ci.yml`: runs the suite with `CCXRAY_HOME` pointed at a fresh empty dir under `$RUNNER_TEMP`, as a backstop. `$HOME` is left untouched so the puppeteer browser e2e tests keep their Chrome cache.

Why this shape
The isolation pattern (`mkdtemp` + own `index.ndjson`) is already practiced across the suite; this just writes it down and adds a mechanical backstop. The backstop is honest about its scope: it guarantees no real `~/.ccxray` read and an empty starting state (catching the #94 class); per-test isolation (rule 1) remains the real guard.

Verification
- `CCXRAY_HOME=$(mktemp -d) npm test` → suite green (isolation condition holds).
- Scrubbing `$HOME` for the whole suite breaks 2 puppeteer e2e tests (Chrome lives at `$HOME/.cache/puppeteer`), which is why the CI backstop scrubs `CCXRAY_HOME`, not `$HOME`.
- `ci.yml` validated as parseable YAML; `runner` context confirmed unavailable in `jobs.<id>.env` (hence the step-level `$RUNNER_TEMP` approach).

Codex review
Ran the codex second-review gate. All findings addressed: the blocker (`${{ runner.temp }}` invalid at job-level `env`) is fixed via a step-level `$RUNNER_TEMP`; the should-fix/nit (overstated enforcement claims) is reworded to describe a backstop, not full per-test isolation.

Out of scope (side finding)
While running the full suite repeatedly I observed pre-existing non-deterministic flakiness in the proxy/websocket e2e tests under parallel execution (different suites fail run-to-run: SSE streaming proxy, OpenAI Responses raw capture/WebSocket proxy). They pass in isolation. This is unrelated to this docs change (which touches no executable test/server code) but likely affects CI reliability generally — worth a separate issue.
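For reference, the per-test isolation pattern this PR documents (rule 1: a `mkdtemp` directory with its own `index.ndjson`) might look like the sketch below. It is not copied from `test/usage.test.js`: `createIsolatedHome`, the `ccxray-test-` prefix, and the `{ id: ... }` entry shape are hypothetical, and it assumes the code under test reads `CCXRAY_HOME` at call time rather than at module load.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: build a throwaway CCXRAY_HOME that contains only
// synthetic index.ndjson entries, and point the process at it.
function createIsolatedHome(entries) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ccxray-test-'));

  // NDJSON: one JSON object per line, each line newline-terminated.
  const ndjson = entries.map((entry) => JSON.stringify(entry) + '\n').join('');
  fs.writeFileSync(path.join(dir, 'index.ndjson'), ndjson);

  const previous = process.env.CCXRAY_HOME;
  process.env.CCXRAY_HOME = dir; // isolate: no fallback to the real ~/.ccxray

  return {
    dir,
    // Cleanup: restore the previous environment and delete the temp dir.
    cleanup() {
      if (previous === undefined) delete process.env.CCXRAY_HOME;
      else process.env.CCXRAY_HOME = previous;
      fs.rmSync(dir, { recursive: true, force: true });
    },
  };
}
```

A test would call `createIsolatedHome([...])` in its setup hook and `cleanup()` in teardown, so it passes or fails the same way on a developer machine and on an empty-home CI runner.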
🤖 Generated with Claude Code