Skip to content

fix(test): harden work.test.ts runKb against CI spawn-timeout flake#34

Merged
vilosource merged 1 commit into
developfrom
fix/work-test-spawn-flake
May 30, 2026
Merged

fix(test): harden work.test.ts runKb against CI spawn-timeout flake#34
vilosource merged 1 commit into
developfrom
fix/work-test-spawn-flake

Conversation

@vilosource
Copy link
Copy Markdown
Owner

@vilosource vilosource commented May 30, 2026

What

Hardens the runKb helper in tests/cli/work.test.ts against the CI flake tracked in #25.

Root cause

runKb ran spawnSync with a 10s timeout and returned exitCode: result.status ?? 1. When the node subprocess was killed (timeout or signal) under loaded 2-core CI with cross-file vitest parallelism, result.status is null → coerced to 1. That surfaced as the misleading AssertionError: expected 1 to be +0 in clears handoff with --clear — a spawn-timing race, not a real exit-1.

Change

  • Bump spawn timeout 10s → 30s (cold node start + module load can exceed 10s under contention).
  • Retry once on status === null (the kill/timeout signature).
  • If it still doesn't exit cleanly, throw a real diagnostic (signal + error) instead of a coerced exit code — future failures will be debuggable rather than a bare 1.
  • Return result.status directly (guaranteed non-null at the return).

Test-infra only; no production code touched.

Verification

  • tests/cli/work.test.ts: 55/55 pass.
  • Full suite: 659/659 pass after rebuilding the local CLI bundle. (A transient local failure in cli-bundle.test.ts was an unrelated stale-artifact: the pre-existing local dist/cli-bundle/cli.js reported 0.2.1 vs package.json 0.3.0 post-release; rebuilding the bundle → 0.3.0, 7/7 pass. CI freshly bundles in a pre-step, so it does not hit this.)
  • eslint clean on the changed file.

Closes #25.

Summary by CodeRabbit

Release Notes

  • Tests
    • Improved CLI testing robustness with extended timeout handling and automatic retry logic for subprocess execution.

Note: This release contains internal testing infrastructure improvements only. No user-facing changes included.

Review Change Stack

The runKb helper coerced a killed-subprocess (status === null, from the
10s spawnSync timeout or a signal) into exitCode 1, surfacing as the
misleading 'expected 1 to be +0' flake in 'clears handoff with --clear'
under loaded 2-core CI (issue #25).

- Bump the spawn timeout 10s -> 30s (cold node start + module load under
  cross-file vitest parallelism can exceed 10s).
- Retry once on status === null (the kill/timeout signature), then throw a
  real diagnostic (signal + error) instead of a coerced exit code.
- Return result.status directly (no longer null at this point).

Closes #25.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 56bbe334-1f6d-4bb7-952d-76ae892a97ec

📥 Commits

Reviewing files that changed from the base of the PR and between 46db0ed and 8f0c6fd.

📒 Files selected for processing (1)
  • tests/cli/work.test.ts

📝 Walkthrough

Walkthrough

The test helper runKb is hardened to reduce CI flakiness by increasing the subprocess timeout from 10s to 30s, retrying execution once when the process terminates by signal, and throwing diagnostic errors that include signal and error details on repeated failures, instead of silently applying a fallback exit code.

Changes

CLI Test Runner Resilience

Layer / File(s) Summary
Resilient subprocess execution with retry and timeout
tests/cli/work.test.ts
runKb wraps spawnSync execution with 30s timeout, retries once on signal/timeout (status === null), and throws diagnostic error with signal and error message on repeated failure instead of using ?? 1 fallback. Returns raw result.status from successful execution.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit's blessing on retry's art,
When signals timeout, we play the part—
Thirty seconds stretched, one more we try,
No silent falls, just truth we cry.
CI flakes shall flee, our tests stand spry! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: hardening the runKb test helper against CI spawn-timeout flakes in work.test.ts.
Linked Issues check ✅ Passed The PR directly addresses issue #25 by implementing the suggested fixes: increased spawnSync timeout (10s→30s), retry logic on null status, and better error handling instead of coerced exit codes.
Out of Scope Changes check ✅ Passed All changes are scoped to the runKb helper in tests/cli/work.test.ts and directly address the CI flake issue; no unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@vilosource vilosource merged commit b7fa37a into develop May 30, 2026
3 checks passed
@vilosource vilosource deleted the fix/work-test-spawn-flake branch May 30, 2026 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky CI: tests/cli/work.test.ts > 'clears handoff with --clear' (work handoff --clear exits 1 under CI load)

1 participant