Skip to content

ci: Move realm server tests into shards#4269

Merged
backspace merged 23 commits intomainfrom
server/group-tests-continued
Mar 30, 2026
Merged

ci: Move realm server tests into shards#4269
backspace merged 23 commits intomainfrom
server/group-tests-continued

Conversation

@backspace
Copy link
Copy Markdown
Contributor

@backspace backspace commented Mar 27, 2026

This is #3831, continued. It groups realm server tests into six shards, because having one job per test file was becoming untenable. See below for further detail on the timing metrics, but high-level: we’re wasting hours of CI time per run with repetitive setup for each test file.

The grouping is via naïve round robin, which means some shards will take longer than others. We could rig up some kind of load balancing using shard history from CI, but that seemed excessive at this point.

We no longer need lint confirmation that every test file is manually listed in the CI workflow, as the grouping finds every file automatically.

I added a test report akin to the host tests, which makes it easier to see failures at a high level instead of having to scroll through Actions job logs:

boxel 2026-03-30 09-29-29 e8c0a21ef0e0f5101a0d9e590435203ca7bf96af 2026-03-30 09-31-44

Metrics for the existing approach

From the last 20 runs on main:

Metric Value
Shards per run 71 (one per test module)
Wall-clock average 18m 1s (first shard start → last shard finish)
Slowest shard ~11m (realm-endpoints-test.ts)
Fastest shard ~4m (most modules)
Avg per shard ~5m 11s
Total billable minutes per run ~370m (~6.2 hours)

~4 minutes of every shard is pure setup overhead (checkout, init, download artifacts, start services, register users). The actual test execution in most modules is only ~1 minute. Running 71 shards means paying that 4-minute overhead 71 times = ~284 minutes of wasted setup per run.

@backspace backspace changed the base branch from main to group-realm-server-tests March 27, 2026 22:44
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

Host Test Results

    1 files  ±0      1 suites  ±0   1h 59m 28s ⏱️ - 12m 41s
2 062 tests ±0  2 047 ✅ ±0  15 💤 ±0  0 ❌ ±0 
2 077 runs  ±0  2 062 ✅ ±0  15 💤 ±0  0 ❌ ±0 

Results for commit eace2d9. ± Comparison against base commit 3ab3d2c.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 28, 2026

Realm Server Test Results

  1 files    1 suites   11m 44s ⏱️
802 tests 802 ✅ 0 💤 0 ❌
873 runs  873 ✅ 0 💤 0 ❌

Results for commit eace2d9.

♻️ This comment has been updated with latest results.

@backspace backspace changed the base branch from group-realm-server-tests to main March 30, 2026 14:42
@backspace backspace changed the title server: Extend test grouping ci: Move realm server tests into shards Mar 30, 2026
@backspace backspace marked this pull request as ready for review March 30, 2026 17:49
@backspace backspace requested a review from a team March 30, 2026 17:49
Copy link
Copy Markdown
Contributor

@lukemelia lukemelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does AGENTS.md need updating?

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48365eaac5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@backspace
Copy link
Copy Markdown
Contributor Author

Does AGENTS.md need updating?

I missed that thanks, it now has an extra line about how to run multiple test files at once.

@habdelra habdelra requested a review from Copilot March 30, 2026 19:10
@backspace backspace merged commit f0ce138 into main Mar 30, 2026
52 checks passed
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR restructures realm-server CI to run tests in 6 shards (instead of one job per test module) to reduce repeated setup overhead, and adds JUnit reporting to improve visibility into failures.

Changes:

  • Replace per-test-module CI matrix with a 6-shard matrix and compute module lists per shard automatically.
  • Add support for TEST_MODULES in packages/realm-server/tests/index.ts to filter the test suite to a subset of modules.
  • Add per-shard JUnit output and a workflow job to merge + publish the combined test report.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
README.md Updates realm-server test-running instructions, including TEST_MODULES.
AGENTS.md Documents running multiple modules via TEST_MODULES.
packages/realm-server/tests/scripts/run-qunit-with-test-pg.sh Optionally attaches a JUnit reporter when JUNIT_OUTPUT_FILE is set.
packages/realm-server/tests/indexing-test.ts Tweaks a test to wait for coalesced indexing URLs before asserting.
packages/realm-server/tests/index.ts Implements TEST_MODULES parsing and QUnit module filtering.
packages/realm-server/scripts/shard-test-modules.js Discovers and round-robins *-test.ts modules into shards for CI.
packages/realm-server/scripts/run-test-modules.js Utility runner that applies a QUnit filter derived from TEST_MODULES.
packages/realm-server/scripts/lint-test-shards.ts Removes the previous CI lint that required enumerating every test module in the workflow.
packages/realm-server/scripts/junit-reporter.js Adds a QUnit event-based JUnit XML reporter used in CI.
packages/realm-server/package.json Updates test:wait-for-servers to run test (filtered by TEST_MODULES) and removes lint:test-shards.
.github/workflows/ci.yaml Converts realm-server tests to a 6-shard matrix, uploads per-shard JUnit, merges & publishes results.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Run a single module:
`TEST_MODULE=card-endpoints-test.ts pnpm test-module`
- Run a list of modules:
`TEST_MODULES=card-endpoints-test.ts|another-module-test.ts pnpm test`
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TEST_MODULES example needs quoting (or escaping) because | is a shell pipe. As written, TEST_MODULES=card-endpoints-test.ts|another-module-test.ts pnpm test will try to pipe to another-module-test.ts as a command instead of setting the env var. Use TEST_MODULES="card-endpoints-test.ts|another-module-test.ts" pnpm test (or single quotes).

Suggested change
`TEST_MODULES=card-endpoints-test.ts|another-module-test.ts pnpm test`
`TEST_MODULES="card-endpoints-test.ts|another-module-test.ts" pnpm test`

Copilot uses AI. Check for mistakes.
Comment on lines +1493 to +1499
if (rows.length !== 1) {
return undefined;
}
let urls = rows[0].args.changes
.map((change) => change.url)
.sort();
return urls.length === expectedUrls.length ? rows[0] : undefined;
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The waitUntil predicate only checks that the job contains the expected count of URLs, not that the set of URLs matches expectedUrls. That can cause the wait to terminate early (and then fail the later assertion) instead of waiting until the coalesced URLs are actually present, which makes the test more flaky. Consider comparing the sorted URL list to the expected sorted list inside the predicate and only returning the row when they match.

Copilot uses AI. Check for mistakes.
cp -a .test-web-assets-artifact/. ./
- name: Compute shard test modules
id: shard_modules
run: echo "modules=$(node scripts/shard-test-modules.js ${{ matrix.shardIndex }} 6)" >> "$GITHUB_OUTPUT"
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step hardcodes the total shard count as 6 even though the matrix already defines shardTotal. To avoid drift if the shard count changes later, pass ${{ matrix.shardTotal }} to shard-test-modules.js instead of a constant.

Suggested change
run: echo "modules=$(node scripts/shard-test-modules.js ${{ matrix.shardIndex }} 6)" >> "$GITHUB_OUTPUT"
run: echo "modules=$(node scripts/shard-test-modules.js ${{ matrix.shardIndex }} ${{ matrix.shardTotal }})" >> "$GITHUB_OUTPUT"

Copilot uses AI. Check for mistakes.
Comment on lines +17 to +27
JUNIT_REPORTER_ARGS=""
if [ -n "${JUNIT_OUTPUT_FILE-}" ]; then
JUNIT_REPORTER_ARGS="--require ${SCRIPT_DIR}/../../scripts/junit-reporter.js"
fi

LOG_LEVELS="$EFFECTIVE_LOG_LEVELS" \
NODE_NO_WARNINGS=1 \
PGPORT=55436 \
STRIPE_WEBHOOK_SECRET=stripe-webhook-secret \
STRIPE_API_KEY=stripe-api-key \
qunit --require ts-node/register/transpile-only "$@" tests/index.ts
qunit --require ts-node/register/transpile-only $JUNIT_REPORTER_ARGS "$@" tests/index.ts
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$JUNIT_REPORTER_ARGS is expanded unquoted, and it's built as a single string that contains both the flag and the path. This relies on word-splitting and can break if the workspace path contains spaces (or other IFS characters). Consider using a Bash array for optional args (e.g., JUNIT_REPORTER_ARGS=(--require "...")) and expanding it as "${JUNIT_REPORTER_ARGS[@]}".

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants