ci: Move realm server tests into shards by backspace · Pull Request #4269 · cardstack/boxel

backspace · 2026-03-27T22:43:51Z

This is #3831, continued. It groups realm server tests into six shards, because having one job per test file was becoming untenable. See below for further detail on the timing metrics, but high-level: we’re wasting hours of CI time per run with repetitive setup for each test file.

The grouping is via naïve round robin, which means some shards will take longer than others. We could rig up some kind of load balancing using shard history from CI, but that seemed excessive at this point.
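As a rough illustration of the round-robin grouping described above, here is a minimal sketch. It is not the contents of `shard-test-modules.js`: the function name `roundRobinShard`, the 1-based `shardIndex`, and sorting before assignment are all assumptions made for the example.

```javascript
// Hypothetical sketch of naive round-robin sharding.
// Assumptions: shardIndex is 1-based, and modules are sorted first so that
// every CI job computes the same assignment.
function roundRobinShard(modules, shardIndex, shardTotal) {
  const sorted = [...modules].sort();
  // Module i goes to shard (i mod shardTotal) + 1.
  return sorted.filter((_, i) => i % shardTotal === shardIndex - 1);
}

const exampleModules = [
  'a-test.ts', 'b-test.ts', 'c-test.ts', 'd-test.ts',
  'e-test.ts', 'f-test.ts', 'g-test.ts',
];

// With 7 modules and 6 shards, shard 1 gets the 1st and 7th module.
console.log(roundRobinShard(exampleModules, 1, 6).join('|'));
// prints "a-test.ts|g-test.ts"
```

Because assignment ignores how long each module takes, a shard that happens to receive several slow modules will finish later than the rest, which is the imbalance mentioned above.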

We no longer need lint confirmation that every test file is manually listed in the CI workflow, as the grouping finds every file automatically.

I added a test report akin to the host tests, which makes it easier to see failures at a high level instead of having to scroll through Actions job logs:

[Image: e8c0a21ef0e0f5101a0d9e590435203ca7bf96af 2026-03-30 09-31-44]

Metrics for the existing approach

From the last 20 runs on main:

| Metric | Value |
| --- | --- |
| Shards per run | 71 (one per test module) |
| Wall-clock average | 18m 1s (first shard start → last shard finish) |
| Slowest shard | ~11m (`realm-endpoints-test.ts`) |
| Fastest shard | ~4m (most modules) |
| Avg per shard | ~5m 11s |
| Total billable minutes per run | ~370m (~6.2 hours) |

~4 minutes of every shard is pure setup overhead (checkout, init, download artifacts, start services, register users). The actual test execution in most modules is only ~1 minute. Running 71 shards means paying that 4-minute overhead 71 times = ~284 minutes of wasted setup per run.
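The overhead arithmetic can be checked with a few lines. The 71-shard figure comes from the metrics above; the 6-shard figure is only a projection that assumes per-shard setup still costs about 4 minutes, and is not a measurement from this PR. The names are made up for the example.

```javascript
// Setup overhead per shard, taken from the metrics above (~4 minutes).
const setupMinutesPerShard = 4;

// Total setup overhead paid per CI run for a given number of shards.
function setupOverheadMinutes(shardCount) {
  return shardCount * setupMinutesPerShard;
}

console.log(setupOverheadMinutes(71)); // prints 284 (existing approach)
console.log(setupOverheadMinutes(6));  // prints 24 (projection, assuming the same per-shard cost)
```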

# Conflicts: # .github/workflows/ci.yaml # README.md # packages/realm-server/tests/index.ts

github-actions · 2026-03-27T23:05:49Z

Host Test Results

1 files ±0 1 suites ±0 1h 59m 28s ⏱️ - 12m 41s
2 062 tests ±0 2 047 ✅ ±0 15 💤 ±0 0 ❌ ±0
2 077 runs ±0 2 062 ✅ ±0 15 💤 ±0 0 ❌ ±0

Results for commit eace2d9. ± Comparison against base commit 3ab3d2c.

♻️ This comment has been updated with latest results.

github-actions · 2026-03-27T23:38:36Z

Preview deployments

github-actions · 2026-03-28T00:13:50Z

Realm Server Test Results

1 files 1 suites 11m 44s ⏱️
802 tests 802 ✅ 0 💤 0 ❌
873 runs 873 ✅ 0 💤 0 ❌

Results for commit eace2d9.

♻️ This comment has been updated with latest results.


lukemelia

Does AGENTS.md need updating?

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 48365eaac5


.github/workflows/ci.yaml

packages/realm-server/scripts/junit-reporter.js

backspace · 2026-03-30T18:12:04Z

> Does AGENTS.md need updating?

I missed that, thanks. It now has an extra line about how to run multiple test files at once.

Copilot

Pull request overview

This PR restructures realm-server CI to run tests in 6 shards (instead of one job per test module) to reduce repeated setup overhead, and adds JUnit reporting to improve visibility into failures.

Changes:

Replace per-test-module CI matrix with a 6-shard matrix and compute module lists per shard automatically.
Add support for TEST_MODULES in packages/realm-server/tests/index.ts to filter the test suite to a subset of modules.
Add per-shard JUnit output and a workflow job to merge + publish the combined test report.
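To make the `TEST_MODULES` filtering concrete, here is a minimal sketch of parsing a pipe-separated value into a list of module names. It is an illustration only: `parseTestModules` is a hypothetical name, and trimming whitespace and dropping empty entries are assumptions rather than documented behavior of `packages/realm-server/tests/index.ts`.

```javascript
// Hypothetical sketch: turn a pipe-separated TEST_MODULES value into a list.
// Assumption: entries are separated by "|", as in the AGENTS.md example.
function parseTestModules(raw) {
  if (!raw) {
    return []; // unset or empty means "no filter"
  }
  return raw
    .split('|')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

console.log(parseTestModules('card-endpoints-test.ts|another-module-test.ts'));
// prints [ 'card-endpoints-test.ts', 'another-module-test.ts' ]
```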

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| README.md | Updates realm-server test-running instructions, including `TEST_MODULES`. |
| AGENTS.md | Documents running multiple modules via `TEST_MODULES`. |
| packages/realm-server/tests/scripts/run-qunit-with-test-pg.sh | Optionally attaches a JUnit reporter when `JUNIT_OUTPUT_FILE` is set. |
| packages/realm-server/tests/indexing-test.ts | Tweaks a test to wait for coalesced indexing URLs before asserting. |
| packages/realm-server/tests/index.ts | Implements `TEST_MODULES` parsing and QUnit module filtering. |
| packages/realm-server/scripts/shard-test-modules.js | Discovers and round-robins `*-test.ts` modules into shards for CI. |
| packages/realm-server/scripts/run-test-modules.js | Utility runner that applies a QUnit filter derived from `TEST_MODULES`. |
| packages/realm-server/scripts/lint-test-shards.ts | Removes the previous CI lint that required enumerating every test module in the workflow. |
| packages/realm-server/scripts/junit-reporter.js | Adds a QUnit event-based JUnit XML reporter used in CI. |
| packages/realm-server/package.json | Updates `test:wait-for-servers` to run `test` (filtered by `TEST_MODULES`) and removes `lint:test-shards`. |
| .github/workflows/ci.yaml | Converts realm-server tests to a 6-shard matrix, uploads per-shard JUnit, merges & publishes results. |


Copilot · 2026-03-30T20:05:44Z

AGENTS.md

 - Run a single module:
  `TEST_MODULE=card-endpoints-test.ts pnpm test-module`
+- Run a list of modules:
+  `TEST_MODULES=card-endpoints-test.ts|another-module-test.ts pnpm test`


The `TEST_MODULES` example needs quoting (or escaping) because `|` is a shell pipe. As written, `TEST_MODULES=card-endpoints-test.ts|another-module-test.ts pnpm test` will try to pipe to `another-module-test.ts` as a command instead of setting the env var. Use `TEST_MODULES="card-endpoints-test.ts|another-module-test.ts" pnpm test` (or single quotes).

Suggested change

-  `TEST_MODULES=card-endpoints-test.ts|another-module-test.ts pnpm test`
+  `TEST_MODULES="card-endpoints-test.ts|another-module-test.ts" pnpm test`

Copilot · 2026-03-30T20:05:44Z

packages/realm-server/tests/indexing-test.ts

+              if (rows.length !== 1) {
+                return undefined;
+              }
+              let urls = rows[0].args.changes
+                .map((change) => change.url)
+                .sort();
+              return urls.length === expectedUrls.length ? rows[0] : undefined;


The `waitUntil` predicate only checks that the job contains the expected number of URLs, not that the set of URLs matches `expectedUrls`. The wait can therefore end early (and the later assertion then fails) instead of continuing until the coalesced URLs are actually present, which makes the test flakier. Consider comparing the sorted URL list to the sorted expected list inside the predicate and returning the row only when they match.
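One way the suggested comparison could look is sketched below. The helper names `sameUrlSet` and `matchingRow` are hypothetical, and the row shape (`rows[0].args.changes[].url`) is taken from the diff above; this is not the change that was actually made to the test.

```javascript
// Hypothetical helper: true when both lists contain the same URLs,
// regardless of order.
function sameUrlSet(actualUrls, expectedUrls) {
  const actual = [...actualUrls].sort();
  const expected = [...expectedUrls].sort();
  return (
    actual.length === expected.length &&
    actual.every((url, i) => url === expected[i])
  );
}

// Hypothetical predicate body: return the row only when its URLs match
// expectedUrls exactly, so the wait keeps polling otherwise.
function matchingRow(rows, expectedUrls) {
  if (rows.length !== 1) {
    return undefined;
  }
  const urls = rows[0].args.changes.map((change) => change.url);
  return sameUrlSet(urls, expectedUrls) ? rows[0] : undefined;
}
```

With a length-only check, a row holding two unrelated URLs would satisfy the predicate whenever two URLs were expected; the comparison above rejects that case.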

Copilot · 2026-03-30T20:05:45Z

.github/workflows/ci.yaml

          cp -a .test-web-assets-artifact/. ./
+      - name: Compute shard test modules
+        id: shard_modules
+        run: echo "modules=$(node scripts/shard-test-modules.js ${{ matrix.shardIndex }} 6)" >> "$GITHUB_OUTPUT"


This step hardcodes the total shard count as 6 even though the matrix already defines `shardTotal`. To avoid drift if the shard count changes later, pass `${{ matrix.shardTotal }}` to `shard-test-modules.js` instead of a constant.

Suggested change

-        run: echo "modules=$(node scripts/shard-test-modules.js ${{ matrix.shardIndex }} 6)" >> "$GITHUB_OUTPUT"
+        run: echo "modules=$(node scripts/shard-test-modules.js ${{ matrix.shardIndex }} ${{ matrix.shardTotal }})" >> "$GITHUB_OUTPUT"

Copilot · 2026-03-30T20:05:45Z

packages/realm-server/tests/scripts/run-qunit-with-test-pg.sh

+JUNIT_REPORTER_ARGS=""
+if [ -n "${JUNIT_OUTPUT_FILE-}" ]; then
+  JUNIT_REPORTER_ARGS="--require ${SCRIPT_DIR}/../../scripts/junit-reporter.js"
+fi
+
 LOG_LEVELS="$EFFECTIVE_LOG_LEVELS" \
 NODE_NO_WARNINGS=1 \
 PGPORT=55436 \
 STRIPE_WEBHOOK_SECRET=stripe-webhook-secret \
 STRIPE_API_KEY=stripe-api-key \
-qunit --require ts-node/register/transpile-only "$@" tests/index.ts
+qunit --require ts-node/register/transpile-only $JUNIT_REPORTER_ARGS "$@" tests/index.ts


`$JUNIT_REPORTER_ARGS` is expanded unquoted, and it's built as a single string that contains both the flag and the path. This relies on word-splitting and can break if the workspace path contains spaces (or other IFS characters). Consider using a Bash array for optional args (e.g., `JUNIT_REPORTER_ARGS=(--require "...")`) and expanding it as `"${JUNIT_REPORTER_ARGS[@]}"`.

IanCal and others added 4 commits January 13, 2026 09:43

Run chunks of realm-server tests on ci

797772e

Run multiple modules as a group in CI

f1139a3

Call original test command

3cc9065

Merge branch 'main' into server/group-tests-continued

ebd54d2

# Conflicts: # .github/workflows/ci.yaml # README.md # packages/realm-server/tests/index.ts

backspace changed the base branch from main to group-realm-server-tests March 27, 2026 22:44

backspace added 3 commits March 27, 2026 16:33

Fix test

7c01d8d

Fix lint errors

2e79398

Replace manual sharding with script

5527bb7

backspace added 3 commits March 27, 2026 16:46

Add reporting

3def21c

Add missed script

633390c

Fix lint error

08b469d

Merge remote-tracking branch 'origin/main' into server/group-tests-co…

d1d5c38

…ntinued

backspace changed the base branch from group-realm-server-tests to main March 30, 2026 14:42

backspace added 4 commits March 30, 2026 07:48

Restore deleted paragraph

63a1472

Merge remote-tracking branch 'origin/main' into server/group-tests-co…

b466d12

…ntinued

Remove test shards lintinng

a9d868c

Change to six shards

0e6e753

backspace changed the title ~~server: Extend test grouping~~ ci: Move realm server tests into shards Mar 30, 2026

backspace added 5 commits March 30, 2026 08:40

Add deliberate test failure

191a8cd

Remove custom shard name

6af24f2

Remove deliberate test failure

5ad24c3

Change matrix variables to match

8520c5d

Fix ordering

48365ea

backspace marked this pull request as ready for review March 30, 2026 17:49

backspace requested a review from a team March 30, 2026 17:49

lukemelia approved these changes Mar 30, 2026


chatgpt-codex-connector bot reviewed Mar 30, 2026


.github/workflows/ci.yaml (outdated)

packages/realm-server/scripts/junit-reporter.js (outdated)

backspace added 3 commits March 30, 2026 11:05

Add multi-file test explanation to AGENTS.md

521e7e6

Add filter on merge realm server reports job

49e2c98

Fix grouping by name

eace2d9

habdelra requested a review from Copilot March 30, 2026 19:10

backspace merged commit f0ce138 into main Mar 30, 2026
52 checks passed

Copilot started reviewing on behalf of habdelra March 30, 2026 19:45

Copilot AI reviewed Mar 30, 2026


backspace mentioned this pull request Mar 30, 2026

ci: Add fixes for realm server test shards #4282

Merged

backspace merged 23 commits into main from server/group-tests-continued

4 participants
