fix(tests): stabilise e2e suite on macOS CI runners #50
Merged
mickvandijke merged 1 commit into `main` on Apr 21, 2026
The recurring `InsufficientPeers("Got 6 quotes, need 7")` flake on
macos-latest runners was a test-infrastructure problem, not a client
bug. Two changes close it:
1. Bump `DEFAULT_NODE_COUNT` from `CLOSE_GROUP_SIZE + 1` (8) to
`CLOSE_GROUP_SIZE * 2` (14). The old value left zero slack - quote
collection needs `CLOSE_GROUP_SIZE` peers to respond, so a single
slow peer failed the whole test. Doubling the group gives a full
extra group of redundancy.
2. Add `test_client_config()` helper with 60s quote/store timeouts
(prod default is 10s). E2E tests run a full P2P network in one
CI VM; macOS GitHub runners are nested-virt and have roughly half
the CPU throughput of Linux runners, so the 8-node QUIC handshake
burst routinely took >10s per peer under load. Linux runners
finished in time; macOS did not. Prod defaults stay at 10s -
this only affects the loopback MiniTestnet.
The merkle suite already used 120s for the same reason; this brings
the rest of the e2e suite into line at 60s.
mickvandijke
approved these changes
Apr 21, 2026
Context
The E2E suite has been recurrently failing on `macos-latest` with `InsufficientPeers("Got 6 quotes, need 7")`.
The same root cause surfaces in `test_chunk_exists`, `test_chunk_put_duplicate_skips_payment`, `test_payment_required_enforcement` and a few others. Always 6-of-7 on macOS; Linux runners pass consistently.
Why only macOS
This looked weird — "it's slow networking" doesn't explain a platform split. It turns out to be test-infrastructure, not a client bug:
1. Zero redundancy in the testnet. `CLOSE_GROUP_SIZE` is 7, and `DEFAULT_NODE_COUNT` was `CLOSE_GROUP_SIZE + 1 = 8`. Quote collection gates on `quotes.len() >= CLOSE_GROUP_SIZE`, i.e. all 7 of the remote peers (the client is the 8th) must answer. One slow peer = fail. The 2x over-query is a no-op because only 8 nodes exist.
2. macOS GitHub runners are roughly half the CPU throughput of Linux runners. They're nested-virt on Anka/VMware. Spawning 8 QUIC nodes in-process and handshaking to all of them simultaneously saturates the available CPU during the handshake burst. On Linux, all 8 handshakes finish comfortably under the 10s per-peer timeout. On macOS, one or two routinely don't: the test sees 6 successes, one timeout failure, and no way to reach quorum.
Put together: the test had no slack at all and was running on the one platform that couldn't meet the timing. Merkle was unaffected because it already bumped its own config to 120s.
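The zero-slack gate described above can be illustrated with a small, self-contained model. This is a sketch, not the client's actual code: `QuoteError`, `collect_quotes` and the latency figures are hypothetical stand-ins; only `CLOSE_GROUP_SIZE`, the `quotes.len() >= CLOSE_GROUP_SIZE` check and the error text come from this PR.

```rust
use std::time::Duration;

/// Close-group size quoted in this PR.
const CLOSE_GROUP_SIZE: usize = 7;

/// Hypothetical stand-in for the client's error type; only the variant
/// that appears in the failure message is modelled.
#[derive(Debug, PartialEq)]
enum QuoteError {
    InsufficientPeers(String),
}

/// Hypothetical model of quote collection: count the peers that answer
/// within `timeout`, then apply the `quotes >= CLOSE_GROUP_SIZE` gate.
fn collect_quotes(latencies: &[Duration], timeout: Duration) -> Result<usize, QuoteError> {
    let quotes = latencies.iter().filter(|l| **l <= timeout).count();
    if quotes >= CLOSE_GROUP_SIZE {
        Ok(quotes)
    } else {
        Err(QuoteError::InsufficientPeers(format!(
            "Got {} quotes, need {}",
            quotes, CLOSE_GROUP_SIZE
        )))
    }
}

fn main() {
    // Old testnet: 8 nodes, so 7 remote peers. Illustrative latencies only:
    // six peers answer in 1 s, one takes 12 s during the handshake burst.
    let mut peers = vec![Duration::from_secs(1); 6];
    peers.push(Duration::from_secs(12));

    // With the 10 s per-peer timeout a single slow peer breaks quorum.
    println!("{:?}", collect_quotes(&peers, Duration::from_secs(10)));
    // prints: Err(InsufficientPeers("Got 6 quotes, need 7"))
}
```

In this model, either a longer timeout or extra peers lets the same slow peer be tolerated, which is the shape of the two changes below.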
Fix
Two changes, both test-only; prod `ClientConfig::default()` is unchanged:
1. `DEFAULT_NODE_COUNT` from `CLOSE_GROUP_SIZE + 1` (8) to `CLOSE_GROUP_SIZE * 2` (14). One full extra group of slack. Up to 7 peers can be slow and the test still reaches quorum. Extra cost: ~1.2 s of spawn delay per test setup.
2. New `test_client_config()` helper in `tests/support/mod.rs` with `quote_timeout_secs = store_timeout_secs = 60`. All `e2e_*` files except `e2e_merkle.rs` (which already had its own 120 s config) switch from `ClientConfig::default()` to this helper. 60 s is deliberately conservative: in the happy path everything completes in ~1 s, the extra budget only shows up on flakes.
Both constants carry doc comments explaining why they're different from production, since the numbers look surprising at a glance.
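For readers without the diff open, the two changes might look roughly like the sketch below. It is an approximation under stated assumptions, not the merged code: the `ClientConfig` struct here is a minimal stand-in with only the two timeout fields named above, and the constant name `TEST_TIMEOUT_SECS` is hypothetical.

```rust
/// Close-group size quoted in this PR.
const CLOSE_GROUP_SIZE: usize = 7;

/// Test-only node count: two full close groups, so quote collection
/// has a whole extra group of slack (was `CLOSE_GROUP_SIZE + 1`).
const DEFAULT_NODE_COUNT: usize = CLOSE_GROUP_SIZE * 2;

/// Test-only timeout in seconds (hypothetical constant name). The
/// production default stays at 10 s; 60 s only absorbs slow CI runners.
const TEST_TIMEOUT_SECS: u64 = 60;

/// Minimal stand-in for the real `ClientConfig`; only the two timeout
/// fields mentioned in this PR are modelled.
#[derive(Debug, Clone, PartialEq)]
struct ClientConfig {
    quote_timeout_secs: u64,
    store_timeout_secs: u64,
}

impl Default for ClientConfig {
    fn default() -> Self {
        // Production defaults as stated in this PR.
        Self {
            quote_timeout_secs: 10,
            store_timeout_secs: 10,
        }
    }
}

/// Sketch of the helper: start from the production defaults and
/// override only the two timeouts.
fn test_client_config() -> ClientConfig {
    ClientConfig {
        quote_timeout_secs: TEST_TIMEOUT_SECS,
        store_timeout_secs: TEST_TIMEOUT_SECS,
        // Redundant for this two-field stand-in; shown because a fuller
        // config struct would inherit its remaining fields this way.
        ..ClientConfig::default()
    }
}

fn main() {
    println!("nodes = {}", DEFAULT_NODE_COUNT);
    println!("prod  = {:?}", ClientConfig::default());
    println!("test  = {:?}", test_client_config());
}
```

Building the helper on top of `ClientConfig::default()` keeps any other production defaults in force, so the tests differ from production only in the two timeouts.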
Test plan
- `cargo fmt --all --check`: clean
- `cargo clippy --all-targets --all-features -- -D warnings`: clean
- `cargo test -p ant-core --test e2e_chunk test_chunk_put_get_round_trip`: passes locally (macOS)
- `cargo test -p ant-core --test e2e_security test_attack_corrupted_public_key`: passes locally; this is the one that was flaking
- `cargo test -p ant-core --test e2e_payment test_payment_required_enforcement`: passes locally; the other flake