feat(cc): SEARCH#3518
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3518 +/- ##
==========================================
+ Coverage 94.56% 94.82% +0.26%
==========================================
Files 128 115 -13
Lines 39816 38102 -1714
Branches 39816 38102 -1714
==========================================
- Hits 37651 36131 -1520
+ Misses 1328 1268 -60
+ Partials 837 703 -134
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Only changes to function signatures on the SlowStart trait in this file.
Merging this PR will not alter performance
Comparing Footnotes
|
There was a problem hiding this comment.
Review: SEARCH slow-start implementation (draft-chung-ccwg-search-09)
Clean, well-structured implementation. The algorithm logic — bin-based delivery-rate tracking, interpolated lookback, and normalized-difference exit criterion — reads correctly against the draft. The SlowStart trait extension is minimal and the integration into ClassicCongestionController is straightforward.
Key observations:
-
Floating-point elimination: The earlier feedback from @larseggert about avoiding
f64has been addressed — all arithmetic now uses integer fixed-point withSCALE = 100. Nice. -
Division-by-zero risk: The
initializeguard only rejectsDuration::ZERORTTs, but the computedbin_durationcan still be zero for very small non-zero RTTs (see inline comment). This creates panic paths inupdate_binsandon_packets_acked. -
Drain-phase omission: The decision to skip the draft §3.2 drain-phase is well-reasoned given CUBIC's buffer-filling behavior. The
TODOfor telemetry capture of the drain-target is a good follow-up. -
Tests: The PR description notes tests will be added separately. The SEARCH algorithm has several interesting edge cases worth covering: initialization on first ACK, bin skipping/reset for stale data, the
prev_idx <= Wearly-return guard,compute_sentinterpolation boundaries, and the exit threshold itself. Looking forward to seeing those. -
Spec divergences are well-documented: The
NOTEcomments for divergences from draft-09 (reset mechanism, bit-shifting optimization, interpolation direction fix, drain-phase) are clear and cite the relevant draft sections — this is exactly the right approach for an implementation tracking an evolving draft.
No blocking issues found. The bin_duration == 0 panic path (inline) is the most important item to address.
There was a problem hiding this comment.
Pull request overview
Adds the SEARCH (Slow start Exit At Right CHokepoint) slow-start exit algorithm to neqo-transport congestion control, enabling heuristic slow-start exit based on delivery-rate flattening per the SEARCH draft.
Changes:
- Introduces
cc::Searchand wires it intoPacketSenderfor NewReno and Cubic. - Extends the
SlowStarttrait to accept per-ACK delivered bytes andnow, and per-send packet bytes; updates HyStart/ClassicSlowStart and related tests accordingly. - Updates clippy doc identifier allowlist for “CHokepoint”.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| neqo-transport/src/sender.rs | Wires SlowStart::Search into CC selection for NewReno/Cubic. |
| neqo-transport/src/cc/search.rs | New SEARCH slow-start exit algorithm implementation. |
| neqo-transport/src/cc/mod.rs | Registers and exports Search; adds SlowStart::Search enum variant. |
| neqo-transport/src/cc/classic_cc.rs | Extends SlowStart trait API; plumbs new args through CC core. |
| neqo-transport/src/cc/hystart.rs | Updates HyStart to new SlowStart trait signature. |
| neqo-transport/src/cc/classic_slow_start.rs | Updates ClassicSlowStart to new SlowStart trait signature. |
| neqo-transport/src/cc/tests/hystart.rs | Updates HyStart test suite to new SlowStart trait signature. |
| .clippy.toml | Allows “CHokepoint” in rustdoc identifiers. |
| /// Slow start Exit At Right CHokepoint (SEARCH). | ||
| /// | ||
| /// Exits slow start when the delivery rate flattens, indicating the network is | ||
| /// near its bottleneck capacity. | ||
| /// | ||
| /// <https://datatracker.ietf.org/doc/html/draft-chung-ccwg-search-09> | ||
| #[derive(Debug)] | ||
| pub struct Search { |
There was a problem hiding this comment.
SEARCH is a new slow-start exit algorithm and changes congestion-control behavior, but there are no unit/integration tests covering its exit conditions, resets (missed bins), and division-by-zero/edge cases. Please add tests alongside the existing cc test suites (e.g., similar to hystart.rs) before merging to prevent regressions.
| let bins_last_rtt = (rtt_nanos / bin_nanos) as usize; | ||
| let prev_idx = curr_idx.saturating_sub(bins_last_rtt); | ||
| let fraction = ((rtt_nanos % bin_nanos) * Self::SCALE as u128 / bin_nanos) as usize; |
There was a problem hiding this comment.
Warning
These u128 as usize casts will be flagged by clippy::cast_possible_truncation (enabled via the workspace pedantic lint group). update_bins already handles the identical pattern with the safer usize::try_from(...).unwrap_or(usize::MAX) — these should be consistent.
For bins_last_rtt, unwrap_or(usize::MAX) is correct since an oversized value is caught by the EXTRA_BINS guard below. For fraction, the value is provably < SCALE (100) so truncation is impossible, but clippy can't see that.
| let bins_last_rtt = (rtt_nanos / bin_nanos) as usize; | |
| let prev_idx = curr_idx.saturating_sub(bins_last_rtt); | |
| let fraction = ((rtt_nanos % bin_nanos) * Self::SCALE as u128 / bin_nanos) as usize; | |
| let bins_last_rtt = usize::try_from(rtt_nanos / bin_nanos).unwrap_or(usize::MAX); | |
| let prev_idx = curr_idx.saturating_sub(bins_last_rtt); | |
| let fraction = usize::try_from((rtt_nanos % bin_nanos) * Self::SCALE as u128 / bin_nanos) | |
| .unwrap_or(0); |
There was a problem hiding this comment.
What is our take in cases like this? Of course it is nice to guard against u128 as usize truncation in general, but in this case bin_nanos is derived from initial_rtt during initialization, so with realistic values rtt_nanos / bin_nanos should never not fit into a usize and those values are purely u128 because we use .as_nanos() for division accuracy.
Imo the readability here is worth more than the additional complexity of guarding against a truncation that won't ever happen in practice.
There was a problem hiding this comment.
Clean implementation of SEARCH draft-09. The algorithm logic — bin-based delivery-rate tracking, interpolated lookback via compute_sent, and normalized-difference exit criterion — reads correctly against the spec. Constants (W=10, THRESH=0.26, WINDOW_SIZE_FACTOR=3.50) all match.
The intentional divergences are well-documented with rationale and spec citations:
- Stale-data reset (§ missed bins): good addition for app-limited and flow-control-limited scenarios.
- Drain-phase omission (§3.2): well-reasoned given CUBIC's buffer-filling behavior; the TODO for telemetry capture is a good follow-up.
- Interpolation direction fix: correcting the draft's forward interpolation to backward — noted as discussed with the SEARCH team.
The SlowStart trait extension is minimal. Making on_packet_sent a default no-op is a good design choice — it keeps ClassicSlowStart simple while letting SEARCH and HyStart override only what they need. The new new_acked_bytes and now parameters are well-motivated by SEARCH's delivery-rate tracking needs.
Two inline findings (neither blocking): a clippy::cast_possible_truncation lint issue with as usize casts, and a minor 32-bit arithmetic overflow concern in compute_sent.
The previous bot-review items (typos, test parameter confusion) appear to have been addressed in the latest commit.
There was a problem hiding this comment.
Good implementation of SEARCH draft-09. The integer-math refactor from floating point is clean, the circular buffer indexing is correct (the prev_idx <= W guard ensures compute_sent never underflows on old - 1), and the spec divergences (no drain phase, optional missed-bin reset) are well-documented with rationale.
Architecture: The SlowStart trait extension (new_acked_bytes, now) is a reasonable approach — the new parameters are plumbed through cleanly and existing impls (ClassicSlowStart, HyStart) just ignore them. The on_packet_sent default body {} is a nice simplification vs. requiring every impl to provide a no-op.
Key items from this review:
- 32-bit overflow in
norm_diffcomputation (same class as the interpolation issue already discussed, but different location) u128 as usizecasts should useusize::try_fromfor consistency withupdate_binsdebug_assert!(false)gives no visibility in release builds — consider adding aqwarn!
Tests are still pending per the PR description — looking forward to seeing those cover the exit condition, missed-bin reset, and edge cases (zero/tiny RTT, bins_last_rtt = 0).
| return None; | ||
| } | ||
| let diff = prev_sent.saturating_sub(curr_delv); | ||
| let norm_diff = diff * Self::SCALE / prev_sent; |
There was a problem hiding this comment.
Warning
diff * Self::SCALE can overflow on 32-bit targets when diff > ~43 MB (same class of issue flagged for compute_sent, but at a different location). Since diff <= prev_sent the final result is bounded by SCALE, but the intermediate product overflows first.
Consider rewriting to avoid the widening multiplication:
| let norm_diff = diff * Self::SCALE / prev_sent; | |
| let norm_diff = diff / (prev_sent / Self::SCALE); |
That loses a bit of precision (integer division twice), but avoids overflow entirely. Alternatively, widen to u64:
let norm_diff = (diff as u64 * Self::SCALE as u64 / prev_sent as u64) as usize;| if self.bin_duration.is_zero() { | ||
| debug_assert!( | ||
| false, | ||
| "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | ||
| ); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Caution
When initialize returns early (zero bin_duration), curr_idx remains None, so on_packets_acked will re-attempt initialization on every subsequent ACK. That's fine for recovery once a valid RTT arrives.
However, the early return leaves bin_end as None and bin_duration as Duration::ZERO — so in release builds (debug_assert is a no-op) we silently enter a state where SEARCH is permanently disabled for this connection if every latest_rtt() value remains below ~3ns (yielding a zero bin_duration after integer division). While that's only a theoretical concern, the debug_assert!(false, ...) pattern is unconventional — a qdebug! or qwarn! would give visibility in release builds too.
| if self.bin_duration.is_zero() { | |
| debug_assert!( | |
| false, | |
| "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | |
| ); | |
| return; | |
| } | |
| if self.bin_duration.is_zero() { | |
| debug_assert!( | |
| false, | |
| "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | |
| ); | |
| qwarn!("SEARCH: bin_duration is zero (initial_rtt too small), skipping initialization"); | |
| return; | |
| } |
|
|
||
| // Initialize on first ACK. | ||
| if self.curr_idx.is_none() { | ||
| self.initialize(rtt, now); |
There was a problem hiding this comment.
Note
Per draft-09 §3.2 step 5, the algorithm initializes bin_duration using the RTT from the first ACK (initial_rtt). This value is then fixed for the lifetime of the SEARCH instance (until reset()). If the first RTT sample is an outlier (e.g., inflated by server processing delay), the bin grid will be permanently mis-calibrated — bins that are too wide reduce SEARCH's temporal resolution, and bins that are too narrow increase the chance of hitting the prev_idx > EXTRA_BINS bail-out.
This follows the draft, so no change needed now, but it might be worth tracking as a potential improvement: re-initializing with smoothed_rtt after a few samples could make SEARCH more robust to noisy first RTTs.
There was a problem hiding this comment.
This is a real concern - the first RTT sample is usually too high.
There was a problem hiding this comment.
Good point, but I'm not sure it's worth the tradeoff.
Since bin_duration fixes the grid for the byte-counters, re-initializing it would de-validate all collected samples thus far, which would delay the point where SEARCH has enough data to actually check for the exit point. Using smoothed_rtt we'd need to wait multiple RTTs until we have a good approximation.
On small bandwidth connections that might make us miss the detection window before loss occurs.
In contrast too big a bin_duration from a high first RTT also delays the point where we have passed enough bins for a SEARCH check, but by less than being idle for multiple RTTs.
One option would be something like min(first_rtt, second_rtt). Then we miss just one RTT and if the second one is actually lower this might be better than having an inflated bin_duration.
I assume the first RTT is usually higher because of handshake overhead? By how much usually?
There was a problem hiding this comment.
Can you re-bin existing samples if you detect the initial RTT was too high?
There was a problem hiding this comment.
No, I don't think so.
SEARCH records snapshots of the cumulative acked/sent bytes counters when crossing a bin_duration boundary. There is no way of knowing where those counters would've been at for the new bin_duration when changing it retrospectively.
I suggest we keep it as is per the spec for now to see how it performs but keep the min(first_rtt, second_rtt) in the back of our heads as potential tuning. I will make a note to also add metrics for this in my follow on SEARCH-metric PR, so we can see how often and by how much the first RTT is inflated.
Is my intuition tracking with your experience that if the first RTT is inflated the second will most likely already be back to normal, i.e. min(first_rtt, second_rtt) would be enough of a mechanism to prevent the too high value?
There was a problem hiding this comment.
Summary
Clean implementation of SEARCH 4.0 (draft-09) that integrates well via the existing SlowStart trait. The trait extension is nicely done — adding a default no-op for on_packet_sent avoids boilerplate in implementors that don't need it. Divergences from the spec (no drain phase, interpolation direction fix, optional reset on missed bins) are well-documented with rationale and spec links.
The circular buffer arithmetic is correct — I verified that NUM_ACKED_BINS (11) and NUM_SENT_BINS (26) are exactly sized to avoid aliasing given the access patterns in compute_delv and compute_sent, including the EXTRA_BINS guard at line 266.
The new tests cover initialization, bin advancement, skipped-bin propagation, reset/reinit, and the end-to-end exit condition. A few targeted edge-case tests would round out coverage (noted inline).
Most arithmetic concerns (32-bit overflow, bare as usize casts) were raised in prior review rounds — I've added concrete suggestions for the remaining instances. The debug_assert! + #[should_panic] interaction is a new finding that should be addressed before merge.
| #[test] | ||
| #[should_panic( | ||
| expected = "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | ||
| )] | ||
| fn initialization_with_zero_rtt() { | ||
| init_search(Duration::ZERO); | ||
| } |
There was a problem hiding this comment.
Warning
#[should_panic] relies on the debug_assert!(false, ...) in initialize(), which is stripped in release builds. If tests are ever run with --release, this test will fail (no panic fires, then init_search panics at the bin_end assertion with a different message that doesn't match expected).
Guard the test with #[cfg(debug_assertions)] to make this explicit:
| #[test] | |
| #[should_panic( | |
| expected = "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | |
| )] | |
| fn initialization_with_zero_rtt() { | |
| init_search(Duration::ZERO); | |
| } | |
| #[test] | |
| #[cfg(debug_assertions)] | |
| #[should_panic( | |
| expected = "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | |
| )] | |
| fn initialization_with_zero_rtt() { | |
| init_search(Duration::ZERO); | |
| } |
There was a problem hiding this comment.
Or just make it expect the bin_end assertion text.
| // `0..[Self::SCALE]` for interpolation in `compute_sent`. | ||
| let rtt_nanos = rtt.as_nanos(); | ||
| let bin_nanos = self.bin_duration.as_nanos(); | ||
| let bins_last_rtt = (rtt_nanos / bin_nanos) as usize; |
There was a problem hiding this comment.
Warning
Bare as usize casts from u128 here and on line 256 are inconsistent with the defensive usize::try_from(...).unwrap_or(usize::MAX) pattern used in update_bins (line 119). While bins_last_rtt is bounded in practice (RTT / bin_duration ≈ 2.86 for typical ratios), and fraction is always < 100, using the same style throughout would be more robust and consistent.
| let bins_last_rtt = (rtt_nanos / bin_nanos) as usize; | |
| let bins_last_rtt = usize::try_from(rtt_nanos / bin_nanos).unwrap_or(usize::MAX); |
| return None; | ||
| } | ||
| let diff = prev_sent.saturating_sub(curr_delv); | ||
| let norm_diff = diff * Self::SCALE / prev_sent; |
There was a problem hiding this comment.
Warning
diff * Self::SCALE can overflow usize on 32-bit targets when diff exceeds ~42 MB (usize::MAX / 100). This is reachable during slow start with large BDPs. Use saturating_mul for consistency with the rest of the arithmetic in this file:
| let norm_diff = diff * Self::SCALE / prev_sent; | |
| let norm_diff = diff.saturating_mul(Self::SCALE) / prev_sent; |
There was a problem hiding this comment.
Would the saturating_mul limit growth for BDPs > ~42MB?
| pn += 1; | ||
| bytes_this_round += bytes_this_round / 4; | ||
| } | ||
| } |
There was a problem hiding this comment.
Tip
The exit-condition test is good. Consider adding a targeted unit test for the prev_sent == 0 early-return path (line 276 of search.rs) — e.g., trigger a SEARCH check window where no packets were sent in the lookback range. Also, a test verifying that compute_sent interpolation produces the correct weighted value for a known fraction would increase confidence in the interpolation math independently of the end-to-end exit test.
| if self.bin_duration.is_zero() { | ||
| debug_assert!( | ||
| false, | ||
| "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | ||
| ); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Note
When initialize returns early here (release build, zero/tiny RTT), curr_idx stays None and SEARCH silently retries initialization on every subsequent ACK until a non-zero bin_duration results. This is safe but wasteful. Consider setting a flag to avoid the repeated retry, or logging a one-time warning so operators can diagnose why SEARCH never activates.
| if self.bin_duration.is_zero() { | |
| debug_assert!( | |
| false, | |
| "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | |
| ); | |
| return; | |
| } | |
| if self.bin_duration.is_zero() { | |
| debug_assert!( | |
| false, | |
| "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small" | |
| ); | |
| // Mark as permanently disabled to avoid retrying on every ACK. | |
| self.bin_end = Some(now); | |
| return; | |
| } |
(With this, the now <= bin_end check on line 240 would early-return on subsequent ACKs without re-entering initialize. Alternative: use a dedicated disabled: bool field if you prefer clarity over reusing bin_end.)
|
@omansfeld there are a lot of QNS timeouts that other recent PRs such as #3586 (comment) do not have. That is suspicious. |
- for byte/index calculations - also adds a helper for `prev_idx` calculation - fixes existing should_panic test
There was a problem hiding this comment.
Clean implementation of SEARCH 4.0 (draft-09). The SlowStart trait extension is minimal and well-designed (default no-op for on_packet_sent, new parameters plumbed cleanly). Spec divergences (no drain phase, missed-bin reset, interpolation direction fix) are well-documented with rationale and citations.
The new test suite covers the core paths: initialization, bin advancement, skipped-bin propagation, reset/reinit, cumulative byte tracking, prev_idx/fraction calculation, and the end-to-end exit condition. The #[should_panic] test is correctly gated with #[cfg(debug_assertions)].
Main concern: the 32-bit arithmetic overflow in norm_diff and compute_sent — see inline comments for why saturating_mul (previously suggested) is not a safe fix and would cause false slow-start exits on 32-bit targets. Widening to u64 is the right approach.
| return None; | ||
| } | ||
| let diff = prev_sent.saturating_sub(curr_delv); | ||
| let norm_diff = diff * Self::SCALE / prev_sent; |
There was a problem hiding this comment.
Caution
Re @larseggert's open question on the prior thread: saturating_mul is not safe here. When diff * SCALE saturates to usize::MAX on 32-bit, norm_diff = usize::MAX / prev_sent yields a huge value (e.g., ~95 for a 43 MB prev_sent), far exceeding THRESH=26. This causes a false slow-start exit, not limited growth.
Since neqo targets 32-bit Android (Firefox ARMv7), widen to u64 instead — the result is bounded by SCALE (100), so the final truncation is always safe:
| let norm_diff = diff * Self::SCALE / prev_sent; | |
| let norm_diff = (diff as u64 * Self::SCALE as u64 / prev_sent as u64) as usize; |
| const fn compute_sent(&self, old: usize, new: usize, fraction: usize) -> usize { | ||
| // NOTE: SEARCH draft-09 does forward interpolation here, i.e. `new + 1`/`old + 1`. That is | ||
| // a mistake in the draft and has been discussed with the SEARCH team. Subtracting | ||
| // is correct. | ||
| let mut sent = (self.sent_bins[(new - 1) % Self::NUM_SENT_BINS] | ||
| .saturating_sub(self.sent_bins[(old - 1) % Self::NUM_SENT_BINS])) | ||
| .saturating_mul(fraction); | ||
|
|
||
| sent = sent.saturating_add( | ||
| (self.sent_bins[new % Self::NUM_SENT_BINS] | ||
| .saturating_sub(self.sent_bins[old % Self::NUM_SENT_BINS])) | ||
| .saturating_mul(Self::SCALE - fraction), | ||
| ); | ||
| sent / Self::SCALE | ||
| } |
There was a problem hiding this comment.
Warning
Same class of 32-bit overflow as the norm_diff computation. compute_sent returns sent / Self::SCALE, and if the intermediate products saturated to usize::MAX, the result is usize::MAX / 100 ≈ 42.9M on 32-bit — which inflates prev_sent and can trigger a false slow-start exit downstream.
Consider widening the interpolation to u64 as well:
| const fn compute_sent(&self, old: usize, new: usize, fraction: usize) -> usize { | |
| // NOTE: SEARCH draft-09 does forward interpolation here, i.e. `new + 1`/`old + 1`. That is | |
| // a mistake in the draft and has been discussed with the SEARCH team. Subtracting | |
| // is correct. | |
| let mut sent = (self.sent_bins[(new - 1) % Self::NUM_SENT_BINS] | |
| .saturating_sub(self.sent_bins[(old - 1) % Self::NUM_SENT_BINS])) | |
| .saturating_mul(fraction); | |
| sent = sent.saturating_add( | |
| (self.sent_bins[new % Self::NUM_SENT_BINS] | |
| .saturating_sub(self.sent_bins[old % Self::NUM_SENT_BINS])) | |
| .saturating_mul(Self::SCALE - fraction), | |
| ); | |
| sent / Self::SCALE | |
| } | |
| const fn compute_sent(&self, old: usize, new: usize, fraction: usize) -> usize { | |
| // NOTE: SEARCH draft-09 does forward interpolation here, i.e. `new + 1`/`old + 1`. That is | |
| // a mistake in the draft and has been discussed with the SEARCH team. Subtracting | |
| // is correct. | |
| let a = (self.sent_bins[(new - 1) % Self::NUM_SENT_BINS] | |
| .saturating_sub(self.sent_bins[(old - 1) % Self::NUM_SENT_BINS])) | |
| as u64 | |
| * fraction as u64; | |
| let b = (self.sent_bins[new % Self::NUM_SENT_BINS] | |
| .saturating_sub(self.sent_bins[old % Self::NUM_SENT_BINS])) | |
| as u64 | |
| * (Self::SCALE - fraction) as u64; | |
| (a.saturating_add(b) / Self::SCALE as u64) as usize | |
| } |
| pn += 1; | ||
| bytes_this_round += bytes_this_round / 4; | ||
| } | ||
| } |
There was a problem hiding this comment.
Note
The curr_idx - prev_idx >= EXTRA_BINS guard on search.rs:294 (which handles excessive RTT inflation making the lookback impossible) has no test coverage. Consider adding a test that simulates significant RTT inflation (e.g., initial_rtt=100ms then rtt=600ms, giving bins_last_rtt ≈ 17 > EXTRA_BINS=15) and verifies SEARCH returns None rather than using stale data.
Benchmark resultsSignificant performance differences relative to 8c18132. transfer/1-conn/1-100mb-resp (aka. Download): 💔 Performance has regressed by +2.9155%. time: [206.48 ms 206.78 ms 207.12 ms]
thrpt: [482.82 MiB/s 483.61 MiB/s 484.32 MiB/s]
change:
time: [+2.6759% +2.9155% +3.1529] (p = 0.00 < 0.05)
thrpt: [-3.0566% -2.8329% -2.6062]
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severetransfer/1-conn/1-100mb-req (aka. Upload): 💔 Performance has regressed by +3.1347%. time: [208.20 ms 208.61 ms 209.03 ms]
thrpt: [478.39 MiB/s 479.36 MiB/s 480.31 MiB/s]
change:
time: [+2.8432% +3.1347% +3.4175] (p = 0.00 < 0.05)
thrpt: [-3.3046% -3.0394% -2.7646]
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severestreams/walltime/1000-streams/each-1000-bytes: 💚 Performance has improved by -1.4408%. time: [44.793 ms 44.837 ms 44.882 ms]
change: [-1.7250% -1.4408% -1.2257] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severeAll resultstransfer/1-conn/1-100mb-resp (aka. Download): 💔 Performance has regressed by +2.9155%. time: [206.48 ms 206.78 ms 207.12 ms]
thrpt: [482.82 MiB/s 483.61 MiB/s 484.32 MiB/s]
change:
time: [+2.6759% +2.9155% +3.1529] (p = 0.00 < 0.05)
thrpt: [-3.0566% -2.8329% -2.6062]
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severetransfer/1-conn/10_000-parallel-1b-resp (aka. RPS): No change in performance detected. time: [284.33 ms 286.17 ms 288.04 ms]
thrpt: [34.718 Kelem/s 34.944 Kelem/s 35.171 Kelem/s]
change:
time: [-1.4719% -0.5999% +0.2562] (p = 0.19 > 0.05)
thrpt: [-0.2555% +0.6036% +1.4939]
No change in performance detected.transfer/1-conn/1-1b-resp (aka. HPS): No change in performance detected. time: [38.667 ms 38.825 ms 39.002 ms]
thrpt: [25.639 B/s 25.757 B/s 25.862 B/s]
change:
time: [-1.2984% -0.5996% +0.0926] (p = 0.10 > 0.05)
thrpt: [-0.0925% +0.6032% +1.3155]
No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) high mild
7 (7.00%) high severetransfer/1-conn/1-100mb-req (aka. Upload): 💔 Performance has regressed by +3.1347%. time: [208.20 ms 208.61 ms 209.03 ms]
thrpt: [478.39 MiB/s 479.36 MiB/s 480.31 MiB/s]
change:
time: [+2.8432% +3.1347% +3.4175] (p = 0.00 < 0.05)
thrpt: [-3.3046% -3.0394% -2.7646]
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severestreams/walltime/1-streams/each-1000-bytes: Change within noise threshold. time: [592.80 µs 594.60 µs 596.72 µs]
change: [+0.0155% +0.6357% +1.2321] (p = 0.04 < 0.05)
Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
7 (7.00%) high severestreams/walltime/1000-streams/each-1-bytes: Change within noise threshold. time: [12.232 ms 12.253 ms 12.275 ms]
change: [-0.8030% -0.5649% -0.3139] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severestreams/walltime/1000-streams/each-1000-bytes: 💚 Performance has improved by -1.4408%. time: [44.793 ms 44.837 ms 44.882 ms]
change: [-1.7250% -1.4408% -1.2257] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severetransfer/walltime/pacing-false/varying-seeds: Change within noise threshold. time: [23.362 ms 23.378 ms 23.393 ms]
change: [-0.8000% -0.6899% -0.5768] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mildtransfer/walltime/pacing-true/varying-seeds: Change within noise threshold. time: [23.755 ms 23.770 ms 23.785 ms]
change: [+0.2200% +0.3117% +0.4038] (p = 0.00 < 0.05)
Change within noise threshold.transfer/walltime/pacing-false/same-seed: Change within noise threshold. time: [23.770 ms 23.787 ms 23.804 ms]
change: [+0.9409% +1.0425% +1.1427] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mildtransfer/walltime/pacing-true/same-seed: Change within noise threshold. time: [24.048 ms 24.074 ms 24.109 ms]
change: [+0.0495% +0.1853% +0.3413] (p = 0.01 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severeDownload data for |
Failed Interop TestsQUIC Interop Runner, client vs. server, differences relative to All resultsSucceeded Interop TestsQUIC Interop Runner, client vs. server neqo-pr as client
neqo-pr as server
Unsupported Interop TestsQUIC Interop Runner, client vs. server neqo-pr as client
neqo-pr as server
|
Client/server transfer resultsPerformance differences relative to 8c18132. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Table above only shows statistically significant changes. See all results below. All resultsTransfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Download data for |
|
The Probably the easiest solution is to just widen to u64 for intermediate results when multiplying with |
SEARCH 4.0 implementation.
This is still missing tests, which I will add later to this PR, but I thought it makes sense to get the implementation into review already while I work on that.Update 2026-05-04: Have added the tests.
SEARCH specific metrics will be added in a follow-up PR.
I've marked divergences from the most recent draft-09 including reasoning for them in code comments, prefaced by
NOTE.In my limited manual testing during development SEARCH seems to exit slow start before loss very reliably -- more so than HyStart++. Exemplary output and qvis graph from neqo-client cli to cloudflare:
cc: @maryam-ataei (from SEARCH team) FYI the current implementation going into review.