
feat(cc): SEARCH #3518

Open
omansfeld wants to merge 9 commits into mozilla:main from omansfeld:search

Conversation

Collaborator

@omansfeld omansfeld commented Mar 31, 2026

SEARCH 4.0 implementation.

This is still missing tests, which I will add later to this PR, but I thought it makes sense to get the implementation into review already while I work on that.

Update 2026-05-04: Have added the tests.

SEARCH specific metrics will be added in a follow-up PR.

I've marked divergences from the most recent draft-09, including the reasoning for them, in code comments prefaced with NOTE.

In my limited manual testing during development, SEARCH exits slow start before loss very reliably -- more so than HyStart++. Example output and a qvis graph from the neqo-client CLI to Cloudflare:

9.625 INFO stats for Client ...
  rx: 26034 drop 3 dup 0 saved 3
  tx: 35991 lost 8 lateack 0 ptoack 0 unackdrop 0
  cc:
    ce_loss 5 ce_ecn 0 ce_spurious 0
    final_cwnd Some(181065) ss_exit_cwnd Some(465764) ss_exit_reason Some(Heuristic)
[...]
[qvis graph]

cc @maryam-ataei (from the SEARCH team): FYI, this is the current implementation going into review.

@codecov

codecov Bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 89.01734% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.82%. Comparing base (f7d28ed) to head (77ef539).
⚠️ Report is 12 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3518      +/-   ##
==========================================
+ Coverage   94.56%   94.82%   +0.26%     
==========================================
  Files         128      115      -13     
  Lines       39816    38102    -1714     
  Branches    39816    38102    -1714     
==========================================
- Hits        37651    36131    -1520     
+ Misses       1328     1268      -60     
+ Partials      837      703     -134     
Flag Coverage Δ
freebsd 93.94% <89.01%> (+0.31%) ⬆️
linux 94.98% <89.01%> (+0.45%) ⬆️
macos 94.92% <89.01%> (+0.44%) ⬆️
windows 94.97% <89.01%> (+0.45%) ⬆️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
neqo-common 98.61% <ø> (ø)
neqo-crypto ∅ <ø> (∅)
neqo-http3 93.92% <ø> (ø)
neqo-qpack 95.14% <ø> (ø)
neqo-transport 95.54% <89.01%> (-0.07%) ⬇️
neqo-udp 84.90% <ø> (ø)
mtu 86.61% <ø> (ø)

Collaborator Author

Only changes to function signatures on the SlowStart trait in this file.

Comment thread neqo-transport/src/cc/search.rs Outdated
@codspeed-hq

codspeed-hq Bot commented Apr 27, 2026

Merging this PR will not alter performance

✅ 24 untouched benchmarks
⏩ 27 skipped benchmarks [1]


Comparing omansfeld:search (77ef539) with main (8c18132)


Footnotes

  1. 27 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Contributor

@github-actions github-actions Bot left a comment

Review: SEARCH slow-start implementation (draft-chung-ccwg-search-09)

Clean, well-structured implementation. The algorithm logic — bin-based delivery-rate tracking, interpolated lookback, and normalized-difference exit criterion — reads correctly against the draft. The SlowStart trait extension is minimal and the integration into ClassicCongestionController is straightforward.

Key observations:

  • Floating-point elimination: The earlier feedback from @larseggert about avoiding f64 has been addressed — all arithmetic now uses integer fixed-point with SCALE = 100. Nice.

  • Division-by-zero risk: The initialize guard only rejects Duration::ZERO RTTs, but the computed bin_duration can still be zero for very small non-zero RTTs (see inline comment). This creates panic paths in update_bins and on_packets_acked.

  • Drain-phase omission: The decision to skip the draft §3.2 drain-phase is well-reasoned given CUBIC's buffer-filling behavior. The TODO for telemetry capture of the drain-target is a good follow-up.

  • Tests: The PR description notes tests will be added separately. The SEARCH algorithm has several interesting edge cases worth covering: initialization on first ACK, bin skipping/reset for stale data, the prev_idx <= W early-return guard, compute_sent interpolation boundaries, and the exit threshold itself. Looking forward to seeing those.

  • Spec divergences are well-documented: The NOTE comments for divergences from draft-09 (reset mechanism, bit-shifting optimization, interpolation direction fix, drain-phase) are clear and cite the relevant draft sections — this is exactly the right approach for an implementation tracking an evolving draft.

No blocking issues found. The bin_duration == 0 panic path (inline) is the most important item to address.
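The normalized-difference exit criterion described above can be sketched in a few lines. This is a simplified illustration, not the actual neqo code: `SCALE` and the threshold (0.26 scaled to 26) follow the constants quoted in this thread, while the function name and `u64` types are assumptions.

```rust
// Hypothetical sketch of SEARCH's normalized-difference exit check.
const SCALE: u64 = 100; // fixed-point scale, per the integer-math refactor
const THRESH: u64 = 26; // 0.26 scaled by SCALE

/// Returns true when delivered bytes have flattened relative to what was
/// sent one RTT earlier, i.e. the network is near its bottleneck capacity.
fn should_exit(prev_sent: u64, curr_delv: u64) -> bool {
    if prev_sent == 0 {
        return false; // nothing sent in the lookback window
    }
    let diff = prev_sent.saturating_sub(curr_delv);
    // Bounded by SCALE since diff <= prev_sent.
    let norm_diff = diff * SCALE / prev_sent;
    norm_diff >= THRESH
}

fn main() {
    // Delivery keeps pace with sending: stay in slow start (norm_diff = 10).
    assert!(!should_exit(100_000, 90_000));
    // Delivery flattens well below what was sent: exit (norm_diff = 40).
    assert!(should_exit(100_000, 60_000));
    println!("ok");
}
```

Doing the multiplication in a wide-enough type before the division is what keeps the intermediate product from overflowing, which is the theme of the 32-bit discussion later in this thread.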

Comment thread neqo-transport/src/cc/search.rs Outdated
Comment thread neqo-transport/src/cc/search.rs Outdated
Comment thread neqo-transport/src/cc/search.rs
Comment thread neqo-transport/src/cc/search.rs Outdated
Comment thread neqo-transport/src/cc/search.rs
Comment thread neqo-transport/src/sender.rs
Contributor

Copilot AI left a comment

Pull request overview

Adds the SEARCH (Slow start Exit At Right CHokepoint) slow-start exit algorithm to neqo-transport congestion control, enabling heuristic slow-start exit based on delivery-rate flattening per the SEARCH draft.

Changes:

  • Introduces cc::Search and wires it into PacketSender for NewReno and Cubic.
  • Extends the SlowStart trait to accept per-ACK delivered bytes and now, and per-send packet bytes; updates HyStart/ClassicSlowStart and related tests accordingly.
  • Updates clippy doc identifier allowlist for “CHokepoint”.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
neqo-transport/src/sender.rs Wires SlowStart::Search into CC selection for NewReno/Cubic.
neqo-transport/src/cc/search.rs New SEARCH slow-start exit algorithm implementation.
neqo-transport/src/cc/mod.rs Registers and exports Search; adds SlowStart::Search enum variant.
neqo-transport/src/cc/classic_cc.rs Extends SlowStart trait API; plumbs new args through CC core.
neqo-transport/src/cc/hystart.rs Updates HyStart to new SlowStart trait signature.
neqo-transport/src/cc/classic_slow_start.rs Updates ClassicSlowStart to new SlowStart trait signature.
neqo-transport/src/cc/tests/hystart.rs Updates HyStart test suite to new SlowStart trait signature.
.clippy.toml Allows “CHokepoint” in rustdoc identifiers.

Comment thread neqo-transport/src/cc/classic_cc.rs
Comment thread neqo-transport/src/cc/search.rs Outdated
Comment thread neqo-transport/src/cc/search.rs Outdated
Comment thread neqo-transport/src/cc/search.rs Outdated
Comment thread neqo-transport/src/cc/tests/hystart.rs Outdated
Comment thread neqo-transport/src/cc/tests/hystart.rs Outdated
Comment on lines +19 to +26
/// Slow start Exit At Right CHokepoint (SEARCH).
///
/// Exits slow start when the delivery rate flattens, indicating the network is
/// near its bottleneck capacity.
///
/// <https://datatracker.ietf.org/doc/html/draft-chung-ccwg-search-09>
#[derive(Debug)]
pub struct Search {
Copilot AI Apr 27, 2026

SEARCH is a new slow-start exit algorithm and changes congestion-control behavior, but there are no unit/integration tests covering its exit conditions, resets (missed bins), and division-by-zero/edge cases. Please add tests alongside the existing cc test suites (e.g., similar to hystart.rs) before merging to prevent regressions.

Copilot generated this review using guidance from repository custom instructions.
Comment thread neqo-transport/src/cc/search.rs Outdated
Comment on lines +226 to +228
let bins_last_rtt = (rtt_nanos / bin_nanos) as usize;
let prev_idx = curr_idx.saturating_sub(bins_last_rtt);
let fraction = ((rtt_nanos % bin_nanos) * Self::SCALE as u128 / bin_nanos) as usize;
Contributor

Warning

These u128 as usize casts will be flagged by clippy::cast_possible_truncation (enabled via the workspace pedantic lint group). update_bins already handles the identical pattern with the safer usize::try_from(...).unwrap_or(usize::MAX) — these should be consistent.

For bins_last_rtt, unwrap_or(usize::MAX) is correct since an oversized value is caught by the EXTRA_BINS guard below. For fraction, the value is provably < SCALE (100) so truncation is impossible, but clippy can't see that.

Suggested change
let bins_last_rtt = (rtt_nanos / bin_nanos) as usize;
let prev_idx = curr_idx.saturating_sub(bins_last_rtt);
let fraction = ((rtt_nanos % bin_nanos) * Self::SCALE as u128 / bin_nanos) as usize;
let bins_last_rtt = usize::try_from(rtt_nanos / bin_nanos).unwrap_or(usize::MAX);
let prev_idx = curr_idx.saturating_sub(bins_last_rtt);
let fraction = usize::try_from((rtt_nanos % bin_nanos) * Self::SCALE as u128 / bin_nanos)
.unwrap_or(0);

Collaborator Author

What is our take in cases like this? Of course it is nice to guard against u128 as usize truncation in general, but in this case bin_nanos is derived from initial_rtt during initialization, so with realistic values rtt_nanos / bin_nanos will always fit into a usize; the values are only u128 because we use .as_nanos() for division accuracy.

Imo the readability here is worth more than the additional complexity of guarding against a truncation that won't ever happen in practice.

Comment thread neqo-transport/src/cc/search.rs
Contributor

@github-actions github-actions Bot left a comment

Clean implementation of SEARCH draft-09. The algorithm logic — bin-based delivery-rate tracking, interpolated lookback via compute_sent, and normalized-difference exit criterion — reads correctly against the spec. Constants (W=10, THRESH=0.26, WINDOW_SIZE_FACTOR=3.50) all match.

The intentional divergences are well-documented with rationale and spec citations:

  • Stale-data reset (§ missed bins): good addition for app-limited and flow-control-limited scenarios.
  • Drain-phase omission (§3.2): well-reasoned given CUBIC's buffer-filling behavior; the TODO for telemetry capture is a good follow-up.
  • Interpolation direction fix: correcting the draft's forward interpolation to backward — noted as discussed with the SEARCH team.

The SlowStart trait extension is minimal. Making on_packet_sent a default no-op is a good design choice — it keeps ClassicSlowStart simple while letting SEARCH and HyStart override only what they need. The new new_acked_bytes and now parameters are well-motivated by SEARCH's delivery-rate tracking needs.
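The trait shape described in the paragraph above can be illustrated with a minimal sketch. All names here are hypothetical stand-ins, not the actual neqo `SlowStart` trait; the point is the default no-op `on_packet_sent` and the extra per-ACK parameters.

```rust
use std::time::{Duration, Instant};

// Illustrative sketch of the extended trait (names are assumptions).
trait SlowStart {
    /// Per-ACK hook: delivery-rate trackers like SEARCH use
    /// `new_acked_bytes` and `now`; other implementations ignore them.
    fn on_packets_acked(&mut self, new_acked_bytes: usize, rtt: Duration, now: Instant);

    /// Default no-op: only implementations that need per-send byte
    /// accounting (e.g. SEARCH's sent bins) override this.
    fn on_packet_sent(&mut self, _bytes: usize, _now: Instant) {}
}

/// A classic slow start needs no per-send hook at all.
struct Classic {
    acked: usize,
}

impl SlowStart for Classic {
    fn on_packets_acked(&mut self, new_acked_bytes: usize, _rtt: Duration, _now: Instant) {
        self.acked += new_acked_bytes;
    }
    // `on_packet_sent` is inherited as the default no-op.
}

fn main() {
    let mut c = Classic { acked: 0 };
    c.on_packet_sent(1200, Instant::now()); // default body, nothing happens
    c.on_packets_acked(1200, Duration::from_millis(50), Instant::now());
    assert_eq!(c.acked, 1200);
    println!("ok");
}
```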

Two inline findings (neither blocking): a clippy::cast_possible_truncation lint issue with as usize casts, and a minor 32-bit arithmetic overflow concern in compute_sent.

The previous bot-review items (typos, test parameter confusion) appear to have been addressed in the latest commit.

Contributor

@github-actions github-actions Bot left a comment

Good implementation of SEARCH draft-09. The integer-math refactor from floating point is clean, the circular buffer indexing is correct (the prev_idx <= W guard ensures compute_sent never underflows on old - 1), and the spec divergences (no drain phase, optional missed-bin reset) are well-documented with rationale.

Architecture: The SlowStart trait extension (new_acked_bytes, now) is a reasonable approach — the new parameters are plumbed through cleanly and existing impls (ClassicSlowStart, HyStart) just ignore them. The on_packet_sent default body {} is a nice simplification vs. requiring every impl to provide a no-op.

Key items from this review:

  • 32-bit overflow in norm_diff computation (same class as the interpolation issue already discussed, but different location)
  • u128 as usize casts should use usize::try_from for consistency with update_bins
  • debug_assert!(false) gives no visibility in release builds — consider adding a qwarn!

Tests are still pending per the PR description — looking forward to seeing those cover the exit condition, missed-bin reset, and edge cases (zero/tiny RTT, bins_last_rtt = 0).

return None;
}
let diff = prev_sent.saturating_sub(curr_delv);
let norm_diff = diff * Self::SCALE / prev_sent;
Contributor

Warning

diff * Self::SCALE can overflow on 32-bit targets when diff > ~43 MB (same class of issue flagged for compute_sent, but at a different location). Since diff <= prev_sent the final result is bounded by SCALE, but the intermediate product overflows first.

Consider rewriting to avoid the widening multiplication:

Suggested change
let norm_diff = diff * Self::SCALE / prev_sent;
let norm_diff = diff / (prev_sent / Self::SCALE);

That loses a bit of precision (integer division twice), but avoids overflow entirely. Alternatively, widen to u64:

let norm_diff = (diff as u64 * Self::SCALE as u64 / prev_sent as u64) as usize;

Comment thread neqo-transport/src/cc/search.rs Outdated
Comment on lines +94 to +100
if self.bin_duration.is_zero() {
debug_assert!(
false,
"bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
);
return;
}
Contributor

Caution

When initialize returns early (zero bin_duration), curr_idx remains None, so on_packets_acked will re-attempt initialization on every subsequent ACK. That's fine for recovery once a valid RTT arrives.

However, the early return leaves bin_end as None and bin_duration as Duration::ZERO — so in release builds (debug_assert is a no-op) we silently enter a state where SEARCH is permanently disabled for this connection if every latest_rtt() value remains below ~3ns (yielding a zero bin_duration after integer division). While that's only a theoretical concern, the debug_assert!(false, ...) pattern is unconventional — a qdebug! or qwarn! would give visibility in release builds too.

Suggested change
if self.bin_duration.is_zero() {
debug_assert!(
false,
"bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
);
return;
}
if self.bin_duration.is_zero() {
debug_assert!(
false,
"bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
);
qwarn!("SEARCH: bin_duration is zero (initial_rtt too small), skipping initialization");
return;
}
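The "~3ns" bound mentioned above can be made concrete with a tiny standalone sketch of how integer `Duration` arithmetic truncates a small RTT to a zero bin duration. The 3.50/10 factor follows the constants quoted in this review; the function name is illustrative, not neqo code.

```rust
use std::time::Duration;

// Hypothetical: bin_duration = initial_rtt * WINDOW_SIZE_FACTOR (3.50) /
// NUM_BINS (10), done in integer math as * 350 / 1000.
fn bin_duration(initial_rtt: Duration) -> Duration {
    initial_rtt * 350 / 1000
}

fn main() {
    // 2 ns * 350 = 700 ns; 700 / 1000 truncates to 0 ns -> zero bin_duration.
    assert!(bin_duration(Duration::from_nanos(2)).is_zero());
    // 3 ns * 350 = 1050 ns; 1050 / 1000 = 1 ns -> non-zero, SEARCH can run.
    assert!(!bin_duration(Duration::from_nanos(3)).is_zero());
    println!("ok");
}
```

This is why the `Duration::ZERO` guard alone is insufficient: a non-zero but sub-3ns RTT still produces a zero `bin_duration` after the integer division.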


// Initialize on first ACK.
if self.curr_idx.is_none() {
self.initialize(rtt, now);
Contributor

Note

Per draft-09 §3.2 step 5, the algorithm initializes bin_duration using the RTT from the first ACK (initial_rtt). This value is then fixed for the lifetime of the SEARCH instance (until reset()). If the first RTT sample is an outlier (e.g., inflated by server processing delay), the bin grid will be permanently mis-calibrated — bins that are too wide reduce SEARCH's temporal resolution, and bins that are too narrow increase the chance of hitting the prev_idx > EXTRA_BINS bail-out.

This follows the draft, so no change needed now, but it might be worth tracking as a potential improvement: re-initializing with smoothed_rtt after a few samples could make SEARCH more robust to noisy first RTTs.

Collaborator

This is a real concern - the first RTT sample is usually too high.

Collaborator Author

Good point, but I'm not sure it's worth the tradeoff.

Since bin_duration fixes the grid for the byte counters, re-initializing it would invalidate all samples collected so far, which would delay the point where SEARCH has enough data to actually check for the exit point. Using smoothed_rtt, we'd need to wait multiple RTTs until we have a good approximation.

On small bandwidth connections that might make us miss the detection window before loss occurs.

In contrast, too large a bin_duration from a high first RTT also delays the point at which enough bins have passed for a SEARCH check, but by less than being idle for multiple RTTs.

One option would be something like min(first_rtt, second_rtt). Then we miss just one RTT and if the second one is actually lower this might be better than having an inflated bin_duration.

I assume the first RTT is usually higher because of handshake overhead? By how much usually?
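The min(first_rtt, second_rtt) idea floated here could be sketched roughly as follows: defer fixing bin_duration until two RTT samples exist, then use the smaller one. Everything here is a hypothetical illustration (struct, field, and constant names are assumptions), not a proposal for the actual neqo code; the 3.50/10 factor follows the constants quoted in this review.

```rust
use std::time::Duration;

const WINDOW_SIZE_FACTOR_X100: u32 = 350; // 3.50 scaled by 100
const NUM_BINS: u32 = 10; // W = 10 bins per window

/// Hypothetical calibrator: waits for two RTT samples, then fixes the
/// bin grid from the smaller of the two.
struct BinCalibrator {
    samples: Vec<Duration>,
    bin_duration: Option<Duration>,
}

impl BinCalibrator {
    fn new() -> Self {
        Self { samples: Vec::new(), bin_duration: None }
    }

    fn on_rtt_sample(&mut self, rtt: Duration) {
        if self.bin_duration.is_some() {
            return; // grid already fixed
        }
        self.samples.push(rtt);
        if self.samples.len() == 2 {
            let base = self.samples.iter().min().copied().unwrap();
            // bin_duration = base * 3.50 / 10 in integer arithmetic.
            self.bin_duration = Some(base * WINDOW_SIZE_FACTOR_X100 / (NUM_BINS * 100));
        }
    }
}

fn main() {
    let mut c = BinCalibrator::new();
    c.on_rtt_sample(Duration::from_millis(300)); // inflated first sample
    c.on_rtt_sample(Duration::from_millis(100)); // second is back to normal
    // Grid derives from the 100 ms sample: 100 ms * 0.35 = 35 ms per bin.
    assert_eq!(c.bin_duration, Some(Duration::from_millis(35)));
    println!("ok");
}
```

The cost, as noted above, is one RTT of delay before the first bin can start filling.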

Collaborator

Can you re-bin existing samples if you detect the initial RTT was too high?

Collaborator Author

No, I don't think so.

SEARCH records snapshots of the cumulative acked/sent byte counters when crossing a bin_duration boundary. There is no way of knowing where those counters would have stood at the boundaries of a new bin_duration when changing it retrospectively.

I suggest we keep it as is per the spec for now to see how it performs, but keep the min(first_rtt, second_rtt) idea in the back of our heads as potential tuning. I will make a note to also add metrics for this in my follow-up SEARCH-metrics PR, so we can see how often and by how much the first RTT is inflated.

Is my intuition tracking with your experience that if the first RTT is inflated the second will most likely already be back to normal, i.e. min(first_rtt, second_rtt) would be enough of a mechanism to prevent the too high value?

Contributor

@github-actions github-actions Bot left a comment

Summary

Clean implementation of SEARCH 4.0 (draft-09) that integrates well via the existing SlowStart trait. The trait extension is nicely done — adding a default no-op for on_packet_sent avoids boilerplate in implementors that don't need it. Divergences from the spec (no drain phase, interpolation direction fix, optional reset on missed bins) are well-documented with rationale and spec links.

The circular buffer arithmetic is correct — I verified that NUM_ACKED_BINS (11) and NUM_SENT_BINS (26) are exactly sized to avoid aliasing given the access patterns in compute_delv and compute_sent, including the EXTRA_BINS guard at line 266.

The new tests cover initialization, bin advancement, skipped-bin propagation, reset/reinit, and the end-to-end exit condition. A few targeted edge-case tests would round out coverage (noted inline).

Most arithmetic concerns (32-bit overflow, bare as usize casts) were raised in prior review rounds — I've added concrete suggestions for the remaining instances. The debug_assert! + #[should_panic] interaction is a new finding that should be addressed before merge.

Comment on lines +82 to +88
#[test]
#[should_panic(
expected = "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
)]
fn initialization_with_zero_rtt() {
init_search(Duration::ZERO);
}
Contributor

Warning

#[should_panic] relies on the debug_assert!(false, ...) in initialize(), which is stripped in release builds. If tests are ever run with --release, this test will fail (no panic fires, then init_search panics at the bin_end assertion with a different message that doesn't match expected).

Guard the test with #[cfg(debug_assertions)] to make this explicit:

Suggested change
#[test]
#[should_panic(
expected = "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
)]
fn initialization_with_zero_rtt() {
init_search(Duration::ZERO);
}
#[test]
#[cfg(debug_assertions)]
#[should_panic(
expected = "bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
)]
fn initialization_with_zero_rtt() {
init_search(Duration::ZERO);
}

Collaborator

Or just make it expect the bin_end assertion text.

Comment thread neqo-transport/src/cc/search.rs Outdated
// `0..[Self::SCALE]` for interpolation in `compute_sent`.
let rtt_nanos = rtt.as_nanos();
let bin_nanos = self.bin_duration.as_nanos();
let bins_last_rtt = (rtt_nanos / bin_nanos) as usize;
Contributor

Warning

Bare as usize casts from u128 here and on line 256 are inconsistent with the defensive usize::try_from(...).unwrap_or(usize::MAX) pattern used in update_bins (line 119). While bins_last_rtt is bounded in practice (RTT / bin_duration ≈ 2.86 for typical ratios), and fraction is always < 100, using the same style throughout would be more robust and consistent.

Suggested change
let bins_last_rtt = (rtt_nanos / bin_nanos) as usize;
let bins_last_rtt = usize::try_from(rtt_nanos / bin_nanos).unwrap_or(usize::MAX);

return None;
}
let diff = prev_sent.saturating_sub(curr_delv);
let norm_diff = diff * Self::SCALE / prev_sent;
Contributor

Warning

diff * Self::SCALE can overflow usize on 32-bit targets when diff exceeds ~42 MB (usize::MAX / 100). This is reachable during slow start with large BDPs. Use saturating_mul for consistency with the rest of the arithmetic in this file:

Suggested change
let norm_diff = diff * Self::SCALE / prev_sent;
let norm_diff = diff.saturating_mul(Self::SCALE) / prev_sent;

Collaborator

Would the saturating_mul limit growth for BDPs > ~42MB?

pn += 1;
bytes_this_round += bytes_this_round / 4;
}
}
Contributor

Tip

The exit-condition test is good. Consider adding a targeted unit test for the prev_sent == 0 early-return path (line 276 of search.rs) — e.g., trigger a SEARCH check window where no packets were sent in the lookback range. Also, a test verifying that compute_sent interpolation produces the correct weighted value for a known fraction would increase confidence in the interpolation math independently of the end-to-end exit test.

Comment on lines +94 to +100
if self.bin_duration.is_zero() {
debug_assert!(
false,
"bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
);
return;
}
Contributor

Note

When initialize returns early here (release build, zero/tiny RTT), curr_idx stays None and SEARCH silently retries initialization on every subsequent ACK until a non-zero bin_duration results. This is safe but wasteful. Consider setting a flag to avoid the repeated retry, or logging a one-time warning so operators can diagnose why SEARCH never activates.

Suggested change
if self.bin_duration.is_zero() {
debug_assert!(
false,
"bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
);
return;
}
if self.bin_duration.is_zero() {
debug_assert!(
false,
"bin_duration must be non-zero for correctness and to guard against div by zero -- initial_rtt was zero or too small"
);
// Mark as permanently disabled to avoid retrying on every ACK.
self.bin_end = Some(now);
return;
}

(With this, the now <= bin_end check on line 240 would early-return on subsequent ACKs without re-entering initialize. Alternative: use a dedicated disabled: bool field if you prefer clarity over reusing bin_end.)

@larseggert
Collaborator

@omansfeld there are a lot of QNS timeouts that other recent PRs such as #3586 (comment) do not have. That is suspicious.

- for byte/index calculations
- also adds a helper for `prev_idx` calculation
- fixes existing should_panic test
Contributor

@github-actions github-actions Bot left a comment

Clean implementation of SEARCH 4.0 (draft-09). The SlowStart trait extension is minimal and well-designed (default no-op for on_packet_sent, new parameters plumbed cleanly). Spec divergences (no drain phase, missed-bin reset, interpolation direction fix) are well-documented with rationale and citations.

The new test suite covers the core paths: initialization, bin advancement, skipped-bin propagation, reset/reinit, cumulative byte tracking, prev_idx/fraction calculation, and the end-to-end exit condition. The #[should_panic] test is correctly gated with #[cfg(debug_assertions)].

Main concern: the 32-bit arithmetic overflow in norm_diff and compute_sent — see inline comments for why saturating_mul (previously suggested) is not a safe fix and would cause false slow-start exits on 32-bit targets. Widening to u64 is the right approach.

return None;
}
let diff = prev_sent.saturating_sub(curr_delv);
let norm_diff = diff * Self::SCALE / prev_sent;
Contributor

Caution

Re @larseggert's open question on the prior thread: saturating_mul is not safe here. When diff * SCALE saturates to usize::MAX on 32-bit, norm_diff = usize::MAX / prev_sent yields a huge value (e.g., ~95 for a 43 MB prev_sent), far exceeding THRESH=26. This causes a false slow-start exit, not limited growth.

Since neqo targets 32-bit Android (Firefox ARMv7), widen to u64 instead — the result is bounded by SCALE (100), so the final truncation is always safe:

Suggested change
let norm_diff = diff * Self::SCALE / prev_sent;
let norm_diff = (diff as u64 * Self::SCALE as u64 / prev_sent as u64) as usize;

Comment on lines +192 to +206
const fn compute_sent(&self, old: usize, new: usize, fraction: usize) -> usize {
// NOTE: SEARCH draft-09 does forward interpolation here, i.e. `new + 1`/`old + 1`. That is
// a mistake in the draft and has been discussed with the SEARCH team. Subtracting
// is correct.
let mut sent = (self.sent_bins[(new - 1) % Self::NUM_SENT_BINS]
.saturating_sub(self.sent_bins[(old - 1) % Self::NUM_SENT_BINS]))
.saturating_mul(fraction);

sent = sent.saturating_add(
(self.sent_bins[new % Self::NUM_SENT_BINS]
.saturating_sub(self.sent_bins[old % Self::NUM_SENT_BINS]))
.saturating_mul(Self::SCALE - fraction),
);
sent / Self::SCALE
}
Contributor

Warning

Same class of 32-bit overflow as the norm_diff computation. compute_sent returns sent / Self::SCALE, and if the intermediate products saturated to usize::MAX, the result is usize::MAX / 100 ≈ 42.9M on 32-bit — which inflates prev_sent and can trigger a false slow-start exit downstream.

Consider widening the interpolation to u64 as well:

Suggested change
const fn compute_sent(&self, old: usize, new: usize, fraction: usize) -> usize {
// NOTE: SEARCH draft-09 does forward interpolation here, i.e. `new + 1`/`old + 1`. That is
// a mistake in the draft and has been discussed with the SEARCH team. Subtracting
// is correct.
let mut sent = (self.sent_bins[(new - 1) % Self::NUM_SENT_BINS]
.saturating_sub(self.sent_bins[(old - 1) % Self::NUM_SENT_BINS]))
.saturating_mul(fraction);
sent = sent.saturating_add(
(self.sent_bins[new % Self::NUM_SENT_BINS]
.saturating_sub(self.sent_bins[old % Self::NUM_SENT_BINS]))
.saturating_mul(Self::SCALE - fraction),
);
sent / Self::SCALE
}
const fn compute_sent(&self, old: usize, new: usize, fraction: usize) -> usize {
// NOTE: SEARCH draft-09 does forward interpolation here, i.e. `new + 1`/`old + 1`. That is
// a mistake in the draft and has been discussed with the SEARCH team. Subtracting
// is correct.
let a = (self.sent_bins[(new - 1) % Self::NUM_SENT_BINS]
.saturating_sub(self.sent_bins[(old - 1) % Self::NUM_SENT_BINS]))
as u64
* fraction as u64;
let b = (self.sent_bins[new % Self::NUM_SENT_BINS]
.saturating_sub(self.sent_bins[old % Self::NUM_SENT_BINS]))
as u64
* (Self::SCALE - fraction) as u64;
(a.saturating_add(b) / Self::SCALE as u64) as usize
}

pn += 1;
bytes_this_round += bytes_this_round / 4;
}
}
Contributor

Note

The curr_idx - prev_idx >= EXTRA_BINS guard on search.rs:294 (which handles excessive RTT inflation making the lookback impossible) has no test coverage. Consider adding a test that simulates significant RTT inflation (e.g., initial_rtt=100ms then rtt=600ms, giving bins_last_rtt ≈ 17 > EXTRA_BINS=15) and verifies SEARCH returns None rather than using stale data.
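The scenario suggested above can be sketched as a standalone check of the guard logic. Constants and the 100ms/600ms numbers follow the review text; the function name and shape are illustrative assumptions, not the actual neqo code.

```rust
use std::time::Duration;

// Per the review: lookbacks farther than EXTRA_BINS bins are impossible.
const EXTRA_BINS: usize = 15;

/// Returns None when the RTT-inflated lookback exceeds the buffered
/// history, mirroring the `curr_idx - prev_idx >= EXTRA_BINS` bail-out.
fn lookback_bins(rtt: Duration, bin_duration: Duration) -> Option<usize> {
    // Same defensive cast style as `update_bins` in the review discussion.
    let bins_last_rtt =
        usize::try_from(rtt.as_nanos() / bin_duration.as_nanos()).unwrap_or(usize::MAX);
    if bins_last_rtt >= EXTRA_BINS {
        return None; // stale data: skip this SEARCH check
    }
    Some(bins_last_rtt)
}

fn main() {
    // bin_duration = 35 ms, as derived from initial_rtt = 100 ms (factor 3.5/10).
    let bin = Duration::from_millis(35);
    // Normal RTT: 100 / 35 = 2 bins of lookback, well within the buffer.
    assert_eq!(lookback_bins(Duration::from_millis(100), bin), Some(2));
    // Inflated RTT: 600 / 35 = 17 >= 15, lookback impossible -> None.
    assert_eq!(lookback_bins(Duration::from_millis(600), bin), None);
    println!("ok");
}
```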

@github-actions
Contributor

github-actions Bot commented May 4, 2026

Benchmark results

Significant performance differences relative to 8c18132.

transfer/1-conn/1-100mb-resp (aka. Download): 💔 Performance has regressed by +2.9155%.
       time:   [206.48 ms 206.78 ms 207.12 ms]
       thrpt:  [482.82 MiB/s 483.61 MiB/s 484.32 MiB/s]
change:
       time:   [+2.6759% +2.9155% +3.1529%] (p = 0.00 < 0.05)
       thrpt:  [-3.0566% -2.8329% -2.6062%]
       Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severe
transfer/1-conn/1-100mb-req (aka. Upload): 💔 Performance has regressed by +3.1347%.
       time:   [208.20 ms 208.61 ms 209.03 ms]
       thrpt:  [478.39 MiB/s 479.36 MiB/s 480.31 MiB/s]
change:
       time:   [+2.8432% +3.1347% +3.4175%] (p = 0.00 < 0.05)
       thrpt:  [-3.3046% -3.0394% -2.7646%]
       Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
streams/walltime/1000-streams/each-1000-bytes: 💚 Performance has improved by -1.4408%.
       time:   [44.793 ms 44.837 ms 44.882 ms]
       change: [-1.7250% -1.4408% -1.2257] (p = 0.00 < 0.05)
       Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
All results
transfer/1-conn/1-100mb-resp (aka. Download): 💔 Performance has regressed by +2.9155%.
       time:   [206.48 ms 206.78 ms 207.12 ms]
       thrpt:  [482.82 MiB/s 483.61 MiB/s 484.32 MiB/s]
change:
       time:   [+2.6759% +2.9155% +3.1529%] (p = 0.00 < 0.05)
       thrpt:  [-3.0566% -2.8329% -2.6062%]
       Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severe
transfer/1-conn/10_000-parallel-1b-resp (aka. RPS): No change in performance detected.
       time:   [284.33 ms 286.17 ms 288.04 ms]
       thrpt:  [34.718 Kelem/s 34.944 Kelem/s 35.171 Kelem/s]
change:
       time:   [-1.4719% -0.5999% +0.2562%] (p = 0.19 > 0.05)
       thrpt:  [-0.2555% +0.6036% +1.4939%]
       No change in performance detected.
transfer/1-conn/1-1b-resp (aka. HPS): No change in performance detected.
       time:   [38.667 ms 38.825 ms 39.002 ms]
       thrpt:  [25.639   B/s 25.757   B/s 25.862   B/s]
change:
       time:   [-1.2984% -0.5996% +0.0926%] (p = 0.10 > 0.05)
       thrpt:  [-0.0925% +0.6032% +1.3155%]
       No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) high mild
7 (7.00%) high severe
transfer/1-conn/1-100mb-req (aka. Upload): 💔 Performance has regressed by +3.1347%.
       time:   [208.20 ms 208.61 ms 209.03 ms]
       thrpt:  [478.39 MiB/s 479.36 MiB/s 480.31 MiB/s]
change:
       time:   [+2.8432% +3.1347% +3.4175%] (p = 0.00 < 0.05)
       thrpt:  [-3.3046% -3.0394% -2.7646%]
       Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
streams/walltime/1-streams/each-1000-bytes: Change within noise threshold.
       time:   [592.80 µs 594.60 µs 596.72 µs]
       change: [+0.0155% +0.6357% +1.2321] (p = 0.04 < 0.05)
       Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
7 (7.00%) high severe
streams/walltime/1000-streams/each-1-bytes: Change within noise threshold.
       time:   [12.232 ms 12.253 ms 12.275 ms]
       change: [-0.8030% -0.5649% -0.3139] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
streams/walltime/1000-streams/each-1000-bytes: 💚 Performance has improved by -1.4408%.
       time:   [44.793 ms 44.837 ms 44.882 ms]
       change: [-1.7250% -1.4408% -1.2257] (p = 0.00 < 0.05)
       Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
transfer/walltime/pacing-false/varying-seeds: Change within noise threshold.
       time:   [23.362 ms 23.378 ms 23.393 ms]
       change: [-0.8000% -0.6899% -0.5768] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild
transfer/walltime/pacing-true/varying-seeds: Change within noise threshold.
       time:   [23.755 ms 23.770 ms 23.785 ms]
       change: [+0.2200% +0.3117% +0.4038] (p = 0.00 < 0.05)
       Change within noise threshold.
transfer/walltime/pacing-false/same-seed: Change within noise threshold.
       time:   [23.770 ms 23.787 ms 23.804 ms]
       change: [+0.9409% +1.0425% +1.1427] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
transfer/walltime/pacing-true/same-seed: Change within noise threshold.
       time:   [24.048 ms 24.074 ms 24.109 ms]
       change: [+0.0495% +0.1853% +0.3413] (p = 0.01 < 0.05)
       Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe

Download data for profiler.firefox.com or download performance comparison data.

@github-actions
Contributor

github-actions Bot commented May 4, 2026

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to main at faa1437.

neqo-pr as client / neqo-pr as server
neqo-pr vs. aioquic: BP BA
neqo-pr vs. go-x-net: BP BA
neqo-pr vs. haproxy: BP BA
neqo-pr vs. kwik: 🚀L1 C1 BP BA
neqo-pr vs. linuxquic: L1 BP BA CM
neqo-pr vs. lsquic: L1 C1 BP BA CM
neqo-pr vs. msquic: A L1 C1 BP BA
neqo-pr vs. mvfst: A BP BA
neqo-pr vs. neqo: A BP BA CM
neqo-pr vs. nginx: BP BA
neqo-pr vs. ngtcp2: E BP BA CM
neqo-pr vs. picoquic: A BP BA CM
neqo-pr vs. quic-go: A BP BA
neqo-pr vs. quic-zig: BP BA CM
neqo-pr vs. quiche: BP BA
neqo-pr vs. quinn: BP BA
neqo-pr vs. s2n-quic: BP BA CM
neqo-pr vs. tquic: S BP BA
neqo-pr vs. xquic: A 🚀C1 BP BA
aioquic vs. neqo-pr: 🚀L1 ⚠️C1 BP BA CM
go-x-net vs. neqo-pr: BP BA CM
kwik vs. neqo-pr: BP BA CM
linuxquic vs. neqo-pr: run cancelled after 20 min
lsquic vs. neqo-pr: ⚠️C1 BP BA CM
msquic vs. neqo-pr: BP BA CM
mvfst vs. neqo-pr: run cancelled after 20 min
neqo vs. neqo-pr: A BP BA CM
ngtcp2 vs. neqo-pr: E BP BA CM
openssl vs. neqo-pr: LR M A BP BA CM
quic-go vs. neqo-pr: BP BA CM
quic-zig vs. neqo-pr: ⚠️DC
quiche vs. neqo-pr: BP BA CM
quinn vs. neqo-pr: V2 CM
s2n-quic vs. neqo-pr: 🚀BP CM
tquic vs. neqo-pr: CM
xquic vs. neqo-pr: M CM
All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-pr as client

neqo-pr as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-pr as client

neqo-pr as server

@github-actions
Contributor

github-actions Bot commented May 4, 2026

Client/server transfer results

Performance differences relative to 8c18132.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Client vs. server (params) Mean ± σ Min Max MiB/s ± σ Δ baseline Δ baseline
neqo-msquic-cubic 150.6 ± 4.3 144.6 177.0 212.4 ± 7.4 💔 1.4 0.9%
neqo-neqo-cubic 89.8 ± 2.3 85.5 95.3 356.3 ± 13.9 💚 -0.8 -0.9%
neqo-neqo-cubic-nopacing 88.8 ± 2.6 84.6 95.9 360.2 ± 12.3 💚 -0.7 -0.8%
quiche-neqo-cubic 171.8 ± 3.3 165.0 181.0 186.3 ± 9.7 💔 2.6 1.6%

Table above only shows statistically significant changes. See all results below.

All results

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Client vs. server (params) Mean ± σ Min Max MiB/s ± σ Δ baseline Δ baseline
google-google-nopacing 457.6 ± 1.6 454.8 464.3 69.9 ± 20.0
google-neqo-cubic 267.8 ± 2.9 264.5 287.4 119.5 ± 11.0 -0.4 -0.1%
msquic-msquic-nopacing 128.2 ± 21.9 110.5 202.7 249.6 ± 1.5
msquic-neqo-cubic 149.1 ± 26.4 118.8 330.0 214.6 ± 1.2 4.3 3.0%
neqo-google-cubic 769.5 ± 2.2 763.3 774.5 41.6 ± 14.5 -0.0 -0.0%
neqo-msquic-cubic 150.6 ± 4.3 144.6 177.0 212.4 ± 7.4 💔 1.4 0.9%
neqo-neqo-cubic 89.8 ± 2.3 85.5 95.3 356.3 ± 13.9 💚 -0.8 -0.9%
neqo-neqo-cubic-nopacing 88.8 ± 2.6 84.6 95.9 360.2 ± 12.3 💚 -0.7 -0.8%
neqo-neqo-newreno 90.6 ± 2.4 85.2 96.1 353.1 ± 13.3 -0.6 -0.6%
neqo-neqo-newreno-nopacing 89.6 ± 3.5 83.5 112.0 357.1 ± 9.1 -0.7 -0.8%
neqo-quiche-cubic 188.1 ± 2.3 183.9 195.4 170.1 ± 13.9 -0.4 -0.2%
neqo-s2n-cubic 215.1 ± 2.3 209.6 229.8 148.8 ± 13.9 -0.1 -0.1%
quiche-neqo-cubic 171.8 ± 3.3 165.0 181.0 186.3 ± 9.7 💔 2.6 1.6%
quiche-quiche-nopacing 138.9 ± 3.6 133.8 161.8 230.4 ± 8.9
s2n-neqo-cubic 213.8 ± 3.4 207.7 226.2 149.7 ± 9.4 -0.8 -0.4%
s2n-s2n-nopacing 292.5 ± 25.8 267.7 389.1 109.4 ± 1.2

Download data for profiler.firefox.com or download performance comparison data.

@omansfeld
Collaborator Author

The potential usize overflows on 32-bit targets when multiplying by Self::SCALE in compute_sent and when calculating norm_diff are a real concern. My flight to Toronto is almost over, so I'll look into that with a clear head again in the coming days.

Probably the easiest solution is to widen intermediate results to u64 when multiplying by Self::SCALE.


3 participants