
feat: Trigger migration based on UDP send errors #3485

Draft
larseggert wants to merge 1 commit into mozilla:main from larseggert:feat-migration

@larseggert
Collaborator

When sending indicates that the current path may not be working, trigger migration to a new path.

Copilot AI review requested due to automatic review settings March 17, 2026 08:07
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a mechanism for reacting to UDP send failures that indicate a path is no longer usable, by scheduling path re-probing and (on clients) immediately failing over to an alternate valid path when available.

Changes:

  • Add neqo_udp::is_network_error() to classify transient network-level send failures.
  • Add Connection::on_path_unavailable()/Paths::on_path_unavailable() and new path state handling to trigger re-probing and client fallback behavior.
  • Wire send-error handling into neqo-bin client and add transport-level migration tests covering fallback/re-probe/close behavior.
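As a rough illustration of the first bullet, a send-error classifier could look like the sketch below. The function name `is_network_error_sketch` and the exact set of `io::ErrorKind` variants are assumptions made for this example; the actual `neqo_udp::is_network_error()` in the PR may check different conditions.

```rust
use std::io;

/// Hypothetical classifier (not the PR's implementation): returns `true`
/// for send errors that suggest the local network path is unusable,
/// as opposed to errors such as `WouldBlock` that only mean "retry later".
pub fn is_network_error_sketch(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        // The set of kinds below is an assumption chosen for illustration.
        io::ErrorKind::NetworkUnreachable
            | io::ErrorKind::HostUnreachable
            | io::ErrorKind::NetworkDown
            | io::ErrorKind::AddrNotAvailable
    )
}
```

A caller would typically consult such a helper only after a failed send, and treat every other error kind as before.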

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| neqo-udp/src/lib.rs | Adds is_network_error() helper and tests to classify network send errors. |
| neqo-transport/src/path.rs | Adds path/paths hooks (mark_failed_send, on_path_unavailable) to schedule probes and fallback to another path. |
| neqo-transport/src/connection/tests/migration.rs | Adds new tests for send-error-triggered fallback, probe exhaustion, and server behavior. |
| neqo-transport/src/connection/mod.rs | Exposes Connection::on_path_unavailable() to integrate send failures with path management and recovery. |
| neqo-http3/src/connection_client.rs | Plumbs on_path_unavailable() through the HTTP/3 client wrapper. |
| neqo-bin/src/server/mod.rs | Treats network send errors as non-fatal for the server runner send loop. |
| neqo-bin/src/client/mod.rs | Detects network send errors and notifies the QUIC engine via on_path_unavailable(). |
| neqo-bin/src/client/http3.rs | Implements the new client trait method by forwarding to the HTTP/3 client. |
| neqo-bin/src/client/http09.rs | Implements the new client trait method by forwarding to the transport connection. |
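The fallback behavior summarized above (schedule a re-probe of the failed path and, on clients only, switch to another valid path if one exists) can be modeled with a toy state machine. Everything in this sketch (`Role`, `PathState`, `Outcome`, `Paths`, and the index-based API) is hypothetical and much simpler than the real `neqo-transport` types; probe exhaustion and connection close are not modeled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Client,
    Server,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PathState {
    Valid,
    NeedsProbe,
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// The primary path was switched to the path at this index.
    FellBackTo(usize),
    /// No switch happened; the failed path is merely queued for re-probing.
    ProbeScheduled,
    /// The index did not refer to a known path.
    UnknownPath,
}

/// Toy stand-in for a connection's path set (hypothetical, for illustration).
struct Paths {
    role: Role,
    states: Vec<PathState>,
    primary: usize,
}

impl Paths {
    /// Create `n` paths, all valid, with path 0 as the primary.
    fn new(role: Role, n: usize) -> Self {
        Self { role, states: vec![PathState::Valid; n], primary: 0 }
    }

    /// React to a send failure on path `failed`.
    fn on_path_unavailable(&mut self, failed: usize) -> Outcome {
        match self.states.get_mut(failed) {
            Some(state) => *state = PathState::NeedsProbe,
            None => return Outcome::UnknownPath,
        }
        // Only a client fails over, and only when its primary path failed.
        if self.role == Role::Client && failed == self.primary {
            // The failed path is no longer `Valid`, so this finds another path.
            if let Some(alt) = self.states.iter().position(|s| *s == PathState::Valid) {
                self.primary = alt;
                return Outcome::FellBackTo(alt);
            }
        }
        Outcome::ProbeScheduled
    }
}
```

In this model a client created with `Paths::new(Role::Client, 1)` can never fall back, because no second path exists; with two paths, a failure on path 0 moves the primary to path 1.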

Comment thread neqo-transport/src/path.rs
Comment thread neqo-udp/src/lib.rs Outdated
@codecov

codecov Bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 93.61702% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.26%. Comparing base (07f7838) to head (59944ce).
⚠️ Report is 25 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3485      +/-   ##
==========================================
- Coverage   94.30%   94.26%   -0.04%     
==========================================
  Files         127      131       +4     
  Lines       38711    39252     +541     
  Branches    38711    39252     +541     
==========================================
+ Hits        36505    37000     +495     
- Misses       1365     1403      +38     
- Partials      841      849       +8     
| Flag | Coverage Δ |
| --- | --- |
| freebsd | 93.24% <87.50%> (-0.07%) ⬇️ |
| linux | 94.32% <90.62%> (+0.04%) ⬆️ |
| macos | 94.24% <90.62%> (+0.05%) ⬆️ |
| windows | 94.30% <93.33%> (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
| --- | --- |
| neqo-common | 98.50% <ø> (+<0.01%) ⬆️ |
| neqo-crypto | 86.90% <ø> (ø) |
| neqo-http3 | 93.93% <0.00%> (+0.01%) ⬆️ |
| neqo-qpack | 95.02% <ø> (+0.21%) ⬆️ |
| neqo-transport | 95.30% <93.75%> (+0.06%) ⬆️ |
| neqo-udp | 86.05% <100.00%> (+3.07%) ⬆️ |
| mtu | 86.61% <ø> (ø) |

@larseggert larseggert force-pushed the feat-migration branch 2 times, most recently from cf653ed to 21d43d7 Compare March 17, 2026 09:23
@mxinden
Member

mxinden commented Mar 30, 2026

On a high level:

I assume the main motivation here is Firefox on Android switching between cellular and wifi, is that correct @larseggert?

In general, I am in favor of supporting QUIC path migration.

Before we merge here, I would like to better understand the larger plan. As far as I can tell there are multiple missing pieces to achieve the larger goal of path migration in Firefox.

  • This pull request, i.e. reacting to various UDP send errors as signals
  • Discovery of alternative paths. As far as I am aware today Neqo in Firefox would only ever have one path available. E.g. we never call neqo_transport::Connection::migrate in Firefox, thus Neqo never learns of a second client IP, thus never has more than one path. (Ignoring server preferred addresses for now.)
  • Some measure of success. Do we know this is a problem on Firefox today? How would we know things improved?

Before better understanding the larger project, I don't think merging here makes sense. It introduces complexity with no gain (today) in Firefox.

When sending indicates that the current path may not be working, trigger migration to a new path.
@larseggert
Collaborator Author

So this is mostly plumbing, which as you say would require neqo-glue changes to take effect in Gecko. I was thinking we can experiment with the demo client and server here?

@mxinden
Member

mxinden commented Mar 30, 2026

> I was thinking we can experiment with the demo client and server here?

But then the first step would be providing new client addresses to Neqo, right? Otherwise there would never be any two paths to choose from in the first place.

@github-actions
Contributor

Benchmark results

Significant performance differences relative to 64a36e2.

transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: 💚 Performance has improved by -1.9440%.
       time:   [198.59 ms 199.05 ms 199.61 ms]
       thrpt:  [500.99 MiB/s 502.39 MiB/s 503.56 MiB/s]
change:
       time:   [-2.2712% -1.9440% -1.6026] (p = 0.00 < 0.05)
       thrpt:  [+1.6287% +1.9825% +2.3240]
       Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
All results
transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: 💚 Performance has improved by -1.9440%.
       time:   [198.59 ms 199.05 ms 199.61 ms]
       thrpt:  [500.99 MiB/s 502.39 MiB/s 503.56 MiB/s]
change:
       time:   [-2.2712% -1.9440% -1.6026] (p = 0.00 < 0.05)
       thrpt:  [+1.6287% +1.9825% +2.3240]
       Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
transfer/1-conn/10_000-parallel-1b-resp (aka. RPS)/mtu-1504: Change within noise threshold.
       time:   [281.24 ms 283.18 ms 285.11 ms]
       thrpt:  [35.074 Kelem/s 35.314 Kelem/s 35.556 Kelem/s]
change:
       time:   [-2.0875% -1.1559% -0.1608] (p = 0.02 < 0.05)
       thrpt:  [+0.1611% +1.1694% +2.1320]
       Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/1-conn/1-1b-resp (aka. HPS)/mtu-1504: No change in performance detected.
       time:   [38.478 ms 38.636 ms 38.814 ms]
       thrpt:  [25.764   B/s 25.883   B/s 25.989   B/s]
change:
       time:   [-0.1172% +0.4624% +1.0079] (p = 0.12 > 0.05)
       thrpt:  [-0.9978% -0.4602% +0.1174]
       No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) high mild
6 (6.00%) high severe
transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: Change within noise threshold.
       time:   [205.20 ms 205.55 ms 205.91 ms]
       thrpt:  [485.64 MiB/s 486.50 MiB/s 487.32 MiB/s]
change:
       time:   [+0.0127% +0.2523% +0.4882] (p = 0.05 < 0.05)
       thrpt:  [-0.4858% -0.2516% -0.0127]
       Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
streams/walltime/1-streams/each-1000-bytes: No change in performance detected.
       time:   [586.42 µs 588.29 µs 590.50 µs]
       change: [-0.4090% +0.0993% +0.5818] (p = 0.72 > 0.05)
       No change in performance detected.
Found 17 outliers among 100 measurements (17.00%)
11 (11.00%) high mild
6 (6.00%) high severe
streams/walltime/1000-streams/each-1-bytes: No change in performance detected.
       time:   [12.349 ms 12.368 ms 12.388 ms]
       change: [-0.2192% +0.0221% +0.2441] (p = 0.86 > 0.05)
       No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
streams/walltime/1000-streams/each-1000-bytes: Change within noise threshold.
       time:   [44.878 ms 44.963 ms 45.082 ms]
       change: [+0.5644% +0.7884% +1.1189] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-false/varying-seeds: No change in performance detected.
       time:   [80.846 ms 80.909 ms 80.979 ms]
       change: [-0.2376% -0.0333% +0.1200] (p = 0.75 > 0.05)
       No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-true/varying-seeds: Change within noise threshold.
       time:   [82.336 ms 82.428 ms 82.542 ms]
       change: [+0.5549% +0.7720% +0.9516] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
transfer/walltime/pacing-false/same-seed: No change in performance detected.
       time:   [80.732 ms 80.861 ms 81.018 ms]
       change: [-0.2691% +0.0290% +0.3154] (p = 0.84 > 0.05)
       No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) high mild
7 (7.00%) high severe
transfer/walltime/pacing-true/same-seed: Change within noise threshold.
       time:   [81.896 ms 81.977 ms 82.074 ms]
       change: [-0.8414% -0.6985% -0.5529] (p = 0.00 < 0.05)
       Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe

Download data for profiler.firefox.com or download performance comparison data.

@github-actions
Contributor

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to main at 64a36e2.

neqo-pr as client ("neqo-pr vs. X" rows), then neqo-pr as server ("X vs. neqo-pr" rows):
neqo-pr vs. go-x-net: BP BA
neqo-pr vs. haproxy: BP BA
neqo-pr vs. kwik: S L1 C1 BP BA
neqo-pr vs. linuxquic: run cancelled after 20 min
neqo-pr vs. lsquic: run cancelled after 20 min
neqo-pr vs. msquic: A L1 C1
neqo-pr vs. mvfst: H DC LR M R Z 3 B U A L1 L2 C1 C2 6 BP BA
neqo-pr vs. neqo: Z A
neqo-pr vs. nginx: BP BA
neqo-pr vs. ngtcp2: CM
neqo-pr vs. picoquic: A ⚠️BP
neqo-pr vs. quic-go: A
neqo-pr vs. quiche: BP BA
neqo-pr vs. s2n-quic: 🚀BA CM
neqo-pr vs. tquic: S BP BA
neqo-pr vs. xquic: A L1 🚀C1
aioquic vs. neqo-pr: Z CM
go-x-net vs. neqo-pr: CM
kwik vs. neqo-pr: Z BP BA CM
lsquic vs. neqo-pr: Z ⚠️C1
msquic vs. neqo-pr: Z 🚀BA ⚠️BP CM
mvfst vs. neqo-pr: Z A L1 C1 CM
neqo vs. neqo-pr: Z A
openssl vs. neqo-pr: LR M A 🚀BA CM
picoquic vs. neqo-pr: Z
quic-go vs. neqo-pr: ⚠️BA CM
quiche vs. neqo-pr: Z CM
quinn vs. neqo-pr: Z 🚀L1 V2 CM
s2n-quic vs. neqo-pr: ⚠️BP BA CM
tquic vs. neqo-pr: Z CM
xquic vs. neqo-pr: M CM

@github-actions
Contributor

Client/server transfer results

Performance differences relative to 64a36e2.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ baseline | Δ baseline |
| --- | --- | --- | --- | --- | --- | --- |
| neqo-neqo-cubic | 94.4 ± 3.8 | 87.8 | 103.6 | 339.1 ± 8.4 | 💚 -2.7 | -2.8% |
| neqo-neqo-newreno | 96.6 ± 4.1 | 88.4 | 105.3 | 331.2 ± 7.8 | 💔 1.3 | 1.4% |
| neqo-neqo-newreno-nopacing | 97.0 ± 4.2 | 87.7 | 105.0 | 330.0 ± 7.6 | 💔 1.9 | 2.0% |
| neqo-quiche-cubic | 192.8 ± 4.9 | 185.7 | 205.9 | 165.9 ± 6.5 | 💔 1.5 | 0.8% |

Table above only shows statistically significant changes. See all results below.

All results

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ baseline | Δ baseline |
| --- | --- | --- | --- | --- | --- | --- |
| google-google-nopacing | 448.3 ± 4.3 | 426.9 | 461.4 | 71.4 ± 7.4 | | |
| google-neqo-cubic | 272.0 ± 4.5 | 262.0 | 287.2 | 117.6 ± 7.1 | 1.2 | 0.5% |
| msquic-msquic-nopacing | 185.0 ± 90.8 | 117.6 | 605.7 | 173.0 ± 0.4 | | |
| msquic-neqo-cubic | 200.0 ± 78.0 | 134.3 | 422.2 | 160.0 ± 0.4 | -12.6 | -5.9% |
| neqo-google-cubic | 747.6 ± 8.4 | 678.8 | 761.2 | 42.8 ± 3.8 | 0.2 | 0.0% |
| neqo-msquic-cubic | 153.7 ± 4.7 | 147.6 | 163.6 | 208.2 ± 6.8 | 0.2 | 0.1% |
| neqo-neqo-cubic | 94.4 ± 3.8 | 87.8 | 103.6 | 339.1 ± 8.4 | 💚 -2.7 | -2.8% |
| neqo-neqo-cubic-nopacing | 95.9 ± 4.1 | 86.9 | 104.9 | 333.8 ± 7.8 | 0.5 | 0.5% |
| neqo-neqo-newreno | 96.6 ± 4.1 | 88.4 | 105.3 | 331.2 ± 7.8 | 💔 1.3 | 1.4% |
| neqo-neqo-newreno-nopacing | 97.0 ± 4.2 | 87.7 | 105.0 | 330.0 ± 7.6 | 💔 1.9 | 2.0% |
| neqo-quiche-cubic | 192.8 ± 4.9 | 185.7 | 205.9 | 165.9 ± 6.5 | 💔 1.5 | 0.8% |
| neqo-s2n-cubic | 218.2 ± 4.9 | 210.7 | 242.1 | 146.6 ± 6.5 | -0.4 | -0.2% |
| quiche-neqo-cubic | 179.7 ± 5.0 | 169.6 | 192.3 | 178.1 ± 6.4 | -0.7 | -0.4% |
| quiche-quiche-nopacing | 142.9 ± 5.0 | 136.4 | 155.5 | 223.9 ± 6.4 | | |
| s2n-neqo-cubic | 220.1 ± 4.5 | 209.7 | 229.8 | 145.4 ± 7.1 | 0.7 | 0.3% |
| s2n-s2n-nopacing | 295.8 ± 24.1 | 281.7 | 394.0 | 108.2 ± 1.3 | | |

Download data for profiler.firefox.com or download performance comparison data.

@larseggert
Collaborator Author

Yes, this needs more pieces before being useful. Would you prefer to expand the PR before landing it?

@mxinden
Member

mxinden commented Mar 30, 2026

> Yes, this needs more pieces before being useful. Would you prefer to expand the PR before landing it?

Yes. I don't think there is value in merging this pull request at this point in time, only the risk of dead code on the main branch in case we never get around to adding the other missing pieces.

@larseggert larseggert marked this pull request as draft March 30, 2026 14:00