fix: Handle ENOBUFS in UDP send #3482
Conversation
Pull request overview
Updates neqo-udp’s UDP send error handling to avoid treating ENOBUFS as a fatal send error, since ENOBUFS can signal transient NIC/stack buffer exhaustion on some platforms.
Changes:
- Treat `ENOBUFS` from `UdpSocketState::try_send()` as `io::ErrorKind::WouldBlock` instead of a hard error.
- Add `is_enobufs()` helper (unix-gated) and update unit tests to account for the new predicate.
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3482      +/-   ##
==========================================
- Coverage   94.30%   94.19%    -0.11%
==========================================
  Files         127      131        +4
  Lines       38837    39178      +341
  Branches    38837    39178      +341
==========================================
+ Hits        36626    36905      +279
- Misses       1376     1428       +52
- Partials      835      845       +10

Flags with carried forward coverage won't be shown.
Force-pushed from 04de82b to 345f3cf
Based on a discussion with @ianswett yesterday, don't treat `ENOBUFS` as a fatal error when sending UDP packets.
No objections!
@mxinden quinn-rs/quinn#2293 got closed - what now?
@larseggert let's try to get to the root of the discussion. I prepared quinn-rs/quinn#2594. Can you help out?
Claude finished @larseggert's task in 2m 3s. Review of PR #3482
Summary
Maps ENOBUFS (and WSAENOBUFS on Windows) to WouldBlock during UDP send, so the existing retry loops in both client and server correctly wait for writability and retry rather than treating it as fatal. Also cleans up is_emsgsize by switching from map_or to is_some_and, removing the clippy suppression.
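A minimal sketch of that remap, unix-only for brevity (the PR also matches WSAENOBUFS on Windows; the function name here is illustrative, not the diff's exact code):

use std::io;

// Illustrative only: report ENOBUFS as WouldBlock so callers treat
// transient buffer exhaustion like a full send queue rather than a
// fatal send error.
fn remap_enobufs(e: io::Error) -> io::Error {
    #[cfg(unix)]
    if e.raw_os_error() == Some(libc::ENOBUFS) {
        return io::ErrorKind::WouldBlock.into();
    }
    e
}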
The approach is sound — I verified that callers (neqo-bin/src/client/mod.rs:484, neqo-bin/src/server/mod.rs:325) handle WouldBlock with a writable().await + retry loop, which is exactly the right recovery for a transient send queue exhaustion. The try_io wrapper in neqo-bin/src/udp.rs:93 also correctly propagates the WouldBlock through tokio's readiness machinery.
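For concreteness, a self-contained sketch of that recovery pattern (the tokio calls are real; `send_datagram` is a hypothetical stand-in for the wrapped `UdpSocketState::try_send()` call):

use std::io;
use tokio::{io::Interest, net::UdpSocket};

// Caller-side recovery: on WouldBlock (which now also covers the
// remapped ENOBUFS), wait for writability and retry the send.
async fn send_with_retry(
    socket: &UdpSocket,
    mut send_datagram: impl FnMut() -> io::Result<()>,
) -> io::Result<()> {
    loop {
        // try_io clears tokio's cached WRITABLE readiness when the
        // closure returns WouldBlock, so writable().await below truly
        // parks until the kernel drains the buffer (no busy-loop).
        match socket.try_io(Interest::WRITABLE, &mut send_datagram) {
            Ok(()) => return Ok(()),
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
                socket.writable().await?;
            }
            Err(e) => return Err(e),
        }
    }
}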
Notes
- Comment/code scope mismatch: The inline comment says "On macOS/BSD" but the predicate fires on all unix + Windows. See inline suggestion and the sketch after this list.
- Windows test gap: `is_enobufs` on Windows (WSAENOBUFS) has no test coverage. See inline suggestion.
- Upstream dedup: Agree with @mxinden that this could live in `quinn-udp` to benefit all consumers. If quinn-rs/quinn#2594 lands, this local workaround can be removed. Worth tracking.
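For reference, a scope-accurate approximation of the predicate (body assumed from the review context, not quoted from the diff; the Windows path uses the windows crate that the suggested test below implies):

use std::io;

// Approximation of is_enobufs(): fires on all unix targets and on
// Windows (WSAENOBUFS), not just macOS/BSD as the inline comment says.
fn is_enobufs(e: &io::Error) -> bool {
    #[cfg(unix)]
    return e.raw_os_error().is_some_and(|code| code == libc::ENOBUFS);
    #[cfg(windows)]
    return e
        .raw_os_error()
        .is_some_and(|code| code == windows::Win32::Networking::WinSock::WSAENOBUFS.0);
    #[cfg(not(any(unix, windows)))]
    return false;
}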
On previous review comments
- @copilot re: losing the original error (line 90): Agreed in principle, but callers only inspect `e.kind()` and the original is captured by `qdebug!`. Not blocking.
- @copilot re: missing `send_inner` integration test (line 91): Triggering ENOBUFS in a test environment is impractical without mocking the socket layer. The unit tests on the predicates are sufficient.
- @copilot re: EAGAIN coverage dropped (line 384): This is addressed — the combined `is_emsgsize_and_is_enobufs_are_disjoint` test at line 377 asserts both predicates return `false` for EAGAIN; a sketch of such a test follows this list.
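Reconstructed sketch of what that disjointness test plausibly asserts (the actual body at line 377 isn't quoted in this thread):

use std::io;

#[test]
#[cfg(unix)]
fn is_emsgsize_and_is_enobufs_are_disjoint() {
    // Each predicate matches only its own errno; EAGAIN matches
    // neither, preserving the coverage the earlier review asked about.
    let emsgsize = io::Error::from_raw_os_error(libc::EMSGSIZE);
    let enobufs = io::Error::from_raw_os_error(libc::ENOBUFS);
    let eagain = io::Error::from_raw_os_error(libc::EAGAIN);
    assert!(is_emsgsize(&emsgsize) && !is_enobufs(&emsgsize));
    assert!(is_enobufs(&enobufs) && !is_emsgsize(&enobufs));
    assert!(!is_emsgsize(&eagain) && !is_enobufs(&eagain));
}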
        #[cfg(not(any(unix, windows)))]
        return false;
    })
}
Note
Missing Windows test coverage for is_enobufs. The existing Windows tests cover is_emsgsize variants (WSAEMSGSIZE, WSAEINVAL, WSAEWOULDBLOCK) but don't verify WSAENOBUFS. Consider adding:
#[test]
#[cfg(windows)]
fn is_enobufs_true_for_wsaenobufs() {
    let err = io::Error::from_raw_os_error(WinSock::WSAENOBUFS.0);
    assert!(is_enobufs(&err));
    assert!(!is_emsgsize(&err));
}
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: Lars Eggert <lars@eggert.org>
Claude finished @larseggert's task in 2m 44s. Reviewing PR #3482
Approach is sound. ENOBUFS → WouldBlock integrates correctly with the callers: both client/mod.rs:484 and server/mod.rs:325 retry in a writable().await loop, and critically, the try_io wrapper in neqo-bin/src/udp.rs:93 ensures tokio clears the socket's readiness flag on WouldBlock, preventing a busy-loop.
The is_emsgsize cleanup (is_some_and replacing map_or + clippy suppression) and the combined disjointness test are nice improvements.
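The shape of that cleanup, reconstructed (the predicate body here is assumed, not quoted from the diff):

use std::io;

// Before (needed a lint suppression):
//   e.raw_os_error().map_or(false, |code| code == libc::EMSGSIZE)
// After: is_some_and expresses the same check directly.
#[cfg(unix)]
fn is_emsgsize(e: &io::Error) -> bool {
    e.raw_os_error().is_some_and(|code| code == libc::EMSGSIZE)
}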
One inline comment for missing Windows test coverage. Previous reviewer comments about upstream dedup in quinn-udp (quinn-rs/quinn#2594) are worth tracking — if that lands, this local workaround can be removed.
Note
Windows WSAENOBUFS coverage gap — the unix test is_emsgsize_and_is_enobufs_are_disjoint verifies both predicates for EMSGSIZE, ENOBUFS, and EAGAIN, but there's no Windows-side equivalent for WSAENOBUFS. The existing Windows tests only cover is_emsgsize variants.
#[test]
#[cfg(windows)]
fn is_enobufs_true_for_wsaenobufs() {
    let err = io::Error::from_raw_os_error(WinSock::WSAENOBUFS.0);
    assert!(is_enobufs(&err));
    assert!(!is_emsgsize(&err));
}

#[test]
Failed / Succeeded / Unsupported Interop Tests (QUIC Interop Runner, client vs. server; neqo-pr as client and as server): results tables not shown.
Benchmark results

Significant performance differences relative to 93f8004.

transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: 💚 Performance has improved by -1.4275%.
  time:   [202.01 ms 202.39 ms 202.84 ms]
  thrpt:  [493.00 MiB/s 494.09 MiB/s 495.03 MiB/s]
  change:
    time:   [-1.7257% -1.4275% -1.1276%] (p = 0.00 < 0.05)
    thrpt:  [+1.1405% +1.4482% +1.7560%]
  Performance has improved.
  Found 1 outliers among 100 measurements (1.00%)
    1 (1.00%) high severe

All results

transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: Change within noise threshold.
  time:   [200.06 ms 200.64 ms 201.31 ms]
  thrpt:  [496.74 MiB/s 498.41 MiB/s 499.84 MiB/s]
  change:
    time:   [-1.3346% -0.9891% -0.6262%] (p = 0.00 < 0.05)
    thrpt:  [+0.6302% +0.9990% +1.3526%]
  Change within noise threshold.
  Found 3 outliers among 100 measurements (3.00%)
    2 (2.00%) high mild
    1 (1.00%) high severe

transfer/1-conn/10_000-parallel-1b-resp (aka. RPS)/mtu-1504: No change in performance detected.
  time:   [285.70 ms 287.35 ms 289.01 ms]
  thrpt:  [34.601 Kelem/s 34.801 Kelem/s 35.001 Kelem/s]
  change:
    time:   [-0.1259% +0.7315% +1.5788%] (p = 0.10 > 0.05)
    thrpt:  [-1.5543% -0.7262% +0.1260%]
  No change in performance detected.
  Found 2 outliers among 100 measurements (2.00%)
    1 (1.00%) low mild
    1 (1.00%) high mild

transfer/1-conn/1-1b-resp (aka. HPS)/mtu-1504: No change in performance detected.
  time:   [38.789 ms 38.958 ms 39.144 ms]
  thrpt:  [25.547 B/s 25.669 B/s 25.780 B/s]
  change:
    time:   [-0.4506% +0.2038% +0.8784%] (p = 0.54 > 0.05)
    thrpt:  [-0.8707% -0.2034% +0.4527%]
  No change in performance detected.
  Found 10 outliers among 100 measurements (10.00%)
    2 (2.00%) high mild
    8 (8.00%) high severe

transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: 💚 Performance has improved by -1.4275%.
  time:   [202.01 ms 202.39 ms 202.84 ms]
  thrpt:  [493.00 MiB/s 494.09 MiB/s 495.03 MiB/s]
  change:
    time:   [-1.7257% -1.4275% -1.1276%] (p = 0.00 < 0.05)
    thrpt:  [+1.1405% +1.4482% +1.7560%]
  Performance has improved.
  Found 1 outliers among 100 measurements (1.00%)
    1 (1.00%) high severe

streams/walltime/1-streams/each-1000-bytes: No change in performance detected.
  time:   [593.75 µs 596.54 µs 600.12 µs]
  change: [-0.5828% +0.0265% +0.7514%] (p = 0.94 > 0.05)
  No change in performance detected.
  Found 8 outliers among 100 measurements (8.00%)
    8 (8.00%) high severe

streams/walltime/1000-streams/each-1-bytes: No change in performance detected.
  time:   [12.290 ms 12.307 ms 12.324 ms]
  change: [-0.7874% -0.2680% +0.0874%] (p = 0.28 > 0.05)
  No change in performance detected.
  Found 1 outliers among 100 measurements (1.00%)
    1 (1.00%) high mild

streams/walltime/1000-streams/each-1000-bytes: Change within noise threshold.
  time:   [44.401 ms 44.494 ms 44.637 ms]
  change: [-0.7751% -0.5245% -0.2330%] (p = 0.00 < 0.05)
  Change within noise threshold.
  Found 2 outliers among 100 measurements (2.00%)
    2 (2.00%) high severe

transfer/walltime/pacing-false/varying-seeds: Change within noise threshold.
  time:   [81.114 ms 81.187 ms 81.259 ms]
  change: [+1.7971% +1.9600% +2.0968%] (p = 0.00 < 0.05)
  Change within noise threshold.
  Found 2 outliers among 100 measurements (2.00%)
    1 (1.00%) low severe
    1 (1.00%) high mild

transfer/walltime/pacing-true/varying-seeds: No change in performance detected.
  time:   [81.256 ms 81.339 ms 81.445 ms]
  change: [-0.1638% +0.1040% +0.3346%] (p = 0.43 > 0.05)
  No change in performance detected.
  Found 3 outliers among 100 measurements (3.00%)
    2 (2.00%) high mild
    1 (1.00%) high severe

transfer/walltime/pacing-false/same-seed: Change within noise threshold.
  time:   [81.462 ms 81.553 ms 81.681 ms]
  change: [+2.0046% +2.2419% +2.4553%] (p = 0.00 < 0.05)
  Change within noise threshold.
  Found 2 outliers among 100 measurements (2.00%)
    1 (1.00%) high mild
    1 (1.00%) high severe

transfer/walltime/pacing-true/same-seed: No change in performance detected.
  time:   [81.173 ms 81.249 ms 81.336 ms]
  change: [-0.3887% -0.1776% -0.0128%] (p = 0.07 > 0.05)
  No change in performance detected.
  Found 3 outliers among 100 measurements (3.00%)
    1 (1.00%) high mild
    2 (2.00%) high severe
Client/server transfer results

Performance differences relative to 93f8004. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds. The first table only shows statistically significant changes; see all results below. (Results tables not shown.)
