feat: Increase default UDP send buffer size to 1MB #3495
larseggert wants to merge 1 commit into mozilla:main
Conversation
Pull request overview
This PR updates the neqo-bin UDP socket setup to proactively increase the default per-socket UDP send buffer size to 1MB (matching Firefox behavior), improving performance headroom for higher-throughput scenarios.
Changes:
- Set `SO_SNDBUF` to 1MB when the existing send buffer is below 1MB.
- Add debug logging to report when the send buffer is changed vs. left unchanged.
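The change described above can be sketched as follows. This is a minimal sketch of the resize logic with the socket accessors abstracted as closures; in neqo-bin these would be quinn-udp's send-buffer getter/setter calls, so the closure-based shape here is an illustrative assumption, not the PR's exact code:

```rust
use std::io;

/// Target send buffer size, matching the PR's 1MB default.
const ONE_MB: usize = 1 << 20;

/// Grow the send buffer to at least `ONE_MB` if it is currently smaller.
/// `get`/`set` are hypothetical stand-ins for the real socket accessors.
fn ensure_min_send_buffer(
    get: impl Fn() -> io::Result<usize>,
    set: impl Fn(usize) -> io::Result<()>,
) -> io::Result<usize> {
    let send_buf_before = get()?;
    if send_buf_before < ONE_MB {
        // Grow the buffer; the OS may clamp the value it actually applies.
        set(ONE_MB)?;
    }
    // Re-query so callers (and debug logs) see the effective size.
    get()
}
```

Re-querying after the set matters because the kernel may apply a different value than requested (Linux, for instance, doubles `SO_SNDBUF` and caps it at `net.core.wmem_max`).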
```rust
qdebug!(
    "Increasing socket send buffer size from {send_buf_before} to {ONE_MB}, now: {:?}",
    state.send_buffer_size((&socket).into())
);
```
state.send_buffer_size((&socket).into()) returns a Result, but here it's logged with {:?} without ?, so the message will print Ok(..)/Err(..) and any error is silently ignored. Consider capturing send_buf_after using ? (or handling/logging the error explicitly) and logging the actual size to keep the log message accurate and error handling consistent with the earlier send_buf_before query.
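To illustrate the difference the comment describes, here is a self-contained sketch using a stand-in function in place of `state.send_buffer_size(..)` (the stand-in's name and fixed return value are assumptions for illustration):

```rust
use std::io;

// Hypothetical stand-in for state.send_buffer_size(..), which returns
// an io::Result rather than a bare size.
fn send_buffer_size() -> io::Result<usize> {
    Ok(1_048_576)
}

// Debug-printing the Result embeds the Ok(..)/Err(..) wrapper in the
// log text and silently swallows errors.
fn log_without_question_mark() -> String {
    format!("now: {:?}", send_buffer_size())
}

// Propagating the error with `?` logs the actual size and lets the
// caller handle failures, matching how send_buf_before is queried.
fn log_with_question_mark() -> io::Result<String> {
    let send_buf_after = send_buffer_size()?;
    Ok(format!("now: {send_buf_after}"))
}
```

With the first form the log reads `now: Ok(1048576)`; with the second it reads `now: 1048576` and an `Err` propagates instead of being printed.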
Codecov Report ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #3495      +/-   ##
==========================================
- Coverage   94.30%   94.18%   -0.12%
==========================================
  Files         127      131       +4
  Lines       38739    39069     +330
  Branches    38739    39069     +330
==========================================
+ Hits        36532    36799     +267
- Misses       1369     1421      +52
- Partials      838      849      +11
```

Flags with carried forward coverage won't be shown.
Benchmark results

Significant performance differences relative to 5b4e850.

transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: 💚 Performance has improved by -1.4682%.

```
time:   [197.87 ms 198.32 ms 198.78 ms]
thrpt:  [503.07 MiB/s 504.23 MiB/s 505.38 MiB/s]
change:
        time:   [-1.8496% -1.4682% -1.1042%] (p = 0.00 < 0.05)
        thrpt:  [+1.1165% +1.4901% +1.8845%]
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
```

transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: 💚 Performance has improved by -1.4988%.

```
time:   [201.27 ms 201.61 ms 201.95 ms]
thrpt:  [495.16 MiB/s 496.01 MiB/s 496.85 MiB/s]
change:
        time:   [-1.8140% -1.4988% -1.2163%] (p = 0.00 < 0.05)
        thrpt:  [+1.2313% +1.5216% +1.8475%]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
```

All results

transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: 💚 Performance has improved by -1.4682%.

```
time:   [197.87 ms 198.32 ms 198.78 ms]
thrpt:  [503.07 MiB/s 504.23 MiB/s 505.38 MiB/s]
change:
        time:   [-1.8496% -1.4682% -1.1042%] (p = 0.00 < 0.05)
        thrpt:  [+1.1165% +1.4901% +1.8845%]
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
```

transfer/1-conn/10_000-parallel-1b-resp (aka. RPS)/mtu-1504: No change in performance detected.

```
time:   [284.31 ms 286.48 ms 288.64 ms]
thrpt:  [34.645 Kelem/s 34.907 Kelem/s 35.173 Kelem/s]
change:
        time:   [-0.8882% +0.1486% +1.1667%] (p = 0.78 > 0.05)
        thrpt:  [-1.1532% -0.1484% +0.8962%]
No change in performance detected.
```

transfer/1-conn/1-1b-resp (aka. HPS)/mtu-1504: No change in performance detected.

```
time:   [38.557 ms 38.689 ms 38.840 ms]
thrpt:  [25.747 B/s 25.847 B/s 25.936 B/s]
change:
        time:   [-0.9644% -0.3229% +0.2969%] (p = 0.31 > 0.05)
        thrpt:  [-0.2960% +0.3240% +0.9738%]
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe
```

transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: 💚 Performance has improved by -1.4988%.

```
time:   [201.27 ms 201.61 ms 201.95 ms]
thrpt:  [495.16 MiB/s 496.01 MiB/s 496.85 MiB/s]
change:
        time:   [-1.8140% -1.4988% -1.2163%] (p = 0.00 < 0.05)
        thrpt:  [+1.2313% +1.5216% +1.8475%]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
```

streams/walltime/1-streams/each-1000-bytes: No change in performance detected.

```
time:   [585.24 µs 586.98 µs 589.06 µs]
change: [-0.4552% +0.0162% +0.5025%] (p = 0.94 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe
```

streams/walltime/1000-streams/each-1-bytes: No change in performance detected.

```
time:   [12.330 ms 12.348 ms 12.365 ms]
change: [-0.5767% -0.0159% +0.3522%] (p = 0.96 > 0.05)
No change in performance detected.
```

streams/walltime/1000-streams/each-1000-bytes: Change within noise threshold.

```
time:   [45.334 ms 45.379 ms 45.423 ms]
change: [+0.3568% +0.4952% +0.6280%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
```

transfer/walltime/pacing-false/varying-seeds: No change in performance detected.

```
time:   [77.788 ms 77.896 ms 78.049 ms]
change: [-0.3611% -0.1790% +0.0298%] (p = 0.07 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
```

transfer/walltime/pacing-true/varying-seeds: Change within noise threshold.

```
time:   [79.705 ms 79.873 ms 80.089 ms]
change: [+0.1325% +0.3644% +0.6441%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe
```

transfer/walltime/pacing-false/same-seed: Change within noise threshold.

```
time:   [78.072 ms 78.192 ms 78.348 ms]
change: [+0.0781% +0.3370% +0.5724%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
```

transfer/walltime/pacing-true/same-seed: Change within noise threshold.

```
time:   [79.591 ms 79.695 ms 79.852 ms]
change: [-1.2790% -1.0413% -0.7934%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
```
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to

All results

Succeeded Interop Tests
QUIC Interop Runner, client vs. server:
- neqo-pr as client
- neqo-pr as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server:
- neqo-pr as client
- neqo-pr as server
Client/server transfer results
Performance differences relative to 5b4e850. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
The table above only shows statistically significant changes; see all results below.
Is there evidence that a larger send buffer improves performance? Intuitively, I cannot tell whether it is a performance improvement or whether it leads to source buffer bloat. See the past discussion in #2470 (comment).
I also commented on Phabricator: https://phabricator.services.mozilla.com/D288864#10030860
Are you concerned 1MB is too much? The defaults are IIRC ~260KB on Linux and less on macOS, so I worry we may be buffer-limited when uploading over reasonably fast paths.
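For rough intuition on the buffer-limited concern, one can compute how long a full send buffer lasts at a given line rate (the rates below are illustrative figures, not measurements from this PR):

```rust
/// How long (in milliseconds) a full send buffer sustains transmission
/// at a given line rate before the application must refill it.
fn drain_time_ms(buf_bytes: f64, rate_bits_per_s: f64) -> f64 {
    buf_bytes * 8.0 / rate_bits_per_s * 1000.0
}
```

At 1 Gbit/s, a ~260KB buffer drains in about 2 ms while a 1MB buffer lasts roughly 8 ms, so whether the smaller default throttles upload depends on how often the sender can wake up and refill it.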
Yes, I am worried that 1 MB is too much. In which scenario do you see Firefox exceeding the 260KB OS buffer? An example: given that we pace all sends across 0.5 of the RTT, I don't see us benefiting from a larger OS buffer. And with the theoretical benefits of a larger buffer come drawbacks, i.e., buffer bloat, which I assume far outweigh the benefits.
Careful, you almost make a good argument for a larger buffer. Lots of people are on gigabit links. I don't know how often we'd be able to service timers over 50ms, but I suspect it is more like 3 times than 10 in some cases (if timer granularity isn't increased beyond the typical). We definitely need to fix that pacing speedup. It's doing us more harm than good right now.
See https://bugzilla.mozilla.org/show_bug.cgi?id=2024900