diff --git a/webrtc/webrtc-direct.md b/webrtc/webrtc-direct.md index 3e7ddd11c..90631f591 100644 --- a/webrtc/webrtc-direct.md +++ b/webrtc/webrtc-direct.md @@ -2,14 +2,16 @@ | Lifecycle Stage | Maturity | Status | Latest Revision | |-----------------|---------------------------|--------|-----------------| -| 2A | Candidate Recommendation | Active | r1, 2023-04-12 | +| 2A | Candidate Recommendation | Active | r2, 2026-06-20 | Authors: [@mxinden] -Interest Group: [@marten-seemann] +Interest Group: [@marten-seemann], [@dozyio], [@lidel] [@marten-seemann]: https://github.com/marten-seemann [@mxinden]: https://github.com/mxinden/ +[@dozyio]: https://github.com/dozyio +[@lidel]: https://github.com/lidel ## Motivation @@ -34,6 +36,10 @@ The TLS certificate fingerprint in `/certhash` is a [multibase](https://github.com/multiformats/multibase) encoded [multihash](https://github.com/multiformats/multihash). +WebRTC Direct v1 and v2 share the same `/webrtc-direct` multiaddr. The server +picks the version from the ICE username fragment prefix, not from a separate +multiaddr protocol. + For compatibility implementations MUST support hash algorithm [`sha-256`](https://github.com/multiformats/multihash) and base encoding [`base64url`](https://github.com/multiformats/multibase). Implementations MAY @@ -42,7 +48,15 @@ connect to all other nodes. ## Connection Establishment -### Browser to public Server +There are two connection establishment flows, v1 and v2. New connections SHOULD +use v2: v1 relies on SDP munging that browsers are removing, so its +browser-to-server path will stop working in the near future (see [Browser to +public Server v2](#browser-to-public-server-v2-no-sdp-munging) for details). v1 +is documented here because existing peers still use it; servers SHOULD keep +accepting it while browsers still allow v1 and the ecosystem of web clients +migrates to v2. 
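The prefix-based version selection that this change introduces can be sketched as follows. This is a minimal illustration, not taken from any libp2p implementation: the function name `select_version` and the use of `ValueError` to signal a rejected connection are assumptions made for the example. Rejecting unknown or missing prefixes follows the v2 section of this change. Rejecting a `USERNAME` without a `:` separator is an additional assumption.

```python
V1_PREFIX = "libp2p+webrtc+v1/"
V2_PREFIX = "libp2p+webrtc+v2/"


def select_version(stun_username: str) -> str:
    """Return "v1" or "v2" for a STUN USERNAME, or raise ValueError to reject.

    Hypothetical helper: the name and the error type are illustrative only.
    """
    # server_ufrag is the part before the first ':' (an ICE ufrag never contains ':').
    server_ufrag, sep, _client_ufrag = stun_username.partition(":")
    if not sep:
        # Assumption: a USERNAME without ':' is treated as malformed and rejected.
        raise ValueError("STUN USERNAME has no ':' separator")
    if server_ufrag.startswith(V2_PREFIX):
        return "v2"
    if server_ufrag.startswith(V1_PREFIX):
        return "v1"
    # Unknown version or no prefix at all: reject rather than guess a version.
    raise ValueError("unknown or missing libp2p+webrtc+ version prefix")
```

A server that supports both flows would call such a helper on every first STUN request arriving on the shared UDP port and hand the connection to the matching v1 or v2 code path.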
+ +### Browser to public Server v1 (SDP munging) Scenario: Browser _A_ wants to connect to server node _B_ where _B_ is publicly reachable but _B_ does not have a TLS certificate trusted by _A_. @@ -70,11 +84,9 @@ reachable but _B_ does not have a TLS certificate trusted by _A_. 4. _A_ constructs _B_'s SDP answer locally based on _B_'s multiaddr. - _A_ generates a random string prefixed with "libp2p+webrtc+v1/". The prefix - allows us to use the ufrag as an upgrade mechanism to role out a new version - of the libp2p WebRTC protocol on a live network. While a hack, this might be - very useful in the future. _A_ sets the string as the username (_ufrag_ or _username fragment_) - and password on the SDP of the remote's answer. + _A_ generates a random string prefixed with `libp2p+webrtc+v1/` and sets it + as both the username fragment (`ice-ufrag`) and password (`ice-pwd`) in the + synthetic remote answer SDP. _A_ MUST set the `a=max-message-size:16384` SDP attribute. See reasoning [multiplexing] for rational. @@ -98,6 +110,12 @@ reachable but _B_ does not have a TLS certificate trusted by _A_. Firefox, Chrome) due to use-cases in the wild. See also https://bugs.chromium.org/p/chromium/issues/detail?id=823036 + Chromium is rolling out the [`WebRTC-NoSdpMangleUfrag`][nosdpmangle] field + trial, which blocks this ufrag munging and so breaks this step. + [Browser to public Server v2](#browser-to-public-server-v2-no-sdp-munging) + avoids munging altogether and is the most future-proof way to establish + browser-to-server connections; new deployments SHOULD prefer it. + 6. Once _A_ sets the SDP offer and answer, it will start sending STUN requests to _B_. _B_ reads the _ufrag_ from the incoming STUN request's _username_ field. _B_ then infers _A_'s SDP offer using the IP, port, and _ufrag_ of the @@ -109,8 +127,8 @@ reachable but _B_ does not have a TLS certificate trusted by _A_. 2. 
_B_ sets an arbitrary sha-256 digest as the remote fingerprint as it does not verify fingerprints at this point. - 3. _B_ sets the connection field (`c`) to the IP and port of the incoming - request `c=IN `. + 3. _B_ sets the connection field (`c`) to the IP of the incoming + request `c=IN `. 4. _B_ sets the `a=max-message-size:16384` SDP attribute. See reasoning [multiplexing] for rational. @@ -149,10 +167,107 @@ reachable but _B_ does not have a TLS certificate trusted by _A_. 9. The remote is authenticated via an additional Noise handshake. See [Connection Security section](#connection-security). +### Browser to public Server v2 (no SDP munging) + +v2 keeps the same no-signaling model and the same `/webrtc-direct` multiaddr as +v1, but it never munges the local SDP offer. It exists because browsers are +taking munging away: Chromium's [`WebRTC-NoSdpMangleUfrag`][nosdpmangle] field +trial blocks any change to `a=ice-ufrag` (and `a=ice-pwd`) between `createOffer` +and `setLocalDescription`, so _B_ can no longer predict _A_'s credentials the way +v1 step (5) relies on. See the WebRTC issue [411871813][webrtc-411871813], the +[discuss-webrtc notice][discuss-webrtc-psa], and the [`getStats` side +channel][webrtc-stats-789] that prompted it. Instead of munging, _A_ leaves its +local credentials alone and puts its own ICE password into the username fragment +of the synthetic remote answer, where _B_ reads it back from the STUN `USERNAME` +with no signaling channel. + +_B_ picks the version from `server_ufrag`, the part of the STUN `USERNAME` before +the first `:` (an ICE ufrag never contains `:`). The prefix `libp2p+webrtc+v2/` +selects this flow; `libp2p+webrtc+v1/` selects +[v1](#browser-to-public-server-v1-sdp-munging). The `libp2p+webrtc+` namespace is +reserved for versions: if _B_ does not recognize the version, or sees no such +prefix, it MUST reject the connection rather than guess a version, so _B_ never +mistakes a future version for an older one. 
A _B_ that supports both flows MUST +serve them on the same UDP port and route each incoming connection by this +prefix. + +1. _A_ and _B_ perform steps (1), (2), and (3) from + [Browser to public Server v1 (SDP munging)](#browser-to-public-server-v1-sdp-munging). + As in v1, _B_ is ICE _controlled_: it acts as an [ICE + Lite](https://www.rfc-editor.org/rfc/rfc8445) agent that answers connectivity + checks but never starts its own. + +2. _A_ creates a local offer via + [`RTCPeerConnection.createOffer()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/createOffer) + and sets it via + [`RTCPeerConnection.setLocalDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setLocalDescription). + _A_ MUST NOT modify ("munge") the local offer's `a=ice-ufrag` or `a=ice-pwd`; + they stay as the browser-generated random values. We call them `client_ufrag` + and `client_pwd` below. + +3. _A_ reads its local ICE password `client_pwd` from the `a=ice-pwd` line of + [`RTCPeerConnection.localDescription`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/localDescription) + after step (2), so it sees the value the browser actually chose. An ICE + password is 22 to 256 `ice-char`s (`ALPHA / DIGIT / "+" / "/"`, which never + includes `:`); see [RFC 8839], section 5.4 for the grammar and [RFC 8445], + section 5.3 for the randomness requirement. Browser-generated passwords meet + this. If _A_ cannot read a valid password back, it MUST abort the dial rather + than continue with an incomplete credential. 
+ + _A_ builds _B_'s synthetic remote answer and sets both the ICE username + fragment and password to `libp2p+webrtc+v2/` followed by `client_pwd`: + - `a=ice-ufrag:libp2p+webrtc+v2/` + - `a=ice-pwd:libp2p+webrtc+v2/` + + The resulting `server_ufrag` MUST be a valid ICE username fragment (4 to 256 + `ice-char`s, [RFC 8839], section 5.4); the 17-character prefix counts toward + this limit, so it caps how long `client_pwd` can be. The answer is an ICE Lite + answer (`a=ice-lite`) with `a=setup:passive`, which makes _A_ the DTLS client + and _B_ the DTLS server. _A_ MUST set `a=max-message-size:16384` and take the + remote fingerprint from _B_'s `/certhash`. _A_ sets the remote answer via + [`RTCPeerConnection.setRemoteDescription()`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/setRemoteDescription). + +4. _A_ starts sending STUN connectivity checks. The browser signs each check's + `MESSAGE-INTEGRITY` with the ICE password it took from the synthetic answer + (`libp2p+webrtc+v2/`) and sets its `USERNAME` attribute to + `server_ufrag:client_ufrag` ([RFC 8445], section 7.2.2). _B_ splits the + `USERNAME` on the first `:` into `server_ufrag` and `client_ufrag`, and MUST + check that both are well-formed ICE username fragments (the `ice-char` set and + length bounds of [RFC 8839], section 5.4), rejecting the connection otherwise. + This keeps attacker-controlled bytes such as `CR`/`LF` out of the SDP _B_ + builds in step (5). The `libp2p+webrtc+v2/` prefix on `server_ufrag` selects + this flow (see version selection above). + +5. _B_ recovers `client_pwd` by stripping the `libp2p+webrtc+v2/` prefix from + `server_ufrag`. If the result is not a valid ICE password (22 to 256 + `ice-char`s, [RFC 8839], section 5.4), _B_ MUST reject the connection. _B_ + then infers _A_'s SDP offer as in v1 step (6), except the username fragment + and password now differ: + 1. `a=ice-ufrag` = `client_ufrag` + 2. 
`a=ice-pwd` = `client_pwd` (the value recovered above) + 3. an arbitrary sha-256 remote fingerprint (verification is disabled, see v1 + step (7)) + 4. `c=IN ` from the incoming STUN packet source + 5. `a=max-message-size:16384` + 6. `a=setup:actpass` (or `active`); _B_ answers as the DTLS server + +6. Before generating its answer, _B_ MUST set its own local ICE username fragment + and password to `server_ufrag` (the exact value _A_ embedded, + `libp2p+webrtc+v2/`). ICE needs this: _A_ signs its checks with + this value and expects _B_'s STUN responses to use it too. _B_ sets the + inferred offer from step (5) as the remote description, then generates its + answer and sets it as the local description. That answer is an ICE Lite answer + with `a=setup:passive` (so _B_ is the DTLS server) and carries the credentials + set above. + +7. _A_ and _B_ continue with steps (7), (8), and (9) from + [Browser to public Server v1 (SDP munging)](#browser-to-public-server-v1-sdp-munging). + +## Transport Support + WebRTC can run both on UDP and TCP. libp2p WebRTC implementations MUST support UDP and MAY support TCP. - ## Connection Security Note that the below uses the message framing described in @@ -252,8 +367,12 @@ prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83 it at this point in time. Later versions of the libp2p WebRTC protocol might adopt this optimization. - Note, one can role out a new version of the libp2p WebRTC protocol through a - new multiaddr protocol, e.g. `/webrtc-direct-2`. + Most new versions roll out within the same `/webrtc-direct` multiaddr by adding + a `libp2p+webrtc+/` ICE username fragment prefix, the way v2 was added + alongside v1 (see [Connection Establishment](#connection-establishment)). A new + multiaddr protocol (e.g. `/webrtc-direct-3`) is only needed for a wire change + the prefix cannot negotiate, for example one the dialer needs to know from the + multiaddr before it opens the connection. 
- _Why exchange fingerprints in an additional authentication handshake on top of an established WebRTC connection? Why not only exchange signatures of ones TLS @@ -310,3 +429,9 @@ prologue = "6c69627032702d7765627274632d6e6f6973653a12203e79af40d6059617a0d83b83 Given that WebRTC uses DTLS 1.2, _B_ is the one that can send data first. [multiplexing]: ./README.md#multiplexing +[RFC 8445]: https://www.rfc-editor.org/rfc/rfc8445 +[RFC 8839]: https://www.rfc-editor.org/rfc/rfc8839 +[nosdpmangle]: https://webrtc-review.googlesource.com/c/src/+/385721 +[webrtc-411871813]: https://issues.webrtc.org/issues/411871813 +[discuss-webrtc-psa]: https://groups.google.com/g/discuss-webrtc/c/PIJZN5MTZF4 +[webrtc-stats-789]: https://github.com/w3c/webrtc-stats/issues/789
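
The v2 credential round trip described in this change (_A_ embeds `client_pwd` in the remote username fragment, _B_ recovers it from the STUN `USERNAME`) can be sketched as below. This is a hedged illustration under stated assumptions, not a reference implementation: `build_server_ufrag` and `recover_client_credentials` are hypothetical names, `ValueError` stands in for "abort the dial" or "reject the connection", and the regular expressions encode the `ice-char` set and length bounds the text cites from [RFC 8839], section 5.4.

```python
import re
from typing import Tuple

V2_PREFIX = "libp2p+webrtc+v2/"  # 17 characters, all valid ice-chars
ICE_UFRAG = re.compile(r"[A-Za-z0-9+/]{4,256}")   # ice-ufrag: 4 to 256 ice-chars
ICE_PWD = re.compile(r"[A-Za-z0-9+/]{22,256}")    # ice-pwd: 22 to 256 ice-chars


def build_server_ufrag(client_pwd: str) -> str:
    """Dialer side (A): value used for both ice-ufrag and ice-pwd of the synthetic answer."""
    if not ICE_PWD.fullmatch(client_pwd):
        raise ValueError("local ICE password is not a valid ice-pwd; abort the dial")
    server_ufrag = V2_PREFIX + client_pwd
    # The prefix counts toward the 256-character ufrag limit, capping client_pwd at 239.
    if not ICE_UFRAG.fullmatch(server_ufrag):
        raise ValueError("prefixed value exceeds the ice-ufrag length limit")
    return server_ufrag


def recover_client_credentials(stun_username: str) -> Tuple[str, str]:
    """Listener side (B): return (client_ufrag, client_pwd) or raise to reject."""
    server_ufrag, sep, client_ufrag = stun_username.partition(":")
    # Validate both halves before any of these bytes reach the SDP that B builds;
    # fullmatch rejects CR/LF and any other character outside the ice-char set.
    if not sep or not ICE_UFRAG.fullmatch(server_ufrag) or not ICE_UFRAG.fullmatch(client_ufrag):
        raise ValueError("malformed STUN USERNAME")
    if not server_ufrag.startswith(V2_PREFIX):
        raise ValueError("not a v2 username fragment")
    client_pwd = server_ufrag[len(V2_PREFIX):]
    if not ICE_PWD.fullmatch(client_pwd):
        raise ValueError("embedded value is not a valid ice-pwd")
    return client_ufrag, client_pwd
```

In this sketch _B_ would then use the returned pair as `a=ice-ufrag` and `a=ice-pwd` of the inferred offer, and the unmodified `server_ufrag` as its own local ICE username fragment and password.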