Fast UDP I/O for Firefox in Rust
May 14, 2026 · By Max Inden
Around 20% of Firefox's HTTP traffic today uses HTTP/3, which runs over QUIC, which in turn runs over UDP. That is a lot of UDP packets flying around. And until recently, Firefox was handling all of them with APIs from the Netscape era.
Firefox uses NSPR for most of its network I/O. The N in NSPR stands for Netscape. When it comes to UDP, NSPR only offers PR_SendTo and PR_RecvFrom — thin wrappers around POSIX sendto and recvfrom. One datagram per system call. Every time. Operating systems have moved on. Linux has sendmmsg and recvmmsg. Some kernels and NICs support GSO (Generic Segmentation Offload) and GRO (Generic Receive Offload). Each of these can dramatically reduce the per-datagram overhead.
So Mozilla asked: can we replace this aging stack with something modern, memory-safe, and faster? The answer was yes. The result is a 4x throughput improvement in CPU-bound benchmarks.
The Basics: One Datagram at a Time
Traditionally, Firefox sends and receives single UDP datagrams via sendto and recvfrom. The OS passes each one to the NIC, which puts it on the wire. Simple, but expensive at scale — every datagram pays the full user-to-kernel transition cost, regardless of size.
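As a rough illustration of the one-datagram-per-call pattern, here is a minimal sketch using only Rust's standard library over loopback. It is not Firefox's or NSPR's code. The function name send_one_at_a_time is made up for this example; send_to and recv_from are the standard-library calls that correspond to sendto and recvfrom.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Sends each payload over loopback and reads it back, one datagram per call.
/// Hypothetical helper for illustration only.
pub fn send_one_at_a_time(payloads: &[&[u8]]) -> io::Result<Vec<Vec<u8>>> {
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    // Avoid blocking forever if a datagram goes missing.
    receiver.set_read_timeout(Some(Duration::from_secs(2)))?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    let dest = receiver.local_addr()?;

    let mut received = Vec::with_capacity(payloads.len());
    let mut buf = [0u8; 1500];
    for payload in payloads {
        // One sendto-style system call per datagram.
        sender.send_to(payload, dest)?;
        // One recvfrom-style system call per datagram.
        let (len, _from) = receiver.recv_from(&mut buf)?;
        received.push(buf[..len].to_vec());
    }
    Ok(received)
}
```

With N payloads, this loop crosses into the kernel 2N times, which is the cost the rest of the article is about reducing.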
Batching: Many Datagrams, One Call
Modern OSes offer multi-message APIs. On Linux: sendmmsg and recvmmsg. Send a batch of datagrams in a single system call. The overhead that is independent of payload size — context switch, syscall entry/exit, socket lock — is paid once per batch, not once per packet.
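To make the amortization concrete, a small back-of-the-envelope sketch: how many system calls does it take to move a given number of datagrams at a given batch size? The function name syscalls_needed and the batch size of 32 used in the note are illustrative assumptions, not values from Firefox or quinn-udp.

```rust
/// Number of system calls needed to send `datagrams` datagrams when each call
/// can carry up to `batch_size` of them. A batch size of 1 models sendto.
/// Hypothetical helper for illustration only.
pub fn syscalls_needed(datagrams: usize, batch_size: usize) -> usize {
    assert!(batch_size > 0, "batch size must be at least 1");
    // Ceiling division: a partially filled final batch still costs one call.
    (datagrams + batch_size - 1) / batch_size
}
```

For 100 datagrams, a batch size of 1 needs 100 calls, while a batch size of 32 needs 4. The fixed per-call overhead shrinks by the same factor.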
Segmentation Offload: One Giant, Many Small
GSO and GRO go further. Instead of batching multiple datagrams, you send one large UDP datagram — larger than the MTU — to the kernel. The kernel (or ideally the NIC) segments it into properly-sized packets, adds headers, and calculates checksums. On receive, multiple incoming packets get coalesced back into one large datagram.
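The following sketch models, in plain user-space Rust, what segmentation and coalescing do to a buffer. It performs no real offload and is not how the kernel or quinn-udp implements it. The function names segment and coalesce and the 1200-byte segment size in the note are assumptions for illustration.

```rust
/// Splits one large send buffer into wire-sized segments, mimicking what a
/// GSO-capable kernel or NIC does. The last segment may be shorter.
pub fn segment(buf: &[u8], segment_size: usize) -> Vec<&[u8]> {
    assert!(segment_size > 0, "segment size must be at least 1");
    buf.chunks(segment_size).collect()
}

/// Merges received datagrams into one large buffer plus the segment size,
/// mimicking GRO. Returns None if the datagrams cannot be coalesced: all but
/// the last must have the same, non-zero length.
pub fn coalesce(datagrams: &[&[u8]]) -> Option<(Vec<u8>, usize)> {
    let segment_size = datagrams.first()?.len();
    let (last, full) = datagrams.split_last()?;
    if segment_size == 0
        || full.iter().any(|d| d.len() != segment_size)
        || last.len() > segment_size
    {
        return None;
    }
    Some((datagrams.concat(), segment_size))
}
```

A 3000-byte buffer with a 1200-byte segment size becomes segments of 1200, 1200, and 600 bytes. On the receive side, the application needs the segment size alongside the coalesced buffer to find the datagram boundaries again.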
The Rust Rewrite
The project started mid-2024. Goal: rewrite Firefox's QUIC UDP I/O in Rust, using modern syscalls across all tier-1 platforms (Windows, macOS, Linux, Android). Rust was the obvious choice — Firefox's QUIC state machine is already in Rust, so integration is seamless, and you get memory safety for free.
Instead of building from scratch, Mozilla built on top of quinn-udp, the UDP I/O library from the Quinn project. This sped things up massively. OS calls are full of idiosyncrasies, especially across versions, and Firefox supports some ancient ones — Android 5 was still on the supported list when this started.
By mid-2025, the new stack was rolling out to most Firefox users. CPU flamegraphs showed the majority of time now spent in actual I/O syscalls and cryptography — not in framework overhead.
Platform by Platform
Linux
The smoothest rollout. Linux has the most mature UDP optimization stack: sendmmsg/recvmmsg for batching, plus GSO/GRO for segmentation offloading. quinn-udp prioritizes GSO over sendmmsg for transmission — GSO is faster, and combining both has diminishing returns. Firefox uses one UDP socket per connection for privacy (harder to correlate traffic), which means it cannot leverage sendmmsg's cross-4-tuple batching anyway. GSO is the clear winner here. The switch was largely uneventful.
Windows
Single-datagram WSASendMsg and WSARecvMsg work fine. But URO (coalesced receive) broke on ARM64 with WSL enabled: WSARecvMsg would not return a segment size, so Firefox could not tell where one QUIC packet ended and the next began. The reporter? A Mozilla employee. To reproduce the bug, Max Inden bought the exact same laptop. URO remains disabled on Windows. USO (segmented send) also got rolled back after reports of increased packet loss and at least one network driver crash. More debugging is needed. Microsoft has been notified.
macOS
Switching from sendto/recvfrom to sendmsg/recvmsg went smoothly. macOS does not support UDP segmentation offloading at all. It does have two undocumented syscalls — sendmsg_x and recvmsg_x — for batching. Lars from Mozilla added support behind a feature flag. After multiple bugfix iterations, Mozilla decided not to ship it. Undocumented APIs that Apple could remove at any time are not something you want in a browser used by hundreds of millions.
Android
Android is not Linux. x86 Android routes socket calls through socketcall instead of direct syscalls, and seccomp filters will crash the process if the dispatch is wrong. A single-line change in quinn-udp fixed it. On API level 25 and below, setting ECN bits via sendmsg returns EINVAL, so quinn-udp now retries with ECN disabled. The Quinn community also caught a bug where Android rejected GSO with a single segment. The upside of using a shared library: Firefox benefits from every fix upstream.
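The retry-without-ECN behavior can be sketched roughly as follows. This is not quinn-udp's actual code. The type EcnFallback and the send_once closure, which stands in for a real sendmsg wrapper, are hypothetical. The sketch also assumes the rejection should be remembered so later sends skip ECN right away.

```rust
use std::io;

/// Hypothetical sender state: remembers whether the platform rejected ECN.
pub struct EcnFallback {
    ecn_disabled: bool,
}

impl EcnFallback {
    pub fn new() -> Self {
        Self { ecn_disabled: false }
    }

    /// Sends once via `send_once`, which receives the ECN bits to attach (or
    /// None) and returns the number of bytes sent. If the attempt with ECN
    /// fails with an EINVAL-style error, retries once without ECN.
    pub fn send<F>(&mut self, ecn: Option<u8>, mut send_once: F) -> io::Result<usize>
    where
        F: FnMut(Option<u8>) -> io::Result<usize>,
    {
        let requested = if self.ecn_disabled { None } else { ecn };
        match send_once(requested) {
            // std maps EINVAL to ErrorKind::InvalidInput.
            Err(e) if e.kind() == io::ErrorKind::InvalidInput && requested.is_some() => {
                self.ecn_disabled = true;
                send_once(None)
            }
            other => other,
        }
    }

    pub fn ecn_disabled(&self) -> bool {
        self.ecn_disabled
    }
}
```

The price of this fallback is that affected connections lose ECN signaling, but they keep working instead of failing every send.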
ECN: A Nice Bonus
Modern syscalls come with another perk: ancillary data. Firefox can now send and receive Explicit Congestion Notification (ECN) bits across all major platforms. Nightly telemetry shows ~50% of QUIC connections now run on ECN-capable paths. With L4S gaining traction, this is a solid forward-looking win.
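For readers unfamiliar with ECN: the bits in question are the two low-order bits of the IP TOS / Traffic Class byte, as defined in RFC 3168. Below is a minimal sketch of reading and writing them. The type and method names are chosen for this example and should not be taken as Firefox's or quinn-udp's API.

```rust
/// ECN codepoints from RFC 3168. The two-bit value 0b00 means "not
/// ECN-capable" and is represented here as the absence of a codepoint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EcnCodepoint {
    Ect0 = 0b10,
    Ect1 = 0b01,
    Ce = 0b11,
}

impl EcnCodepoint {
    /// Extracts the ECN codepoint from a TOS / Traffic Class byte, as it
    /// would arrive in ancillary data on receive.
    pub fn from_tos(tos: u8) -> Option<Self> {
        match tos & 0b11 {
            0b10 => Some(Self::Ect0),
            0b01 => Some(Self::Ect1),
            0b11 => Some(Self::Ce),
            _ => None, // Not-ECT
        }
    }

    /// Returns `tos` with its ECN bits replaced by this codepoint, leaving
    /// the upper six DSCP bits untouched.
    pub fn apply_to_tos(self, tos: u8) -> u8 {
        (tos & !0b11) | self as u8
    }
}
```

A sender marks outgoing packets as ECT(0) or ECT(1), and a router that would otherwise drop a packet under congestion can rewrite the mark to CE instead.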
Mozilla replaced a Netscape-era UDP stack with a modern Rust implementation in about a year. Throughput on CPU-bound QUIC benchmarks jumped from under 1 Gbit/s to 4 Gbit/s. The rewrite is now live for most Firefox users. Not every optimization landed on every platform: URO and USO remain disabled on Windows pending further debugging, and macOS batching was not shipped because it relies on undocumented syscalls. But the foundation is solid, memory-safe, and built on a community library that keeps improving.