Real-Time Backend

uWebSockets.js

15 million WebSocket messages per second on a single core versus 300 thousand with standard ws. A 50x difference - not a marketing slide but a benchmark on one CPU. uWebSockets.js achieves this not through a clever algorithm but by stepping outside JavaScript.

  • **Online games:** realtime multiplayer with 10000+ players requires broadcasting state 60 times per second. That is at least 10,000 × 60 = 600K outgoing messages/sec, about twice what ws sustains in a single process. uWS handles it through C++ pub/sub and cork.
  • **Fintech:** trading platforms push quotes to thousands of clients with sub-1ms latency. Zero-copy and minimal allocations reduce tail latency - exactly what matters for trading.
  • **Smart home:** IoT hub with thousands of devices holding long-lived connections. Postman uses uWS in its Postbot platform; Socket.IO can run on top of a uWS server for high-load deployments.

uWebSockets.js: when ws is not fast enough

uWebSockets.js creator Alex Hultman once published a benchmark: his library handles 15 million messages per second on a single core, while the standard `ws` tops out at 300 thousand. The 50x gap comes from architectural decisions: a C++ core (libuv + OpenSSL), zero-copy where possible, and elimination of unnecessary allocations. uWebSockets.js (uWS) wraps this C++ engine with a thin layer of Node.js bindings. It is not pure JavaScript - it is a native addon compiled directly against V8 rather than N-API. Consequence: prebuilt binaries are specific to the platform and the Node.js version, and they link against glibc, so they load on a glibc-based image (Ubuntu 24.04) but not on a musl-based one (Alpine) without extra steps.

uWS combines an HTTP and WebSocket server in one event loop, unlike Express + ws where HTTP and WS are separate layers with additional overhead. A single port accepts both HTTP upgrade requests and regular HTTP. This is critical for production: one process, one port, maximum resource utilization.
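A minimal sketch of the single-port setup, assuming `app` is the object returned by `uWS.App()`. The function name `configureApp`, the `/health` route, and port 9001 are illustrative choices, not part of the library.

```javascript
// Registers an HTTP route and a WebSocket route on the same uWS app.
// `app` is expected to be the object returned by uWS.App() (or uWS.SSLApp()).
function configureApp(app, port, onListen) {
  return app
    // Plain HTTP: uWS handlers receive (res, req), in that order.
    .get('/health', (res, req) => {
      res.end('ok');
    })
    // WebSocket: upgrade requests arriving on the same port are routed here.
    .ws('/*', {
      message: (ws, message, isBinary) => {
        // Echo the frame back; `message` is only valid during this call.
        ws.send(message, isBinary);
      },
    })
    // A falsy token passed to the callback means the port could not be bound.
    .listen(port, (token) => onListen(Boolean(token)));
}

// Real usage (requires the uWebSockets.js package):
//   const uWS = require('uWebSockets.js');
//   configureApp(uWS.App(), 9001, (ok) => console.log(ok ? 'listening' : 'failed'));
```

Both routes share one listener, so there is no second server object to keep in sync with the first.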

Why is uWebSockets.js significantly faster than pure-JavaScript WebSocket implementations?

Performance: zero-copy and memory model

In a uWS `message` handler, the `ArrayBuffer` passed is a window onto the C++ core's internal buffer. The buffer is only valid during the handler call. If the data is needed later, an explicit copy must be made via `message.slice(0)` or `Buffer.from(new Uint8Array(message))`. Note that `Buffer.from(message)` and `Buffer.prototype.slice()` do not copy: both return views over the same memory. This is zero-copy: JS code reads the data without allocating new memory. Developers from the ws world often end up with an empty (detached) buffer because they store a reference to `message` without copying.
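The difference between copying and merely wrapping the message can be seen without uWS at all. The sketch below uses an ordinary `ArrayBuffer` as a stand-in for the handler's `message`; the helper names `copyMessage` and `viewMessage` are hypothetical.

```javascript
// Returns an independent copy of the message bytes that stays valid
// after the uWS handler returns.
function copyMessage(message) {
  // ArrayBuffer.prototype.slice allocates new memory and copies into it.
  return message.slice(0);
}

// Buffer.from(arrayBuffer) does NOT copy: it wraps the same memory.
function viewMessage(message) {
  return Buffer.from(message);
}

// An ordinary ArrayBuffer holding the bytes of "tick" stands in for the uWS message.
const message = new Uint8Array([116, 105, 99, 107]).buffer;
const copy = copyMessage(message);
const view = viewMessage(message);

// Simulate the core reusing its buffer for the next frame.
new Uint8Array(message).fill(0x78); // 0x78 is 'x'

console.log(Buffer.from(copy).toString()); // "tick": the copy kept the original bytes
console.log(view.toString());              // "xxxx": the view sees the overwritten memory
```

In a real handler the copy would be taken before scheduling any `setTimeout` or `await`, since the original buffer is gone once the handler returns.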

uWS supports pub/sub at the C++ level via a topic mechanism. `ws.subscribe('room:42')` and `app.publish('room:42', data)` work without O(n) iteration over subscribers in JS. The internal index is on the C++ side. This matters especially for fan-out to thousands of clients: broadcast in uWS is faster than broadcast via Set.forEach.
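As a rough illustration of the topic mechanism, the sketch below builds a WebSocket behavior object that subscribes each socket on open and publishes every incoming message to the room. It assumes `app` is a `uWS.App()` instance; `roomBehavior` is a hypothetical helper name.

```javascript
// Builds a WebSocket behavior in which every socket joins one room topic
// and every incoming message is fanned out to that room by the C++ core.
// `app` is expected to be a uWS.App() instance; `topic` is an arbitrary string.
function roomBehavior(app, topic) {
  return {
    open: (ws) => {
      ws.subscribe(topic);
    },
    message: (ws, message, isBinary) => {
      // One call into the core; no JS-side loop over connected sockets.
      // Publishing inside the handler is safe: `message` is still valid here.
      app.publish(topic, message, isBinary);
    },
  };
}

// Real usage (requires the uWebSockets.js package):
//   const uWS = require('uWebSockets.js');
//   const app = uWS.App();
//   app.ws('/room/42', roomBehavior(app, 'room:42')).listen(9001, () => {});
```

Using `app.publish` delivers to every subscriber including the sender; `ws.publish` is the variant that skips the sending socket.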

Why can't a reference to `message` in a uWS handler be stored for use in setTimeout?

Backpressure: managing overload

When a server sends data faster than the client can consume it, the kernel's TCP send buffer fills up and unsent data starts accumulating in the server's own memory. uWS does not hide this problem the way most libraries do - it explicitly signals backpressure. The `ws.send()` method returns a status: 1 (SUCCESS - the message was written out), 0 (BACKPRESSURE - the message was queued, but the buffer is building up and will drain over time), 2 (DROPPED - the message was discarded because the `maxBackpressure` limit was exceeded). The `drain` handler fires when the buffer is freed. Ignoring backpressure leads to unbounded memory growth and OOM.

`ws.getBufferedAmount()` returns bytes queued for sending. If the value grows without stopping, the client is not reading fast enough. Strategies: throttle sending, close the connection when a threshold is exceeded, move the client to a slow-path queue. Many realtime games use the last approach: slow clients receive snapshots less frequently.
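One way to combine the return value of `ws.send()` with `ws.getBufferedAmount()` is sketched below. The threshold `MAX_BUFFERED_BYTES`, the helper name `sendIfKeepingUp`, and the result labels are assumptions made for illustration; a real server would tune the limit to its payload sizes.

```javascript
// Hypothetical threshold: how many queued bytes we tolerate per client.
const MAX_BUFFERED_BYTES = 1024 * 1024;

// Sends `data` unless the client is already too far behind.
// Returns one of: 'ok', 'backpressure', 'dropped', 'skipped'.
function sendIfKeepingUp(ws, data, isBinary = false) {
  // Slow client: skip this snapshot instead of queueing even more bytes.
  if (ws.getBufferedAmount() > MAX_BUFFERED_BYTES) {
    return 'skipped';
  }
  const status = ws.send(data, isBinary);
  if (status === 1) return 'ok';
  // 0: backpressure is building; ease off until the `drain` handler fires.
  if (status === 0) return 'backpressure';
  // 2: the message was not delivered.
  return 'dropped';
}

// Real usage inside a uWS behavior (requires the uWebSockets.js package):
//   app.ws('/*', {
//     message: (ws, message, isBinary) => { sendIfKeepingUp(ws, message, isBinary); },
//     drain: (ws) => { /* buffer shrank; resume normal send rate */ },
//   });
```

Skipping rather than queueing matches the slow-path idea above: a lagging client simply receives fewer snapshots.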

What does the return value 0 (BACKPRESSURE) from ws.send() mean in uWebSockets.js?

Pub/Sub and cork: batching writes into fewer syscalls

uWS implements pub/sub at the C++ level without iterating over subscribers in JS. `app.publish('topic', data)` calls one C++ method that iterates a linked list of subscribers and writes to their sockets via writev() - one syscall per client. No JavaScript in the hot path. `ws.cork(callback)` is similar in spirit to Nagle's algorithm, but applied at the application level: it buffers all send() calls made inside the callback in user space and flushes them together in a single write, so small frames can share TCP segments. Without cork, each send() is potentially a separate syscall.
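A minimal sketch of batching several frames for one socket with `ws.cork()`. The helper name `sendBatch` and its return value (the number of frames that were not dropped) are illustrative choices, not library features.

```javascript
// Sends several frames to one socket inside a single cork() call so the
// core can flush them together instead of issuing a write per send().
function sendBatch(ws, frames, isBinary = false) {
  let accepted = 0;
  // The callback runs synchronously; every send() inside it is batched.
  ws.cork(() => {
    for (const frame of frames) {
      // 2 means the frame was dropped; 0 and 1 mean it was sent or queued.
      if (ws.send(frame, isBinary) !== 2) accepted += 1;
    }
  });
  return accepted;
}

// Real usage inside a uWS handler (requires the uWebSockets.js package):
//   sendBatch(ws, ['{"type":"state"}', '{"type":"score"}']);
```

The gain comes from replacing several small writes with one; for a single `send()` there is nothing to batch, so corking adds no benefit.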

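Topics in current uWS releases are plain strings matched exactly: publish('room:42', data) delivers only to sockets subscribed to 'room:42'. The MQTT-style wildcards of early releases were later removed, so subscribe('room:#') does not match topics such as 'room:123' or 'room:456'. Broadcast hierarchies are instead built by subscribing a socket to several topics, for example both 'room:42' and a shared 'rooms' topic, which still requires no additional routing layer.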

Misconception: uWebSockets.js can be used as a drop-in replacement for the ws library without changing code.

Reality: the uWS API differs significantly: message arrives as an ArrayBuffer (not Buffer/string), there is no EventEmitter interface, and handling backpressure is the application's responsibility.

uWS deliberately chose an API different from ws to achieve maximum performance. It is not a wrapper around ws but an independent implementation with different trade-offs. Migration requires rewriting handlers.

Why use ws.cork() when sending several messages in a row?

Key ideas

  • **Zero-copy message**: `message` in the handler is an ArrayBuffer pointing to C++ memory, valid only inside the handler. For persistence, copy it with `message.slice(0)`; `Buffer.from(message)` alone only wraps the same memory.
  • **Backpressure API**: `ws.send()` returns 0/1/2. Value 0 (BACKPRESSURE) requires pausing sends until the `drain` event. Ignoring this leads to OOM.
  • **Pub/Sub and cork**: topics are handled by the C++ core without JS iteration. `ws.cork()` batches multiple send() calls into one syscall - critical at high send frequencies.

Related topics

uWebSockets.js is the high-load alternative to ws:

  • ws (Node.js) — The standard alternative with a simpler API and pure-JavaScript implementation. The choice between uWS and ws depends on throughput requirements
  • Backpressure and flow control — Backpressure in WebSocket is a specific case of the general flow control problem in realtime systems

Questions for reflection

  • uWS passes `message` as an ArrayBuffer pointing to C++ memory. What code patterns in production can lead to reading freed memory, and how can they be detected?
  • Pub/sub in uWS works without a JS layer but is limited to one process. How do you scale pub/sub across multiple Node.js processes, and what role does Redis play there?
  • cork() batches multiple send() calls into one syscall. In what scenarios can cork() hurt latency, even when it improves throughput?

Related lessons

  • rt-05 — uWebSockets.js is a bare-metal WebSocket server; protocol knowledge is mandatory
  • rt-09 — Socket.IO is the abstracted alternative; comparing them makes the uWebSockets.js tradeoffs concrete
  • rt-08 — uWebSockets.js is where binary serialization pays off - binary frames are first-class
  • rt-12 — ws vs uWebSockets.js: pure JS vs C++ binding - same goal, different performance ceiling
  • net-15-tcp-basics