Backend Transport
WebSocket: Real-Time Bidirectional Communication
A message sent in Slack appears on the recipient's screen within about 50 milliseconds. There is no 2-second polling interval and no HTTP long-poll waiting on a timeout. That is WebSocket: a single TCP connection held open for hours, over which data flows in both directions without being re-established.
- **Discord** maintains more than 8 million concurrent WebSocket connections. The entire real-time layer - presence, typing indicators, voice events - flows through WS.
- **Figma** uses WebSocket to synchronize cursors and document changes between collaborators. Latency under 100 ms during concurrent editing.
- **Binance** and other exchanges stream the order book via WebSocket with updates every 100 ms - REST polling is impractical at that rate.
WebSocket Handshake
WebSocket starts as a regular HTTP request with an `Upgrade: websocket` header. The server responds with `101 Switching Protocols` - from that moment the TCP connection becomes a bidirectional channel. HTTP is no longer used: data flows in both directions without additional handshakes.
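`Sec-WebSocket-Accept` is the Base64-encoded SHA-1 hash of the client's `Sec-WebSocket-Key` concatenated with a fixed GUID (`258EAFA5-E914-47DA-95CA-C5AB0DC85B11`). A matching value proves that the server actually understood the WebSocket handshake, so the client cannot mistake a cached or generic HTTP response for a successful upgrade. It is a protocol sanity check, not an authentication mechanism.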
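The accept-key computation can be sketched in a few lines of Python. This is a minimal illustration rather than a full handshake implementation; the function name `compute_accept` is hypothetical, while the GUID and the sample key come from RFC 6455.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; it is the same for every WebSocket server.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    # Concatenate key and GUID, hash with SHA-1, then Base64-encode the raw digest.
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client performs the same computation on the key it sent and rejects the connection if the server's header does not match.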
Which HTTP status does the server return on a successful WebSocket handshake?
WebSocket Frames and Protocol
After the handshake, data is transmitted in frames. Each frame has a header of 2 to 14 bytes and a payload. Key fields: `FIN` (last frame of a message), `opcode` (type: text=0x1, binary=0x2, ping=0x9, pong=0xA, close=0x8), `MASK` (clients always mask data with an XOR key).
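To make the 2-to-14-byte header concrete, here is a minimal header parser, assuming the frame layout from RFC 6455. The names `FrameHeader` and `parse_header` are illustrative, and the sketch ignores the reserved bits and extensions.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    fin: bool         # True if this is the last frame of the message
    opcode: int       # 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
    masked: bool      # True for client-to-server frames
    payload_len: int  # payload size in bytes
    header_len: int   # total header size in bytes (2 to 14)


def parse_header(data: bytes) -> FrameHeader:
    """Parse the header of a single WebSocket frame at the start of `data`."""
    if len(data) < 2:
        raise ValueError("a frame header is at least 2 bytes")
    b0, b1 = data[0], data[1]
    fin = bool(b0 & 0x80)
    opcode = b0 & 0x0F
    masked = bool(b1 & 0x80)
    length = b1 & 0x7F
    offset = 2
    if length == 126:    # next 2 bytes hold the real length
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:  # next 8 bytes hold the real length
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if masked:           # a 4-byte masking key follows the length
        offset += 4
    if len(data) < offset:
        raise ValueError("truncated frame header")
    return FrameHeader(fin, opcode, masked, length, offset)


# Unmasked text frame carrying "Hello": the smallest header, 2 bytes.
print(parse_header(bytes.fromhex("810548656c6c6f")))
```

The largest header (14 bytes) occurs for a masked frame with a 64-bit length: 2 base bytes, 8 length bytes, and 4 bytes of masking key.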
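Masking client-to-server frames (MASK=1) is mandatory per RFC 6455. It protects intermediaries rather than the payload: without masking, a malicious page could craft frame bytes that look like an HTTP request, and a proxy that does not understand WebSocket could be tricked into caching attacker-controlled content (cache poisoning). A fresh random 4-byte key per frame makes the bytes on the wire unpredictable. Server-to-client frames are not masked.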
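The masking step itself is a byte-wise XOR with a repeating 4-byte key. The sketch below builds a small masked text frame; `mask_payload` and `build_client_text_frame` are hypothetical helper names, and the builder handles only payloads under 126 bytes to stay short.

```python
import os
from typing import Optional


def mask_payload(payload: bytes, key: bytes) -> bytes:
    """XOR every payload byte with key[i % 4]; applying it twice restores the input."""
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


def build_client_text_frame(text: str, key: Optional[bytes] = None) -> bytes:
    """Build a single masked text frame (FIN=1, opcode=0x1) for a short payload."""
    payload = text.encode("utf-8")
    if len(payload) >= 126:
        raise ValueError("this sketch only handles payloads under 126 bytes")
    if key is None:
        key = os.urandom(4)  # a real client picks a fresh random key per frame
    header = bytes([0x80 | 0x1, 0x80 | len(payload)])  # FIN+text, MASK+length
    return header + key + mask_payload(payload, key)


# Example from RFC 6455, section 5.7: "Hello" masked with key 37 fa 21 3d.
frame = build_client_text_frame("Hello", bytes.fromhex("37fa213d"))
print(frame.hex())  # 818537fa213d7f9f4d5158
```

Because XOR is its own inverse, the server recovers the payload by calling the same masking function with the key taken from the frame header.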
Why does the client mask WebSocket frames with an XOR key?
Scaling WebSocket
WebSocket is a stateful connection. Each client is bound to a specific server. This complicates horizontal scaling: if 1M users are spread across 10 servers, a message that originates on server #7 cannot reach a user connected to server #3 unless the servers share a message bus.
A single Node.js server comfortably holds ~50K-100K idle WebSocket connections; with active traffic the number is lower. uWebSockets.js is often cited as up to ~10x faster than `ws`, which is a third-party npm package rather than part of Node.js core.
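One common way around the server-binding problem is a shared Pub/Sub channel that every server subscribes to. The sketch below simulates it in memory: `Broker` stands in for a system such as Redis or NATS, `WsServer` for one WebSocket server process, and a plain list stands in for each open socket. All names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, str], None]


class Broker:
    """In-memory stand-in for a Pub/Sub system such as Redis or NATS."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, user_id: str, text: str) -> None:
        # Every subscribed server sees every message on the channel.
        for handler in self._subscribers[channel]:
            handler(user_id, text)


class WsServer:
    """One WebSocket server; it only knows about its own local connections."""

    def __init__(self, name: str, broker: Broker) -> None:
        self.name = name
        self.broker = broker
        # user_id -> delivered messages (a list stands in for the open socket)
        self.local_connections: Dict[str, List[str]] = {}
        broker.subscribe("messages", self._on_broker_message)

    def connect(self, user_id: str) -> None:
        self.local_connections[user_id] = []

    def send_to_user(self, user_id: str, text: str) -> None:
        # Publish instead of looking the user up locally: they may be elsewhere.
        self.broker.publish("messages", user_id, text)

    def _on_broker_message(self, user_id: str, text: str) -> None:
        inbox = self.local_connections.get(user_id)
        if inbox is not None:  # deliver only if the user is connected here
            inbox.append(text)


broker = Broker()
server3, server7 = WsServer("server-3", broker), WsServer("server-7", broker)
server3.connect("alice")
server7.send_to_user("alice", "hi from server 7")
print(server3.local_connections["alice"])  # ['hi from server 7']
```

The trade-off is that every server receives every message and discards most of them; larger deployments usually shard channels per room or per user to reduce that fan-out.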
Why are sticky sessions (ip_hash in nginx) a non-ideal solution for WebSocket scaling?
Socket.IO vs Native WebSocket
Socket.IO is a library on top of WebSocket with automatic fallback to long-polling, rooms, namespaces, and auto-reconnect. The native WebSocket API is the browser standard without additional abstractions.
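Socket.IO is NOT compatible with native WebSocket: a Socket.IO client cannot connect to a native WS server, and a native WebSocket client cannot talk to a Socket.IO server, because Socket.IO adds its own handshake and packet metadata on top of the transport. With modern browsers, polling fallback is rarely needed: native WebSocket has been supported in all major browsers since around 2012.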
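The incompatibility is visible on the wire. The sketch below encodes a Socket.IO event the way it travels inside a WebSocket text frame, assuming Engine.IO protocol v4 and Socket.IO protocol v5, the default namespace, no acknowledgement id, and no binary attachments. The helper name `encode_socketio_event` is hypothetical.

```python
import json
from typing import Any

ENGINE_IO_MESSAGE = "4"  # Engine.IO packet type: message
SOCKET_IO_EVENT = "2"    # Socket.IO packet type: EVENT


def encode_socketio_event(event: str, *args: Any) -> str:
    """Encode an event for the default namespace, without an ack id."""
    payload = json.dumps([event, *args], separators=(",", ":"))
    return ENGINE_IO_MESSAGE + SOCKET_IO_EVENT + payload


print(encode_socketio_event("chat", "hi"))  # 42["chat","hi"]
```

A native WebSocket server that receives this text sees an opaque string starting with `42`, and a native client never sends the Engine.IO and Socket.IO connect packets that a Socket.IO server expects before any event is exchanged.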
What happens when a native WebSocket client connects to a Socket.IO server?
WebSocket vs Server-Sent Events
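Server-Sent Events (SSE) is a one-way event stream from server to client over regular HTTP. It is simpler than WebSocket: there is no protocol upgrade and no binary framing, just a long-lived `text/event-stream` response that passes through most HTTP/1.1 proxies, provided they do not buffer the response. The browser's `EventSource` automatically reconnects on disconnect.
Over HTTP/1.1, browsers allow only about 6 connections per domain, and each SSE stream occupies one of them. HTTP/2 removes this limit in practice by multiplexing the streams over a single connection. For chat apps, use WebSocket; for server-push notifications without client replies, SSE is simpler.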
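The SSE wire format is plain text: one `field: value` pair per line, with a blank line ending each event. Here is a minimal formatter, assuming the `text/event-stream` format from the HTML Standard; the function name `format_sse` is hypothetical.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event for a text/event-stream response body."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")   # the browser sends it back as Last-Event-ID
    if event is not None:
        lines.append(f"event: {event}")   # custom event name; the default is "message"
    for chunk in data.split("\n"):        # multi-line data needs one data: line each
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"      # a blank line terminates the event


print(format_sse("price=101.5", event="tick", event_id="42"), end="")
```

On reconnect the browser sends the last `id` it saw in the `Last-Event-ID` request header, which lets the server resume the stream from that point.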
Myth: WebSocket is always better than SSE because it is bidirectional.
Reality: SSE is simpler for unidirectional streams: less code, auto-reconnect, and it works through most HTTP proxies without special configuration.
Protocol choice is determined by the communication pattern. If the client only listens, WebSocket adds complexity without benefit.
In which case is SSE preferable to WebSocket?
Key Ideas
- **Handshake** - WebSocket starts as an HTTP Upgrade (101), then TCP becomes a bidirectional channel without HTTP overhead.
- **Scaling** - stateful connections require Pub/Sub via Redis/NATS to synchronize across multiple servers.
- **SSE vs WS** - for server-only push, SSE is simpler; WebSocket is needed only when the client also sends data in real time.
Related Topics
WebSocket sits alongside other real-time protocols and scaling patterns:
- HTTP/2 and Multiplexing - HTTP/2 solves some HTTP/1.1 issues (such as running many requests in parallel over one connection), but does not replace WebSocket for bidirectional real-time scenarios
- Kafka and Async Brokers - WebSocket delivers events to clients in real time; Kafka is often the source of those events on the backend
Questions for Reflection
- How would WebSocket architecture change if HTTP were stateful by default?
- Why did Discord choose Elixir rather than Node.js for its WebSocket server at 1M+ connections?
- In which scenarios is it worth starting with SSE and migrating to WebSocket only when necessary?