Real-Time Backend

What Is Real-Time

In 2013, Facebook faced a problem: the mobile app queried the server every 5 seconds asking 'any new messages?'. With 500 million users, that was 6 billion pointless requests per minute. Batteries drained, servers burned. The solution - MQTT push protocol - cut traffic by 40× and made Messenger truly instant.

  • Telegram, WhatsApp - instant message delivery via WebSocket and push
  • Google Docs, Figma - simultaneous document editing by dozens of people
  • Uber, Yandex Taxi - live driver position updated every 2 seconds
  • Fortnite, CS2 - world state recomputed on a fixed server tick, tens of times per second (up to 128 in some competitive shooters)
  • Bloomberg Terminal - stock quotes updated in real-time worldwide

From Comet to WebSocket

Before 2011, real-time on the web was a hack. The Comet technique (2006) used a hidden iframe with a server response that never ended - the browser executed incoming <script> tags as they arrived. Another hack - Flash Socket - required a plugin. In 2011, RFC 6455 standardized WebSocket, giving the web a proper bidirectional channel. This changed backend architecture forever.

WebSocket transformed the browser from a 'requesting client' into a full participant in real-time communication

What 'real-time' means for the backend

A message is being typed in Telegram. The contact sees 'typing...' - instantly. The message is sent and read half a second later. That is **real-time**: the server delivers data to the client the moment it appears, rather than waiting for the client to ask.

In classical HTTP the client is always the initiator: send a request - get a response. The server **cannot** reach out to the client on its own. It is like a mailbox: you find out a letter has arrived only when you go and check. Real-time flips this model - the server **pushes** data to the client.

  • Request-Response (classical HTTP) — Client asks → server answers. Client doesn't ask - no data. Latency = interval between requests.
  • Real-Time (push model) — Server sends data the moment it becomes available. Client is subscribed and waits. Latency = network + processing.
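The push model can be sketched in a few lines. This is a minimal in-process illustration, not a network server: the `Broker` class and its method names are hypothetical, and each "client" is represented by an `asyncio.Queue` that it waits on.

```python
import asyncio


class Broker:
    """Hypothetical in-process push hub: every subscriber owns a queue."""

    def __init__(self) -> None:
        self._subscribers = set()

    def subscribe(self) -> asyncio.Queue:
        # The client subscribes once and then simply waits on its queue.
        queue = asyncio.Queue()
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, message: str) -> int:
        # The server pushes the moment data appears; nobody had to ask.
        for queue in self._subscribers:
            queue.put_nowait(message)
        return len(self._subscribers)


async def demo() -> list:
    broker = Broker()
    alice = broker.subscribe()
    bob = broker.subscribe()
    broker.publish("new message")
    return [await alice.get(), await bob.get()]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real backend the queue would be replaced by a WebSocket or SSE connection, but the shape stays the same: subscribe once, then receive.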

Users **don't think** about protocols. They think in terms of expectations: I type text - my contact sees it immediately. I liked a post - the counter updates. I moved on the map - the courier sees my position. When latency exceeds expectations, the product feels 'broken'.

| User action | Expectation | If slower |
|---|---|---|
| Typing a message | < 100 ms indicator | Indicator flickers, annoying |
| Message sent | < 300 ms delivery | Feels like it froze |
| Liked a post | < 500 ms update | Taps again |
| Moving on map | < 1 s position | Courier 'jumps' |
| Received notification | < 3 s after event | Missed something important |

**Real-time does not mean instant.** It means 'fast enough for the specific use case'. Chat tolerates 200 ms; stock trading does not.

Myth: real-time means zero latency.

Reality: real-time means latency below the perception threshold for a specific use case.

The physics of networking does not allow zero latency. 'Real-time' is when the user does not notice the delay. For chat that is 200 ms, for games 50 ms, for trading microseconds.

What is the key difference between real-time and classical HTTP?

Polling vs Push: two data delivery models

Consider waiting for a parcel. **Polling** - every 5 minutes someone walks to the door to check. **Push** - the courier rings the bell directly. The efficiency difference is obvious.

**Long Polling** is a compromise. The client sends a request, but the server **does not reply immediately** - it holds the connection open until data appears (or a timeout expires).
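The "hold the request until data appears or a timeout expires" behaviour can be sketched with `asyncio`. The `LongPollChannel` class is a hypothetical stand-in for one client's pending HTTP request. No real HTTP server is involved, and the timeouts are arbitrary values chosen for the demo.

```python
import asyncio
from typing import Optional


class LongPollChannel:
    """Hypothetical per-client channel that emulates a long-poll endpoint."""

    def __init__(self) -> None:
        self._queue = asyncio.Queue()

    def publish(self, message: str) -> None:
        self._queue.put_nowait(message)

    async def poll(self, timeout: float) -> Optional[str]:
        # Do not reply immediately: wait for data or for the timeout.
        try:
            return await asyncio.wait_for(self._queue.get(), timeout)
        except asyncio.TimeoutError:
            return None  # empty response; the client would re-issue the request


async def demo() -> tuple:
    channel = LongPollChannel()
    # Case 1: nothing happens, so the request is held and then times out.
    empty = await channel.poll(timeout=0.05)
    # Case 2: data arrives while the request is being held.
    asyncio.get_running_loop().call_later(0.01, channel.publish, "hello")
    delivered = await channel.poll(timeout=1.0)
    return empty, delivered


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The second call returns as soon as the message is published rather than after the full timeout, which is where long polling gets its near-network latency.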

| Approach | Latency | Server load | When to use |
|---|---|---|---|
| Short Polling | 0..interval (avg = interval/2) | High (empty requests) | Infrequent updates, simple API |
| Long Polling | ~network latency | Medium (open connections) | When WebSocket is unavailable |
| WebSocket | ~network latency | Low (one connection) | Chat, games, live data |
| SSE (Server-Sent Events) | ~network latency | Low (unidirectional) | Notifications, feeds, dashboards |

**Rule of thumb:** if data updates more than once every 10 seconds - polling will not cut it. Use push.
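The cost of short polling is simple arithmetic, sketched here with two hypothetical helper functions. The example reuses the figures from the opening story (500 million clients polling every 5 seconds) and assumes every client polls at exactly the stated interval.

```python
def polling_requests_per_minute(clients: int, interval_s: float) -> float:
    """Each client sends 60 / interval requests per minute."""
    return clients * (60 / interval_s)


def average_polling_delay_s(interval_s: float) -> float:
    """An event lands at a random point in the interval, so on average
    it waits half an interval before the next poll picks it up."""
    return interval_s / 2


if __name__ == "__main__":
    print(polling_requests_per_minute(500_000_000, 5))  # 6000000000.0
    print(average_polling_delay_s(5))                   # 2.5
```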

Myth: long polling solves all the problems of polling.

Reality: long polling is a compromise that introduces its own problems. Every client holds an open HTTP connection.

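With 50,000 clients, long polling creates 50,000 hanging HTTP connections, and each one has to be re-issued with full HTTP headers after every response or timeout. This consumes memory and file descriptors on the server. WebSocket also keeps one connection per client, but after the handshake it exchanges small frames without repeating HTTP headers.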
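For reference, the WebSocket handshake itself is a single HTTP upgrade request. The server proves it understood the request by hashing the client's `Sec-WebSocket-Key` with a fixed GUID defined in RFC 6455. The sketch below covers only that one step. The function name is an assumption, and the sample key is the one used in the RFC's own example.

```python
import base64
import hashlib

# Fixed GUID from RFC 6455, appended to the client's key before hashing.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a client key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Sample key taken from the RFC 6455 handshake example.
    print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Once this exchange completes, the connection stops speaking HTTP and both sides send framed messages over the same TCP socket.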

A chat app with 10,000 users uses short polling every 2 seconds. How many HTTP requests per minute does the server receive?

Latency Budget: allocating allowed delay

When a user sends a chat message, it travels a path: client → network → server (processing) → network → recipient. Each hop adds latency. A **latency budget** breaks the allowed end-to-end delay down by component.

| Use case | Allowed latency | Server budget | Network budget |
|---|---|---|---|
| Typing indicator | < 100 ms | 10 ms (relay only) | 40 ms (2 hops) |
| Chat message | < 200 ms | 40 ms (validate + store) | 40 ms |
| Notification | < 2 s | 500 ms (generate + route) | 200 ms |
| Live dashboard | < 1 s | 200 ms (aggregate) | 100 ms |
| Multiplayer game | < 50 ms | 10 ms (game loop tick) | 20 ms |
| HFT trading | < 1 ms | 0.1 ms | 0.5 ms (colocation) |

A latency budget is a design tool. It helps identify **where to optimize**. If the network consumes 80% of the budget, optimizing server code is pointless - the architecture itself must be reconsidered (CDN, edge computing, colocation).

  1. **Define the allowed latency** from product requirements and UX research
  2. **Break it down by component:** client → network → server → network → client
  3. **Measure actual values** for each component (do not guess!)
  4. **Find the bottleneck** - the component that consumes the most of the budget
  5. **Optimize the bottleneck** or reconsider the architecture
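The five steps above can be sketched as a small check. The function name and the measured values are hypothetical, chosen only to illustrate the arithmetic. Real numbers would come from tracing or metrics.

```python
def find_bottleneck(measured_ms: dict, allowed_ms: float) -> dict:
    """Compare measured per-component latency against the allowed total
    and report which component consumes the largest share."""
    total = sum(measured_ms.values())
    bottleneck = max(measured_ms, key=measured_ms.get)
    return {
        "total_ms": total,
        "within_budget": total <= allowed_ms,
        "bottleneck": bottleneck,
        "share": measured_ms[bottleneck] / total,
    }


if __name__ == "__main__":
    # Hypothetical measurements for a chat message with a 200 ms budget.
    measured = {"client": 20, "network_out": 45, "server": 120, "network_in": 45}
    report = find_bottleneck(measured, allowed_ms=200)
    print(report["total_ms"], report["within_budget"], report["bottleneck"])
    # 230 False server
```

In this made-up case the server takes slightly more than half of the total, so server-side work is the first place to look. If a network hop dominated instead, the fix would be architectural.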

**P99, not average!** An average latency of 50 ms sounds great. But if 1% of requests take 5 seconds, every hundredth user suffers. With 1 million users that is 10,000 people. Always design the budget around P99 (99th percentile).
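A short sketch shows how an average hides the tail. The sample set is synthetic: 980 requests at 50 ms and 20 requests at 5 s. The `percentile` helper uses the nearest-rank method, which is one of several common percentile definitions.

```python
import math
import statistics


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic data: 98% fast requests, 2% very slow ones.
latencies_ms = [50.0] * 980 + [5000.0] * 20

if __name__ == "__main__":
    print(statistics.mean(latencies_ms))  # 149.0
    print(percentile(latencies_ms, 50))   # 50.0
    print(percentile(latencies_ms, 99))   # 5000.0
```

The mean and the median both look tolerable, while P99 exposes the five-second requests that real users actually hit.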

Why Discord switched from Go to Rust

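Discord's Read States service, written in Go, showed latency spikes roughly every two minutes, caused by garbage collection. For a service that sits on the hot path of connections and messages, a periodic spike like that eats straight into the latency budget. Discord rewrote the service in Rust (no GC) and the spikes disappeared. The latency budget drove the language choice.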

The allowed latency for chat is 200 ms. Network (RTT) takes 80 ms, client rendering takes 20 ms. How much is left for the server?

Map of real-time use cases

Real-time is not a single technology. It is a spectrum of tasks with different requirements for latency, reliability, and scale. Understanding the map of use cases is what guides technology selection for each task.

| Use case | Latency | Direction | Technology | Examples |
|---|---|---|---|---|
| Messaging / Chat | < 200 ms | Bidirectional | WebSocket | Telegram, Slack, WhatsApp |
| Typing indicators | < 100 ms | Bidirectional | WebSocket | 'Typing...' in messengers |
| Notifications | < 3 s | Server → Client | SSE / Push API | Likes, comments, alerts |
| Live dashboard | < 1 s | Server → Client | SSE / WebSocket | Grafana, trading terminals |
| Collaborative editing | < 100 ms | Bidirectional | WebSocket + CRDT/OT | Google Docs, Figma |
| Multiplayer games | < 50 ms | Bidirectional | WebSocket / UDP | Fortnite, CS2 |
| Live location | < 2 s | Bidirectional | WebSocket | Uber, Yandex Taxi |
| Stock trading | < 1 ms | Bidirectional | Custom TCP / FPGA | NASDAQ, exchanges |
| Live streaming | < 5 s | Server → Client | WebRTC / HLS | Twitch, YouTube Live |

Notice the **Direction** column. Not all use cases require a bidirectional channel. Notifications and dashboards are **server → client** only. For these, SSE is simpler and sufficient. WebSocket is needed only when the client actively sends data too.
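Part of what makes SSE simple is its wire format: plain text lines over a long-lived HTTP response served as `text/event-stream`, with a blank line ending each event. The helper below is a hypothetical sketch of formatting one event. The event name and payload are made up for illustration.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as several consecutive data: lines.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(format_sse('{"likes": 42}', event="like", event_id="7"), end="")
```

A server would write strings like this to the open response whenever something changes. In the browser, the `EventSource` API parses them and reconnects automatically.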

Each use case imposes different requirements not only on latency but also on **delivery guarantees**:

  • At-most-once — Typing indicators, cursor positions. Losing one event is imperceptible - the next update will correct it.
  • At-least-once — Notifications, feed events. Better to show a duplicate than to lose an important notification.
  • Exactly-once — Payments, chat messages. Duplicates are a problem (double charge). Requires idempotency and acknowledgements.
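In practice, exactly-once behaviour is usually approximated as at-least-once delivery plus an idempotent receiver. The sketch below assumes every message carries a unique ID. The `IdempotentConsumer` class is hypothetical, and its in-memory set stands in for what would normally be a persistent store.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by its message ID."""

    def __init__(self) -> None:
        self._seen_ids = set()   # in production: a durable store with expiry
        self.applied = []

    def handle(self, message_id: str, payload: str) -> bool:
        if message_id in self._seen_ids:
            # Redelivery after a lost acknowledgement: ack again, do not re-apply.
            return False
        self._seen_ids.add(message_id)
        self.applied.append(payload)
        return True


if __name__ == "__main__":
    consumer = IdempotentConsumer()
    print(consumer.handle("m1", "charge $10"))  # True
    print(consumer.handle("m1", "charge $10"))  # False (duplicate ignored)
    print(consumer.applied)                     # ['charge $10']
```

The sender can then retry freely until it receives an acknowledgement, because a duplicate costs a lookup rather than a double charge.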

**Start with the simplest solution.** SSE covers 80% of tasks (notifications, feeds, dashboards). WebSocket covers the remaining 20% (chat, games, collaborative editing). Custom UDP - isolated cases (HFT, FPS shooters).

For a live dashboard that updates charts once per second, the best fit is:

Key Lesson Ideas

  • Real-time - the server pushes data to the client without waiting for a request
  • Short polling - simple but wasteful: when updates are rare, almost every request returns nothing
  • Long polling - a compromise, but every client holds a connection open
  • WebSocket - full-duplex channel, the standard for bidirectional real-time communication
  • SSE - a simple unidirectional stream, ideal for notifications and dashboards
  • Latency budget - breaking the allowed delay into components (network, server, client)
  • Always design around P99, not the average - the tail of the distribution kills UX

What's next

This lesson covered why real-time is needed and what problems it solves. The next step is to explore the specific protocols and how they work internally.

  • WebSocket protocol — The main bidirectional real-time protocol
  • Server-Sent Events — Unidirectional server push
  • Pub/Sub pattern — Scaling real-time across multiple servers

Questions for reflection

  • What real-time features do popular apps (messengers, live dashboards, collaborative editors) have? What technology is each likely using?
  • When adding real-time notifications to an existing REST API, which approach fits best and why?
  • How does the latency budget change when users are distributed across multiple continents?

Related lessons

  • rt-02-http-limits — HTTP limitations discussed next explain why real-time architectures were invented
  • bt-01-overview — Real-time protocols are a specialization of the transport overview covered in backend-transport
  • st-01-feedback-loops — WebSocket creates a closed feedback loop; polling is an open loop with high latency
  • alg-01-big-o — Push O(1) vs polling O(n) is a direct application of complexity analysis to protocol choice
  • sd-01-intro — Real-time requirements appear in System Design estimation: QPS, connection counts, fan-out
  • net-21-http-basics
  • net-63-realtime-compare