Real-Time Backend
Design: Multiplayer Game
Fortnite, 2020: more than 12M players online at the same time. Each of them moves, shoots, builds, and sees a consistent picture of the world. How does this work without desyncs and rampant cheating?
- Fortnite: 350M registered players, peak 12M concurrent. Epic built the authoritative server infrastructure on AWS with regional game server clusters
- Valve Counter-Strike: on a 128-tick server the world is snapshotted and deltas are sent to every connected player every 7.8 ms
- Riot Games LoL: matchmaking processes millions of requests at peak hours with SBMM based on a TrueSkill-like algorithm with an uncertainty factor
- Dead by Daylight: rollback netcode allows comfortable play even when player ping differs by up to 150 ms without the 'rubber banding' feel
Authoritative Server Architecture
Fortnite peaked at 12M concurrent players. One architectural bet sits behind that number: the **authoritative server model**. The server is the single source of truth about game state. The client sends only intentions (inputs) and receives results back. No trust in the client.
The alternative, peer-to-peer (P2P), was used in early RTS and fighting games. Each client computes physics locally and syncs deltas. The problem: cheats are trivial to implement by modifying the local client, and state desync is almost inevitable on flaky connections.
Valve's Source Engine runs at 66 ticks per second on most servers, and competitive Counter-Strike servers have run at up to 128 ticks. Tick rate is how often the server processes incoming commands and broadcasts state updates. At 128 ticks the server takes a snapshot of the world every 1000 / 128 = 7.8 ms.
For League of Legends, Riot Games picked a deliberately long tick interval of about 150 ms (roughly 6 to 7 ticks/s). A MOBA with an isometric view is more tolerant of latency than a first-person shooter. Riot saves server resources and lowers the network bar for players in high-ping regions.
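The relationship between tick rate and tick interval can be sketched as a fixed-timestep clock. This is a minimal illustration, not how any particular engine implements its loop. The class name `FixedTickClock` and its methods are hypothetical.

```python
class FixedTickClock:
    """Turns elapsed wall-clock time into a whole number of simulation ticks."""

    def __init__(self, tick_rate_hz: float) -> None:
        self.interval_ms = 1000.0 / tick_rate_hz  # e.g. 128 Hz -> 7.8125 ms
        self.tick = 0                             # ticks simulated so far
        self._accumulator_ms = 0.0                # time not yet turned into ticks

    def advance(self, elapsed_ms: float) -> int:
        """Add elapsed time and return how many ticks are now due."""
        self._accumulator_ms += elapsed_ms
        due = int(self._accumulator_ms // self.interval_ms)
        self._accumulator_ms -= due * self.interval_ms
        self.tick += due
        return due
```

A server loop would call `advance()` with the measured frame time and run the simulation step once per returned tick, so the simulation rate stays constant even if the loop itself runs unevenly.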
- **Authoritative server**: the server runs physics, checks collisions, applies damage
- **Client-side prediction**: the client predicts the result of its own input locally to hide perceived latency
- **Server reconciliation**: the client corrects the prediction when the server's snapshot arrives
- **Lag compensation**: the server rewinds state to the moment of the shot to compensate for the player's network latency
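The last item, lag compensation, can be sketched as a snapshot history that the server rewinds when a shot arrives. This is a simplified 2D illustration under stated assumptions: `LagCompensator`, its methods, the circular hit area, and the idea that the caller already knows the shooter's one-way latency are all hypothetical choices for the example.

```python
import math
from bisect import bisect_right


class LagCompensator:
    """Keeps a short history of positions so hits can be checked in the past."""

    def __init__(self, history_ms: int = 1000) -> None:
        self.history_ms = history_ms
        self._times = []      # snapshot timestamps, ascending
        self._positions = []  # per snapshot: player id -> (x, y)

    def record(self, time_ms, positions):
        """Store the world state for this tick and drop snapshots that are too old."""
        self._times.append(time_ms)
        self._positions.append(dict(positions))
        while self._times[0] < time_ms - self.history_ms:
            self._times.pop(0)
            self._positions.pop(0)

    def rewind(self, time_ms):
        """Return the latest snapshot at or before time_ms (the oldest one otherwise)."""
        if not self._times:
            raise ValueError("no snapshots recorded")
        index = max(bisect_right(self._times, time_ms) - 1, 0)
        return self._positions[index]

    def is_hit(self, target_id, aim_point, now_ms, shooter_latency_ms, hit_radius=0.5):
        """Check the shot against where the target was when the shooter fired."""
        past = self.rewind(now_ms - shooter_latency_ms)
        target = past.get(target_id)
        return target is not None and math.dist(target, aim_point) <= hit_radius
```

The server still makes the decision. It only evaluates the shot against the world the shooter actually saw, rather than the world as it is when the packet arrives.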
Why is the client in the authoritative server model not allowed to apply damage to another player on its own?
Lobby System
A lobby is a waiting room before the match. The task looks simple: gather a group of players, let the leader pick settings, start the match. In practice it is a separate stateful microservice with non-trivial consistency requirements.
Steam handles millions of lobbies at once through its Lobby API. Each lobby stores a list of participants, metadata (map, mode, privacy), and status (waiting/ready/starting). The key choice is where to store this state.
- Redis (in-memory) — Low read/write latency (< 1 ms). Ideal for volatile lobby data, which lives 5 to 15 minutes. TTL automatically cleans up abandoned lobbies. Risk: on a Redis restart data is lost, so you need persistence or replication.
- PostgreSQL — ACID guarantees, no race conditions when two players join at the same time. More expensive on latency (5 to 20 ms), but more reliable for financial operations (paid lobbies, wagering). Overkill for free casual games.
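The TTL behaviour that makes Redis attractive for lobbies can be illustrated with a small in-memory stand-in. This is not Redis client code. `TTLStore` is a hypothetical class that only mimics key expiry (in Redis itself, keys can be given a time to live, for example with the `EXPIRE` command), and the clock is passed in explicitly to keep the sketch deterministic.

```python
class TTLStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self) -> None:
        self._items = {}  # key -> (value, expires_at_s)

    def put(self, key, value, ttl_s, now_s):
        """Store a value that becomes invisible ttl_s seconds after now_s."""
        self._items[key] = (value, now_s + ttl_s)

    def get(self, key, now_s):
        """Return the value, or None if the key is missing or has expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at_s = entry
        if now_s >= expires_at_s:
            del self._items[key]  # lazy cleanup of an abandoned lobby
            return None
        return value
```

With a 15-minute TTL refreshed on every lobby update, abandoned lobbies disappear without a separate cleanup job.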
Dead by Daylight uses lobbies of size 5 (1 killer + 4 survivors). Each lobby member stores its character selection and ready status. The match starts only when all 5 are ready AND matchmaking has found a suitable game server. If no server is found within 30 seconds, the lobby returns to the waiting state.
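The ready and timeout rules described above can be sketched as a small state machine. This is an illustrative model, not Dead by Daylight's actual implementation. The class and method names are hypothetical, and resetting the ready flags on timeout is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class LobbyStatus(Enum):
    WAITING = "waiting"    # not everyone is ready yet
    READY = "ready"        # everyone is ready, waiting for a game server
    STARTING = "starting"  # server assigned, match is launching


@dataclass
class Lobby:
    size: int = 5
    server_timeout_s: float = 30.0
    status: LobbyStatus = LobbyStatus.WAITING
    ready: dict = field(default_factory=dict)  # player id -> ready flag
    ready_since_s: Optional[float] = None

    def join(self, player_id: str) -> None:
        if len(self.ready) >= self.size:
            raise ValueError("lobby is full")
        self.ready[player_id] = False

    def set_ready(self, player_id: str, is_ready: bool, now_s: float) -> None:
        if player_id not in self.ready:
            raise KeyError(player_id)
        if self.status is LobbyStatus.STARTING:
            return  # too late to change once the launch has begun
        self.ready[player_id] = is_ready
        if len(self.ready) == self.size and all(self.ready.values()):
            if self.status is LobbyStatus.WAITING:
                self.status, self.ready_since_s = LobbyStatus.READY, now_s
        else:
            self.status, self.ready_since_s = LobbyStatus.WAITING, None

    def tick(self, now_s: float) -> None:
        """Fall back to WAITING if no server was found in time."""
        if (self.status is LobbyStatus.READY
                and now_s - self.ready_since_s >= self.server_timeout_s):
            self.ready = {player: False for player in self.ready}  # assumption
            self.status, self.ready_since_s = LobbyStatus.WAITING, None

    def on_server_found(self, now_s: float) -> bool:
        """Launch only if the lobby is still READY when the server arrives."""
        self.tick(now_s)
        if self.status is not LobbyStatus.READY:
            return False
        self.status = LobbyStatus.STARTING
        return True
```

Because every transition goes through the lobby object, the launch decision does not depend on any single client staying connected.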
The critical lobby problem is **split-brain on disconnect**: the host loses the connection at the moment everyone is ready to launch, and the remaining clients disagree about whether the match started. There are two approaches: host migration (hand the host role to the next participant) or a server-authoritative lobby (the server makes the launch decision regardless of the host).
Why is lobby data often stored in Redis rather than PostgreSQL?
Matchmaking
Matchmaking solves the problem: find a set of players who will get a balanced and fair game with minimal wait time. It is multi-criteria optimization. The criteria often conflict.
- **Skill (MMR/Elo)**: the rating gap must not be catastrophic
- **Latency**: every participant's ping to the game server has to be acceptable (< 100 ms)
- **Wait time**: the player should not wait longer than 2 to 3 minutes
- **Party integrity**: a group of friends should land in the same match on the same side
- **Region**: players from the same region are preferred
Riot Games has publicly described its LoL matchmaking as an Elo-like MMR (Matchmaking Rating) system. The system first looks for opponents in a narrow rating window. If nothing turns up in 30 seconds, it widens the window. After 2 minutes the window is at its maximum.
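The widening window can be sketched as a step function of wait time. The timing (widen every 30 seconds, maximum at 2 minutes) follows the description above. The rating numbers, the `Ticket` type, and the function names are hypothetical, and a production matchmaker would also check the candidate's own window.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    player_id: str
    mmr: int
    enqueued_at_s: float


def mmr_window(wait_s: float, base: int = 100, step: int = 100,
               widen_every_s: float = 30.0, max_wait_s: float = 120.0) -> int:
    """Allowed rating gap: starts narrow, widens in steps, capped at max_wait_s."""
    steps = int(min(wait_s, max_wait_s) // widen_every_s)
    return base + steps * step


def find_opponent(ticket: Ticket, queue: list, now_s: float):
    """Pick the closest-rated candidate inside the searcher's current window."""
    window = mmr_window(now_s - ticket.enqueued_at_s)
    candidates = [other for other in queue
                  if other is not ticket and abs(other.mmr - ticket.mmr) <= window]
    return min(candidates, key=lambda other: abs(other.mmr - ticket.mmr), default=None)
```

Starting narrow gives well-matched players a chance to find each other first. The widening only kicks in for players the narrow search could not serve.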
Fortnite with 350M registered players uses regional matchmaking pools. North America, Europe, Asia are separate queues with separate server regions. Cross-region matches are possible only during off-peak hours when the pool is too small.
For skill-based matchmaking (SBMM) MMR freshness is critical. After every match the rating is recomputed. Modern systems use a Bayesian approach (TrueSkill by Microsoft): the rating is stored as a distribution (mean + sigma), not as a single number. That gives a more accurate estimate for players with few games.
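The idea of a rating as a distribution can be sketched with a simplified two-player update in the style of TrueSkill. This version ignores draws, teams, and the skill-drift term, so it is an illustration of the principle rather than the full algorithm. The default values (mean 25, sigma 25/3, beta 25/6) are the commonly published TrueSkill defaults. The function and class names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    mu: float = 25.0         # estimated skill
    sigma: float = 25.0 / 3  # uncertainty of that estimate

    @property
    def conservative(self) -> float:
        """Skill the system is fairly sure the player has at least."""
        return self.mu - 3 * self.sigma


def _pdf(x: float) -> float:
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def _cdf(x: float) -> float:
    return 0.5 * math.erfc(-x / math.sqrt(2))


def update(winner: Rating, loser: Rating, beta: float = 25.0 / 6):
    """Return (new_winner, new_loser) after a decisive 1v1 result."""
    c = math.sqrt(2 * beta ** 2 + winner.sigma ** 2 + loser.sigma ** 2)
    t = (winner.mu - loser.mu) / c
    v = _pdf(t) / _cdf(t)  # large when the result was a surprise
    w = v * (v + t)        # how much uncertainty the result removes

    def shrink(rating: Rating) -> float:
        return rating.sigma * math.sqrt(1 - (rating.sigma ** 2 / c ** 2) * w)

    new_winner = Rating(winner.mu + winner.sigma ** 2 / c * v, shrink(winner))
    new_loser = Rating(loser.mu - loser.sigma ** 2 / c * v, shrink(loser))
    return new_winner, new_loser
```

The size of each player's update scales with their own sigma squared, so a newcomer's rating moves quickly while a veteran with hundreds of games barely shifts after one result.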
Why does matchmaking widen the MMR window over time instead of starting with a wide window right away?
Game State Sync
The game server is up, the match has started. Now the task is to sync the world state across every participant with minimal latency. The naive approach is to broadcast a full snapshot every tick. For a Fortnite match with 100 players that is a disaster: assuming roughly 100 bytes of state per player and 66 ticks per second, 100 players * 100 bytes * 66 ticks = about 660 KB/s per client, and 100 times that in total server egress.
Valve Source Engine uses **delta compression**: the client only receives the state change relative to the last acknowledged snapshot. If a player's position did not change, it is not sent. This cuts traffic by 5 to 10 times in typical scenarios.
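A minimal sketch of the delta idea, assuming world state is a plain dictionary from entity id to its state. This shows the principle only. It is not the Source Engine wire format, and `make_delta` and `apply_delta` are hypothetical names.

```python
def make_delta(acked: dict, current: dict) -> dict:
    """Describe how to get from the last acknowledged snapshot to the current one."""
    changed = {entity: state for entity, state in current.items()
               if entity not in acked or acked[entity] != state}
    removed = [entity for entity in acked if entity not in current]
    return {"changed": changed, "removed": removed}


def apply_delta(base: dict, delta: dict) -> dict:
    """Rebuild the current snapshot on the client from its acknowledged base."""
    state = {entity: value for entity, value in base.items()
             if entity not in delta["removed"]}
    state.update(delta["changed"])
    return state
```

The server has to remember which snapshot each client last acknowledged, because a delta is only meaningful relative to a base that both sides hold.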
- **Full state snapshot**: send everything every tick. Simple but expensive on traffic
- **Delta updates**: only changes since the last ACK. Harder, but more efficient
- **Interest management**: send only what is within the player's view radius
- **Priority queue**: important events (hits, deaths) go first when the network is congested
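The last two items in the list can be sketched together: a radius filter for interest management and a priority pick for a congested link. The event kinds, the priority table, and the function names are hypothetical, and real engines usually use spatial grids rather than a linear scan.

```python
import heapq
import math

# Lower number = more important. Hypothetical event kinds for the example.
PRIORITY = {"death": 0, "hit": 1, "position": 2}


def visible_entities(viewer_pos, entities: dict, view_radius: float) -> dict:
    """Interest management: keep only entities within the viewer's radius."""
    return {entity: pos for entity, pos in entities.items()
            if math.dist(viewer_pos, pos) <= view_radius}


def pick_events(events: list, budget: int) -> list:
    """Priority queue: choose the most important events that fit this packet."""
    return heapq.nsmallest(budget, events, key=lambda event: PRIORITY[event[0]])
```

Both filters run per client, which is why they are usually applied before delta encoding: there is no point in diffing entities the player cannot see.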
**Rollback netcode** is a technique popularized by fighting games (Guilty Gear Strive, Street Fighter 6, Skullgirls). Instead of waiting for the opponent's input to arrive over the network, the client predicts the opponent's actions and simulates them locally. If the prediction turns out to be wrong, it rolls the state back and resimulates. Perceived input latency disappears, but visual glitches can show up when a prediction is corrected.
- Lockstep (delay-based) — Every participant waits for confirmation from every other participant before simulating the next tick. Deterministic outcome, no desync. The problem: one high-ping player slows everyone else down. Used in Starcraft, Age of Empires.
- Rollback netcode — Each client predicts the others' actions and simulates locally. When real data arrives, it rolls back and resimulates. Comfortable for players with different ping. Requires a fully deterministic and fast simulation. Used in fighting games and indie titles.
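The rollback approach can be sketched with a toy deterministic simulation. The assumptions here are loud ones: the "game" is just two integer positions, the remote input is predicted by repeating the last known one, and remote inputs arrive in frame order for frames that were already simulated. `RollbackSession` and `step` are hypothetical names.

```python
def step(state, local_input, remote_input):
    """Deterministic simulation step: state is (local_x, remote_x)."""
    return (state[0] + local_input, state[1] + remote_input)


class RollbackSession:
    def __init__(self) -> None:
        self.states = [(0, 0)]   # states[f] is the state at the start of frame f
        self.local_inputs = []
        self.remote_inputs = []  # confirmed for frames < confirmed, predicted after
        self.confirmed = 0       # number of frames with real remote input

    def advance(self, local_input: int) -> None:
        """Simulate the next frame now, guessing the remote input."""
        predicted = self.remote_inputs[-1] if self.remote_inputs else 0
        self.local_inputs.append(local_input)
        self.remote_inputs.append(predicted)
        self.states.append(step(self.states[-1], local_input, predicted))

    def on_remote_input(self, remote_input: int) -> bool:
        """Apply the real input for the oldest unconfirmed frame.

        Returns True if the prediction was wrong and a rollback happened.
        """
        frame = self.confirmed
        if frame >= len(self.local_inputs):
            raise ValueError("input for a frame that was not simulated yet")
        self.confirmed += 1
        if self.remote_inputs[frame] == remote_input:
            return False
        # Correct this frame and re-predict every later frame from it.
        for f in range(frame, len(self.remote_inputs)):
            self.remote_inputs[f] = remote_input
        del self.states[frame + 1:]  # roll back
        for f in range(frame, len(self.local_inputs)):  # resimulate
            self.states.append(
                step(self.states[-1], self.local_inputs[f], self.remote_inputs[f]))
        return True
```

The resimulation loop is why rollback needs a fast, deterministic simulation: several frames may have to be replayed inside a single rendered frame, and both peers must arrive at the same state.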
Misconception: client-side prediction means the client decides the outcome of actions and the server accepts that decision.
Reality: client-side prediction is only a local visualization of the assumed result. The server always validates it and may correct it.
If the server took client predictions as truth, any cheater could predict a teleport to any point on the map. Prediction exists purely to mask perceived latency in the UI.
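The split between prediction and validation can be sketched in one dimension. The client applies its own input immediately and keeps unacknowledged inputs, while the server clamps each move to a maximum step before applying it. `MAX_STEP`, `server_apply`, and `PredictingClient` are hypothetical names, and the movement rule is deliberately trivial.

```python
MAX_STEP = 1.0  # hypothetical maximum movement per input


def server_apply(position: float, move: float) -> float:
    """Authoritative rule: the server never moves a player further than MAX_STEP."""
    return position + max(-MAX_STEP, min(MAX_STEP, move))


class PredictingClient:
    def __init__(self) -> None:
        self.position = 0.0
        self.next_seq = 0
        self.pending = []  # (seq, move) sent but not yet acknowledged

    def send_input(self, move: float):
        """Predict locally right away and remember the input for replay."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, move))
        self.position += move
        return seq, move

    def reconcile(self, acked_seq: int, server_position: float) -> None:
        """Snap to the server's state, then replay inputs it has not seen yet."""
        self.pending = [(seq, move) for seq, move in self.pending if seq > acked_seq]
        self.position = server_position
        for _, move in self.pending:
            self.position += move
```

A tampered client that sends a move of 50 units sees itself teleport for a moment, but the next reconciliation pulls it back to the position the server actually accepted.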
What is lag compensation on the game server?
Takeaways
- **Authoritative server**: the client sends only inputs, the server runs everything and is the single source of truth. This is the primary defense against cheats that tamper with game state
- **Lobby**: a stateful waiting room in Redis with TTL. Holds participants, settings, ready statuses. Needs host migration or a server-authoritative launch decision to survive a leader disconnect
- **Matchmaking**: multi-criteria optimization (skill + latency + wait time). The MMR window expands dynamically with wait time
- **Game state sync**: delta compression cuts traffic by 5 to 10 times. Interest management filters out invisible entities. Lag compensation rewinds history for fair hit checks
Related topics
A multiplayer backend builds on the core concepts of real-time systems:
- WebSocket and realtime protocols — The game server uses persistent connections (WebSocket or UDP + QUIC) for low-latency state updates
- Distributed systems: consistency — The authoritative server is the single point of truth that resolves distributed consistency by centralization
- Load balancing and session affinity — Game sessions need sticky routing: every packet from a match must go to the same game server process
Questions for reflection
- Many games moved from P2P to an authoritative server as they scaled. What specific cheat and desync problems typically trigger that switch?
- With rollback netcode the client can see game objects jumping back when a prediction is corrected. How do modern games minimize that visual artifact?
- If a game server crashes mid-match, how do you organize a fast failover without losing the entire match's progress?