Real-Time Backend
Design: Google Docs
50 people edit one document at the same time. Everyone sees the others' changes in real time. No conflicts, no 'somebody saved over my work'. How does this work under the hood, and why is it one of the hardest problems in distributed systems?
- Google Docs handles billions of operations a day: every keystroke becomes a transaction in a distributed system
- Figma's multiplayer engine takes a CRDT-inspired approach rather than OT, although its server remains the central authority on document state
- Notion, Linear, Coda, Quip are all built on similar principles: operation log + WebSocket + ephemeral presence
- Microsoft Word Online and Apple iWork use a similar architecture. Collaborative editing has become a commodity feature, but the infrastructure behind it is far from trivial
Google Docs architecture
In 2006 Google bought the startup Writely for USD 1M and rewrote it from scratch. The result is a system where 50+ users edit a single document at the same time, and conflicts are not merely resolved but prevented at the protocol level.
The core of the architecture is **Operational Transformation (OT)**. Every keystroke becomes an operation: `insert(pos, char)` or `delete(pos)`. The client applies the operation locally right away (optimistic update), then sends it to the server. The server serializes all incoming operations into a global log and returns transformed versions to the other clients.
The service splits into three components. **Docs Frontend** is a stateless Node.js tier that handles only HTTP/WebSocket connections. **Docs Backend** consists of stateful Go services that hold the in-memory document state and the operation queue. **Storage** is Bigtable for the current state plus Cloud Spanner for metadata and access rights.
Google uses a **channel-based model**: every open document is a channel with a unique ID. The WebSocket connection is bound to a channel. If the server crashes, the client reconnects, fetches a diff from the last revision, and resumes work without losing data.
Client A inserts 'X' at position 3, client B inserts 'Y' at position 3 at the same time. Both work against revision 5. What does the OT server do?
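One way to reason about this scenario is a minimal transform function for two concurrent inserts. This is a sketch under stated assumptions, not Google's actual implementation: the `Insert` type, the `client` field, and the tie-break rule (lower client ID goes first) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int     # index in the revision the op was generated against
    char: str
    client: str  # hypothetical client ID, used only to break position ties


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has been applied."""
    goes_first = against.pos < op.pos or (
        against.pos == op.pos and against.client < op.client
    )
    if goes_first:
        # `against` landed at or before our position: shift right by one.
        return Insert(op.pos + 1, op.char, op.client)
    return op


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.char + doc[op.pos:]


doc = "abcdef"  # stands in for revision 5
a = Insert(3, "X", client="A")
b = Insert(3, "Y", client="B")

# Server order: A arrives first, so B is transformed against A.
server = apply_op(apply_op(doc, a), transform(b, a))
# Client B applied its own op optimistically, then receives A transformed against B.
client_b = apply_op(apply_op(doc, b), transform(a, b))
```

Both paths end with the same text because the tie-break is deterministic: every participant agrees that A's character sits before B's, regardless of which operation it saw first.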
CRDT vs OT: evolution of the approach
OT works, but it has a sore spot: the centralized server must serialize every operation. During a network outage the client cannot keep working autonomously, because without the server the versions diverge. That is why **CRDTs (Conflict-free Replicated Data Types)** appeared.
A CRDT mathematically guarantees that any two nodes receiving the same set of operations, in any order, will reach the same result. This is what makes offline-first editing and peer-to-peer sync without a central arbiter possible. Figma's multiplayer engine borrows from CRDTs rather than OT, although it keeps a central server as the authority.
For text documents people use **Logoot** or **LSEQ**, algorithms based on fractional indexing. Each character receives a unique fractional ID between its neighbors. Inserting 'X' between characters with IDs 0.5 and 0.6 gets ID 0.55, and it never conflicts with a parallel insertion of 'Y' with ID 0.57. If two replicas happen to pick the same fraction, a site (replica) identifier attached to the ID breaks the tie, so IDs stay unique and totally ordered.
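The fractional-ID idea can be sketched in a few lines. This is a simplified illustration, not Logoot or LSEQ themselves (those use variable-length digit lists rather than plain fractions). The helper names `id_between` and `render` are hypothetical, and the example deliberately makes both sites pick the same midpoint to show why a site ID is part of the key.

```python
from fractions import Fraction


def id_between(left: Fraction, right: Fraction, site: str):
    """Return a key between two neighbours; the site ID keeps keys unique."""
    return ((left + right) / 2, site)


def render(chars: dict) -> str:
    """Order characters by (fraction, site) and join them into text."""
    return "".join(ch for _, ch in sorted(chars.items()))


base = {
    (Fraction(1, 2), "s0"): "a",  # ID 0.5
    (Fraction(3, 5), "s0"): "b",  # ID 0.6
}

# Two sites insert concurrently between the same neighbours.
x_key = id_between(Fraction(1, 2), Fraction(3, 5), site="A")  # 11/20 = 0.55
y_key = id_between(Fraction(1, 2), Fraction(3, 5), site="B")  # also 11/20

# Replica 1 receives X then Y; replica 2 receives Y then X.
replica1 = dict(base)
replica1[x_key] = "X"
replica1[y_key] = "Y"

replica2 = dict(base)
replica2[y_key] = "Y"
replica2[x_key] = "X"
```

Because the document is just a set of keyed characters sorted by ID, the order in which inserts arrive does not matter, which is the commutativity property the text describes.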
- OT (Google Docs, 2006) — Requires a central arbiter server. Operations are transformed pairwise. Complex to implement (Jupiter algorithm). No offline: without the server conflicts cannot be resolved.
- CRDT (Figma, Notion, Linear) — No central arbiter. Operations are mathematically commutative. Simpler merge implementation. Full offline-first. Downside: tombstones bloat the document.
Notion has used a CRDT for its block editor since 2022. Linear uses CRDTs for real-time issue sync. Automerge and Yjs are popular open-source CRDT libraries that editors are built on. ProseMirror + Yjs is a common stack for collaborative editing.
A user works in Google Docs while offline. Operations pile up locally. When the connection returns, which approach handles the accumulated changes correctly?
Cursors and Presence
Colored cursors of other users are one of the most recognizable features of Google Docs. Behind them sits a separate ephemeral channel, completely independent of the document sync channel.
Presence data (cursor position, selection, user name) is **ephemeral state**: it is not stored in the database, is not part of the revision log, and needs no consistency guarantees. It travels over a separate WebSocket channel with a 3 to 5 second TTL. No explicit 'user left' message is required: when a client stops sending updates, its cursor expires automatically.
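A TTL-based presence store can be sketched as an in-memory map with expiry times. The class name `PresenceStore`, its methods, and the 5-second default are assumptions for illustration. The clock is passed in explicitly so the behaviour is easy to test.

```python
class PresenceStore:
    """In-memory presence with per-entry expiry; nothing is persisted."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl = ttl_seconds
        self._entries = {}  # user_id -> (payload, expires_at)

    def update(self, user_id: str, payload: dict, now: float) -> None:
        # Every update refreshes the expiry time.
        self._entries[user_id] = (payload, now + self.ttl)

    def visible(self, now: float) -> dict:
        # Drop expired entries lazily, then return what is still live.
        self._entries = {
            user: entry for user, entry in self._entries.items() if entry[1] > now
        }
        return {user: entry[0] for user, entry in self._entries.items()}


store = PresenceStore(ttl_seconds=5.0)
store.update("alice", {"pos": 100}, now=0.0)
```

With this shape, a crashed or disconnected client needs no cleanup path: its entry simply stops being refreshed and drops out of `visible()` once the TTL passes.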
Problem: when another user inserts text above the cursor, the cursor position drifts. It has to be transformed the same way as OT operations. In a CRDT this is easier: the cursor is bound to a character by ID, not to an absolute position.
Figma solved presence by binding to objects: every object on the canvas has a UUID, the cursor binds to UUID + local offset. Moving other objects does not push other people's cursors around. Notion uses a similar scheme for block-level cursors: the cursor points to a blockId, not to a character position.
- **Throttling**: presence updates no more often than every 50 ms (20 fps) to avoid flooding WebSocket traffic during fast typing
- **Color assignment**: deterministic hash of userId to HSL color, so the same user always appears in the same color
- **Away detection**: 30 seconds with no activity and the cursor goes semi-transparent, 5 minutes and it disappears
- **Reconnect**: on reconnection the client immediately sends a presence update, it does not wait for the next input
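The throttling and color rules in the list above can be sketched as follows. The function and class names, the use of SHA-256, and the fixed saturation and lightness values are illustrative assumptions; any stable hash would serve the same purpose.

```python
import hashlib


def user_color(user_id: str) -> str:
    """Map a user ID to a stable HSL color via a deterministic hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    hue = int.from_bytes(digest[:2], "big") % 360
    return f"hsl({hue}, 70%, 50%)"


class Throttle:
    """Allow at most one presence update per `min_interval_ms`."""

    def __init__(self, min_interval_ms: int = 50):
        self.min_interval_ms = min_interval_ms
        self._last_sent_ms = None

    def allow(self, now_ms: int) -> bool:
        if (
            self._last_sent_ms is None
            or now_ms - self._last_sent_ms >= self.min_interval_ms
        ):
            self._last_sent_ms = now_ms
            return True
        return False


throttle = Throttle(min_interval_ms=50)
decisions = [throttle.allow(t) for t in (0, 20, 50, 99, 100)]
```

Python's built-in `hash()` is avoided here on purpose: it is randomized per process for strings, so it would not give the same color across servers. A production throttle would typically also send the last suppressed update once the interval elapses, so the final cursor position is not lost.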
User B sees the document: cursor A is at position 100. User A inserts 10 characters at position 50. What happens to A's cursor position on B's screen?
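The scenario can be worked through with a small position-transform helper. This is a sketch for absolute-position cursors, as in the OT case; the function names are hypothetical, and the choice to shift the cursor when text is inserted exactly at its position is an assumption.

```python
def shift_for_insert(cursor: int, insert_pos: int, insert_len: int) -> int:
    """Move a cursor right when text is inserted at or before it."""
    if insert_pos <= cursor:
        return cursor + insert_len
    return cursor


def shift_for_delete(cursor: int, delete_pos: int, delete_len: int) -> int:
    """Move a cursor left when text before it is deleted."""
    if delete_pos >= cursor:
        return cursor
    # If the deleted range swallows the cursor, clamp to the start of the range.
    return max(delete_pos, cursor - delete_len)


# Cursor at 100, 10 characters inserted at position 50.
new_cursor = shift_for_insert(100, insert_pos=50, insert_len=10)
```

Under these rules the cursor on B's screen moves from 100 to 110, so it stays attached to the same character rather than to the same offset.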
Version History and Permissions
Google Docs version history is stored not as snapshots of the document but as an **operation log**, a chain of every OT operation with its timestamp and authorId. Restoring a version means replaying the operations up to the chosen moment. This saves storage: an operation log is 10 to 100 times smaller than a set of snapshots.
But keeping every keystroke forever is wasteful. Google applies **compaction**: operations from a single session (continuous work without long pauses) are squashed into one version entry. Named versions (the ones the user saved by hand) are never compacted.
The permissions system is built around **four roles**: Owner, Editor, Commenter, Viewer. Rights live in Cloud Spanner (strong consistency matters here so that a revocation takes effect immediately in every region). When rights change, the server drops the WebSocket connection for downgraded users.
Permissions are checked twice: at WebSocket upgrade (coarse check) and on every operation (exact check). This protects against a race condition: a user opens the document with editor rights, the rights are revoked 5 seconds later, and the next operation is rejected by the per-operation check even though the session was opened with valid rights.
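Session-based compaction can be sketched as grouping the log by pauses. The `Op` type, the `named` flag, and the 5-minute gap threshold are assumptions made for illustration; the source does not state how long a pause ends a session.

```python
from dataclasses import dataclass


@dataclass
class Op:
    ts: float            # seconds since some epoch
    named: bool = False  # True if the user saved a named version at this op


def group_sessions(ops, max_gap: float = 300.0):
    """Split a time-ordered op log into sessions (lists of ops)."""
    sessions = []
    for op in ops:
        start_new = (
            not sessions
            or op.ts - sessions[-1][-1].ts > max_gap  # long pause: new session
            or sessions[-1][-1].named                 # named version closes a session
        )
        if start_new:
            sessions.append([op])
        else:
            sessions[-1].append(op)
    return sessions


log = [Op(0), Op(10), Op(20, named=True), Op(30), Op(1000)]
sessions = group_sessions(log)
```

Each resulting group would become one entry in the version list. Ending a group at every named version keeps that exact point restorable even after the surrounding keystrokes are squashed.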
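The two-stage check can be sketched with a toy in-memory ACL standing in for the strongly consistent store. All names here (`Acl`, `open_session`, `submit_op`, `AccessDenied`) and the role ranking are hypothetical; the point is only that `submit_op` consults current rights rather than trusting the session.

```python
ROLE_RANK = {"viewer": 0, "commenter": 1, "editor": 2, "owner": 3}


class AccessDenied(Exception):
    pass


class Acl:
    """Toy stand-in for the permissions store."""

    def __init__(self):
        self._roles = {}

    def grant(self, user: str, role: str) -> None:
        self._roles[user] = role

    def role_of(self, user: str):
        return self._roles.get(user)


def can(acl: Acl, user: str, needed: str) -> bool:
    role = acl.role_of(user)
    return role is not None and ROLE_RANK[role] >= ROLE_RANK[needed]


def open_session(acl: Acl, user: str) -> dict:
    # Coarse check at WebSocket upgrade: may this user see the document at all?
    if not can(acl, user, "viewer"):
        raise AccessDenied(user)
    return {"user": user}


def submit_op(acl: Acl, session: dict, op: str) -> str:
    # Exact check on every operation, against the current rights.
    if not can(acl, session["user"], "editor"):
        raise AccessDenied(session["user"])
    return op
```

For example, if `bob` opens a session as an editor and is then downgraded to viewer, the next `submit_op` call raises `AccessDenied` even though the session object is unchanged.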
| Component | Storage | Reason |
|---|---|---|
| Current document | Bigtable | High throughput, low latency |
| Operation log | Bigtable | Append-only, range scan by revisionId |
| Metadata / permissions | Cloud Spanner | Strong consistency, ACID transactions |
| Presence / cursors | Redis Pub/Sub | Ephemeral, no persistence needed |
| Attachments / images | Google Cloud Storage | Blob storage, CDN |
A common misconception: Google Docs stores a snapshot of the document on every change, and that is where the version history comes from.
In fact, version history is built on the operation log (a chain of OT operations). Snapshots are created rarely, only to speed up restore, not as the primary mechanism for storing versions.
An operation log is 10 to 100 times more compact than a set of snapshots. For a 100 KB document with 10,000 changes, per-change snapshots take about 1 GB (100 KB × 10,000), while the operation log takes 10 to 50 MB. On top of that, the log gives detailed blame: who changed what and when.
A document was created a year ago, the log has 50,000 operations. The user wants to view the version from three months ago. How does the system find that state efficiently?
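One plausible answer combines the occasional snapshots with a binary search: find the revision for the requested time, load the nearest snapshot at or before it, and replay only the remaining operations. The sketch below assumes a toy op format of `(kind, pos, char)` tuples and a snapshot at revision 0; the function names are hypothetical.

```python
import bisect


def apply_op(doc: str, op) -> str:
    kind, pos, char = op
    if kind == "insert":
        return doc[:pos] + char + doc[pos:]
    return doc[:pos] + doc[pos + 1:]  # "delete": remove one character at pos


def revision_at(timestamps, t) -> int:
    """timestamps[i] is when revision i+1 was created; return the revision at time t."""
    return bisect.bisect_right(timestamps, t)


def restore(log, snapshots, target_rev: int) -> str:
    """log[i] produces revision i+1; snapshots maps revision -> full text."""
    snap_revs = sorted(snapshots)
    # Nearest snapshot at or before the target revision.
    base_rev = snap_revs[bisect.bisect_right(snap_revs, target_rev) - 1]
    doc = snapshots[base_rev]
    for op in log[base_rev:target_rev]:
        doc = apply_op(doc, op)
    return doc


log = [("insert", i, c) for i, c in enumerate("hello")] + [("delete", 0, None)]
timestamps = [10, 20, 30, 40, 50, 60]
snapshots = {0: "", 4: "hell"}
```

With a snapshot every few thousand operations, restoring a three-month-old version of a 50,000-operation log replays at most one snapshot interval instead of the whole history.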
Takeaways
- **OT (Operational Transformation)**: every keystroke = an operation, the server transforms competing operations and serializes them into a global log
- **CRDT**: mathematically guaranteed convergence without a central arbiter, operations are commutative, offline-first out of the box
- **Presence / cursors**: an ephemeral channel fully separate from document sync, TTL 3 to 5 seconds, not persisted
- **Version history** = operation log + occasional snapshots for fast restore, replay operations rather than storing snapshots on every change
- **Permissions** in Cloud Spanner (strong consistency), checked on every operation, not only on session open
Related topics
Google Docs sits at the crossroads of several core areas of distributed systems
- WebSocket and SSE — transport layer for operations and presence
- CAP Theorem — choosing between consistency and availability during a network partition
- Event Sourcing — an operation log is a special case of event sourcing, state = replay of events
- Redis Pub/Sub — broadcast mechanism for presence updates between servers
Questions for reflection
- If you had to pick between OT and CRDT for a new collaborative editor, what questions would you ask the customer to make the right call?
- Presence data (cursors) is not saved to the database. What other data in real applications is ephemeral and why does it not need to be persisted?
- Permissions are checked on every operation, not only on session open. That is overhead. How could this check be optimized without losing security?