Real-Time Backend

Chat history and delivery

Why do WhatsApp checks go from gray to blue while Signal sometimes does not show them? The UX looks identical, but the architecture underneath differs.

  • **WhatsApp** (100B messages/day) stores `lastReadMessageId` per user instead of per-message receipts, saving 99% of DB writes for groups
  • **iMessage** fixed the bug where blue checks rolled back after a network change by switching to an FSM with one-way status transitions
  • **Telegram** moved to cursor pagination with Snowflake IDs because OFFSET pages drifted on new messages and produced duplicates
  • **Signal** uses sealed sender plus an encrypted Storage Service to sync history between devices without the server seeing the content

Read receipts: delivery checkmarks

Read receipts confirm that the recipient saw a message. WhatsApp shows one gray check (sent), two gray checks (delivered), and two blue checks (read). Each transition is a separate signal coming from a different source.

In group chats read receipts scale poorly: 1,000 members means 1,000 separate read events. Telegram does not show read receipts at all in large groups, only an overall view count.

  • **Sent** - server ack with a server timestamp (not the device clock, since the client can lie)
  • **Delivered** - WebSocket push, or an FCM delivery report when delivered offline
  • **Read** - client event on visibility/focus, throttled to once every 5 seconds
  • In group chats store `lastReadMessageId` per user instead of counters on every message
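The per-user cursor from the last bullet can be sketched as follows. This is a minimal in-memory illustration, not WhatsApp's actual schema: the class `GroupReadState` and its method names are hypothetical, and it assumes message IDs increase monotonically within a chat, so "user has read message N" implies they have read everything before it.

```python
from dataclasses import dataclass, field


@dataclass
class GroupReadState:
    """Per-user read cursors for one group chat (hypothetical structure)."""

    # user_id -> highest message ID that user has read
    last_read: dict[int, int] = field(default_factory=dict)

    def mark_read(self, user_id: int, message_id: int) -> bool:
        """Advance the user's cursor; return True if it moved forward."""
        current = self.last_read.get(user_id, 0)
        if message_id <= current:
            return False  # stale or duplicate read event: ignore it
        self.last_read[user_id] = message_id
        return True

    def readers_of(self, message_id: int, sender_id: int) -> set[int]:
        """Users other than the sender whose cursor is at or past this message."""
        return {
            user
            for user, cursor in self.last_read.items()
            if user != sender_id and cursor >= message_id
        }

    def read_by_all(self, message_id: int, sender_id: int, members: set[int]) -> bool:
        """True when every member except the sender has read the message."""
        others = members - {sender_id}
        return bool(others) and others <= self.readers_of(message_id, sender_id)
```

One read event updates a single value per user regardless of how many messages were scrolled past, which is where the write savings come from. The read state of any individual message is derived by comparing its ID against the cursors.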

WhatsApp shows blue checks (read) in a group chat with 500 members. How do you store that state?

Message status: a finite state machine

Message status is a finite state machine with one-way transitions. A message cannot move from `read` back to `delivered`. Violating that invariant produces visual artifacts: blue checks turning gray again.

iMessage hit this exact issue: when switching between Wi-Fi and cellular, delivery receipts arrived late and rolled blue checks back to gray. Apple fixed it with an FSM that only allows one-way transitions.

The `pending` status only exists on the sender's client: it is the optimistic UI until the server ack arrives. In the server database a message appears already in `sent` state.
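A minimal sketch of the one-way rule, assuming the three server-side states are totally ordered. The names `Status` and `apply_receipt` are illustrative, not taken from any real messenger. Because the states form a simple chain, "never go backwards" reduces to taking the maximum of the current and incoming status.

```python
from enum import IntEnum


class Status(IntEnum):
    # Server-side states ordered by progress.
    # `pending` is client-only (optimistic UI), so it is omitted here.
    SENT = 1
    DELIVERED = 2
    READ = 3


def apply_receipt(current: Status, incoming: Status) -> Status:
    """Return the status after a receipt arrives, never moving backwards.

    A late or duplicate receipt carrying an earlier state leaves the
    status unchanged. A receipt may also skip a state, for example
    SENT -> READ when the read event overtakes the delivery report.
    """
    return max(current, incoming)
```

For example, `apply_receipt(Status.READ, Status.DELIVERED)` keeps the message in `READ`. In a database the same guard could be expressed as a conditional update that only applies when the stored status is lower than the new one.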

A message is in the `read` state. A late delivery receipt arrives from another of the recipient's devices. What should the server do?

Cursor-based history pagination

Message history cannot be paginated with OFFSET: when new messages arrive, pages shift and the user sees duplicates or gaps. WhatsApp and Telegram use cursor-based pagination on message ID or timestamp.

Telegram uses Snowflake-style IDs: 64-bit numbers where the high bits are a timestamp. That lets you paginate either by ID or by time, and the ID stays monotonic even when replies are inserted.
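To make the "high bits are a timestamp" idea concrete, here is a sketch of the classic Snowflake layout: 41 bits of milliseconds since a custom epoch, 10 bits of worker ID, and 12 bits of per-millisecond sequence. The epoch value and function names are assumptions for illustration; whether Telegram uses this exact bit split is not something the text above states.

```python
EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch, in Unix milliseconds
WORKER_BITS = 10              # up to 1024 ID-generating workers
SEQ_BITS = 12                 # up to 4096 IDs per worker per millisecond


def make_id(timestamp_ms: int, worker_id: int, sequence: int) -> int:
    """Pack timestamp, worker and sequence into one sortable integer."""
    if not 0 <= worker_id < (1 << WORKER_BITS):
        raise ValueError("worker_id out of range")
    if not 0 <= sequence < (1 << SEQ_BITS):
        raise ValueError("sequence out of range")
    if timestamp_ms < EPOCH_MS:
        raise ValueError("timestamp precedes the epoch")
    elapsed = timestamp_ms - EPOCH_MS
    return (elapsed << (WORKER_BITS + SEQ_BITS)) | (worker_id << SEQ_BITS) | sequence


def timestamp_of(snowflake_id: int) -> int:
    """Recover the Unix-millisecond timestamp stored in the high bits."""
    return (snowflake_id >> (WORKER_BITS + SEQ_BITS)) + EPOCH_MS


def min_id_at(timestamp_ms: int) -> int:
    """Smallest possible ID at a given time: usable as a 'since T' cursor."""
    return make_id(timestamp_ms, 0, 0)
```

Because the timestamp occupies the most significant bits, comparing two IDs compares their creation times first. IDs minted by different workers within the same millisecond are ordered by worker ID rather than by true arrival order, so the ordering is time-sorted only to millisecond precision.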

  • OFFSET pagination is banned for chats: new messages shift the pages
  • Cursor by timestamp has trouble with identical timestamps (batch inserts)
  • Cursor by ID (Snowflake) is stable, monotonic, and carries a timestamp
  • Page size: 50 messages is the sweet spot, balancing RTT against payload size

The client loaded page 1 (messages 1-50) with OFFSET 0. While the user was reading, 3 new messages arrived. What happens when the client loads page 2 (OFFSET 50)?

History sync across devices

History has to sync between devices: open Telegram on a tablet after your phone and every message is there, with reads marked. That is non-trivial in offline scenarios.

Signal keeps the server blind in two ways. The sealed sender protocol hides the sender's identity, so the server cannot tell who is writing to whom. History is encrypted; sync runs through Storage Service, a separate service holding an encrypted blob per device. When you add a new device, history is transferred via a QR-code link with a temporary key.

Soft delete is mandatory for messages: use `deleted_at` instead of a physical DELETE. During sync a new device has to learn about deleted messages, otherwise they will reappear.

Misconception: history sync is just loading all messages on app start.

Sync covers three event types: new messages, status updates, and deletions (tombstones). Without tombstones, deleted messages reappear when the app is reinstalled.

A DB with physical deletes cannot answer 'what changed since time X' for the deleted rows, because those rows no longer exist. A tombstone, such as a soft delete with `deleted_at`, is what lets the server communicate deletions during incremental sync.
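A minimal sketch of incremental sync with tombstones, using Python's built-in `sqlite3` module. The table layout, the `updated_at` column, and the function names `soft_delete` and `changes_since` are assumptions for illustration. The key point is that a deletion is just another row change, so it shows up in the same "what changed since X" query as new messages and status updates.

```python
import sqlite3

SCHEMA = """
CREATE TABLE messages (
    id         INTEGER PRIMARY KEY,
    chat_id    INTEGER NOT NULL,
    body       TEXT,
    status     TEXT NOT NULL DEFAULT 'sent',
    updated_at INTEGER NOT NULL,  -- bumped on every change, including deletion
    deleted_at INTEGER            -- NULL while the message is live
);
"""


def soft_delete(conn: sqlite3.Connection, message_id: int, now: int) -> None:
    """Mark a message deleted: drop the content but keep the row as a tombstone."""
    conn.execute(
        "UPDATE messages SET body = NULL, deleted_at = ?, updated_at = ? "
        "WHERE id = ?",
        (now, now, message_id),
    )


def changes_since(conn: sqlite3.Connection, chat_id: int, since: int) -> list[dict]:
    """Return every change after `since`, with deletions as explicit events."""
    rows = conn.execute(
        "SELECT id, body, status, updated_at, deleted_at FROM messages "
        "WHERE chat_id = ? AND updated_at > ? ORDER BY updated_at, id",
        (chat_id, since),
    ).fetchall()
    events = []
    for msg_id, body, status, updated_at, deleted_at in rows:
        if deleted_at is not None:
            events.append({"type": "deleted", "id": msg_id, "at": deleted_at})
        else:
            events.append({"type": "upsert", "id": msg_id, "body": body,
                           "status": status, "at": updated_at})
    return events
```

A device that was offline calls `changes_since` with the timestamp of its last successful sync and applies each event locally. This sketch uses timestamps as the sync cursor for brevity; a server-assigned, strictly increasing version number would avoid missing two changes that share the same timestamp.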

A user deleted a message ('for everyone') while their second phone was offline. An hour later the phone came online and asked for a sync. How should the server report the deletion?

Takeaways

  • **Read receipts = lastReadMessageId**: in groups store a per-user cursor, not a row per message
  • **Message status = FSM**: transitions are one-way; late receipts must not roll back
  • **Cursor pagination by ID**: OFFSET is unstable when new messages arrive, Snowflake IDs fix it

Related topics

History and delivery rely on several core patterns:

  • Snowflake ID generation — Monotonic distributed IDs for cursor pagination
  • Offline-first sync — Incremental sync when connectivity returns
  • Chat architecture — Core patterns for 1:1 and group channels

Questions for reflection

  • WhatsApp hides read receipts if the user disables them in settings. How does that affect the architecture - is read state still stored on the server?
  • With cursor pagination, how do you handle a scenario where the cursor message has been deleted?
  • Telegram keeps history on its servers; Signal keeps it only on devices. What trade-offs does that create for sync?

Related lessons

  • db-09-indexes-btree