Push Notifications Architecture
WhatsApp sends billions of notifications per day to iOS and Android. How do you build a reliable pipeline through Apple APNs and Google FCM?
- **Airbnb Notification Service** - 50M notifications/day through a Kafka pipeline, FCM batch API at 500 tokens per request cuts HTTP overhead 100x, automatic cleanup of invalid tokens
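- **WhatsApp** uses priority: 'normal' for most notifications (battery savings) and 'high' only for calls. APNs has the same split: `apns-priority` 5 delivers with power considerations in mind, 10 delivers immediately, and incoming calls on iOS go through PushKit VoIP pushes that wake the app.
- **Duolingo** - monthly cleanup of the 30-40% of tokens that are dead, cutting blast cost by 35% and speeding up batch sends. The cleanup is attributed to the APNs Feedback Service; that service belonged to the legacy binary protocol, and the current HTTP/2 API reports dead tokens as 410 responses instead.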
APNs (Apple Push Notification service)
**APNs** is Apple's service for delivering push notifications to iOS, macOS, watchOS, and tvOS devices. The architecture: app server -> APNs -> device. APNs keeps a persistent TLS connection with each device (through Apple Data Centers) and delivers a push within milliseconds when the device is online.
APNs authentication has two flavors: **certificate-based** (p12/PEM file, tied to a single bundle ID, expires after a year) and **token-based** (JWT with ES256 signature, one .p8 key for all app bundle IDs on the account). The .p8 key itself does not expire, but each signed JWT is accepted for at most one hour and has to be refreshed. Large multi-app companies generally prefer token-based auth: one key can serve Uber, Uber Eats, and Uber Freight on a single Apple Developer account.
APNs delivers 5B+ notifications per day. Delivery guarantee: if the device is offline, APNs holds only the most recent notification per device and app (bundle ID) for up to 30 days, or until the time set by the `apns-expiration` header. Notifications that share an `apns-collapse-id` replace one another, so the user sees only the latest.
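To make the token-based flow concrete, here is a minimal sketch of a provider-token cache. It builds the JWT header and claims APNs expects (`alg`, `kid`, `iss`, `iat`) and re-signs only when the cached token is getting old. The class name, the 50-minute refresh interval, and the injected `sign` callable are assumptions for illustration; in practice `sign` would wrap an ES256 signature made with the .p8 key by a crypto library and return the raw 64-byte signature.

```python
import base64
import json
import time
from typing import Callable, Optional


def _b64url(raw: bytes) -> str:
    """Base64url without padding, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


class ApnsTokenProvider:
    """Hypothetical helper: caches an APNs provider JWT and re-signs it
    before APNs would reject it as older than one hour."""

    def __init__(
        self,
        team_id: str,
        key_id: str,
        sign: Callable[[bytes], bytes],   # assumed ES256 signer over the .p8 key
        refresh_after: float = 50 * 60,   # assumption: refresh well inside 1 hour
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._team_id = team_id
        self._key_id = key_id
        self._sign = sign
        self._refresh_after = refresh_after
        self._clock = clock
        self._token: Optional[str] = None
        self._issued_at = 0.0

    def token(self) -> str:
        now = self._clock()
        if self._token is None or now - self._issued_at >= self._refresh_after:
            header = {"alg": "ES256", "kid": self._key_id}
            claims = {"iss": self._team_id, "iat": int(now)}
            signing_input = ".".join(
                _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
                for part in (header, claims)
            )
            signature = self._sign(signing_input.encode("ascii"))
            self._token = signing_input + "." + _b64url(signature)
            self._issued_at = now
        return self._token
```

The resulting string goes into the `authorization: bearer <token>` header of every request. Reusing one token across requests matters because APNs also rejects providers that regenerate tokens too frequently.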
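The expiration and collapse behaviour is controlled entirely by request headers. The sketch below assembles the URL, headers, and body for one APNs request without sending it; the function name and default values are illustrative assumptions, and the actual send would need an HTTP/2 client.

```python
import json
import time
from typing import Dict, Optional, Tuple

APNS_HOST = "https://api.push.apple.com"  # production endpoint


def build_apns_request(
    device_token: str,
    bundle_id: str,
    provider_jwt: str,
    alert: str,
    ttl_seconds: int = 3600,
    collapse_id: Optional[str] = None,
    high_priority: bool = False,
    now: Optional[float] = None,
) -> Tuple[str, Dict[str, str], str]:
    """Return (url, headers, body) for a single alert notification."""
    issued = time.time() if now is None else now
    headers = {
        "authorization": f"bearer {provider_jwt}",
        "apns-topic": bundle_id,
        "apns-push-type": "alert",
        # 10 = deliver immediately, 5 = deliver with power considerations.
        "apns-priority": "10" if high_priority else "5",
        # UNIX timestamp after which APNs stops retrying; "0" means
        # try once and do not store the notification.
        "apns-expiration": str(int(issued) + ttl_seconds) if ttl_seconds > 0 else "0",
    }
    if collapse_id is not None:
        if len(collapse_id.encode("utf-8")) > 64:
            raise ValueError("apns-collapse-id must not exceed 64 bytes")
        headers["apns-collapse-id"] = collapse_id
    body = json.dumps({"aps": {"alert": alert}})
    return f"{APNS_HOST}/3/device/{device_token}", headers, body
```

For example, sending every status update for one order with `collapse_id="order-42"` means a device that was offline sees only the latest status rather than the whole history.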
How is token-based authentication on APNs better than certificate-based?
FCM (Firebase Cloud Messaging)
**FCM** is Google's push service for Android (it replaced the older Google Cloud Messaging, GCM; the current interface is the HTTP v1 API), Web (via Service Workers and the Web Push Protocol), and iOS (FCM proxies through APNs). A single API across platforms is the main advantage over wiring up APNs + Web Push separately.
The FCM v1 API (the legacy HTTP API was deprecated in 2023 and shut down in 2024) uses OAuth 2.0 access tokens obtained with a service account JSON. The payload is split into **notification** (rendered by the system automatically) and **data** (handled by the app). Rendering of the `notification` object differs across Android, iOS, and Web; FCM normalizes the common fields, and platform-specific overrides go into the `android`, `apns`, and `webpush` blocks. If the device is offline, FCM holds notifications for up to 4 weeks (the default and maximum TTL).
WhatsApp processes 100B+ messages per day, a large fraction of which trigger a push via FCM/APNs. To limit battery drain and keep high-priority delivery for what really needs it, they use priority: 'normal' for most notifications and 'high' only for incoming calls.
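A sketch of what such a message could look like in the FCM HTTP v1 format, with the notification/data split and the priority chosen per message type. The function names, the `is_call` flag, and the one-hour TTL are assumptions for illustration; the field names follow the v1 message format.

```python
from typing import Any, Dict


def fcm_send_url(project_id: str) -> str:
    """HTTP v1 send endpoint for a Firebase project."""
    return f"https://fcm.googleapis.com/v1/projects/{project_id}/messages:send"


def build_fcm_message(
    token: str,
    title: str,
    body: str,
    data: Dict[str, Any],
    is_call: bool = False,
) -> Dict[str, Any]:
    """Build the JSON body for one message to one device token."""
    return {
        "message": {
            "token": token,
            # Rendered by the system tray when the app is in the background.
            "notification": {"title": title, "body": body},
            # Delivered to app code; FCM requires string keys and values.
            "data": {str(k): str(v) for k, v in data.items()},
            # Android-specific overrides: priority and time-to-live.
            "android": {
                "priority": "HIGH" if is_call else "NORMAL",
                "ttl": "3600s",
            },
            # iOS-specific overrides, passed through to APNs.
            "apns": {"headers": {"apns-priority": "10" if is_call else "5"}},
        }
    }
```

The body is POSTed to `fcm_send_url(project_id)` with an `Authorization: Bearer <access token>` header. A real calling app on iOS would typically rely on PushKit rather than priority alone, which this sketch does not cover.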
How does FCM deliver push notifications to iOS?
Push Pipeline
**Push pipeline** is the chain from a business event to a notification on the device. Components: event source (user or system action) -> message queue (Kafka, RabbitMQ) -> notification service (enrichment, targeting) -> delivery service (FCM/APNs client) -> platform (APNs/FCM) -> device.
The queue between event and send is critical at scale. Without it, a spike (a 10M-user promo launch) topples the service under simultaneous APNs/FCM requests. With Kafka, events are written instantly while N workers read and send at a bounded rate. Instagram uses this exact pattern for bulk notifications.
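The "bounded rate" part can be sketched with a token bucket in front of the send call. Here an in-memory `queue.Queue` stands in for the Kafka topic, and `send` stands in for the FCM/APNs client; both, along with the class and function names, are assumptions made to keep the example self-contained.

```python
import queue
import time
from typing import Any, Callable


class TokenBucket:
    """Allows bursts of up to `capacity` sends, then `rate_per_sec` sustained."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self._tokens = float(capacity)
        self._clock = clock
        self._last = clock()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


def drain(
    events: "queue.Queue[Any]",
    bucket: TokenBucket,
    send: Callable[[Any], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send queued events in order, waiting whenever the bucket is empty."""
    sent = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return sent
        while not bucket.try_acquire():
            sleep(1.0 / bucket.rate)
        send(event)
        sent += 1
```

Producers can enqueue as fast as they like; the spike is absorbed by the queue, and the push provider only ever sees the configured rate. With a real Kafka consumer the same loop would commit offsets after a successful send.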
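Airbnb has published its architecture: the Notification Service handles 50M notifications per day through Kafka. Workers consume from the topic and batch requests to FCM (up to 500 tokens per batch via the FCM batch API), cutting HTTP overhead by 100x. Note that the batch endpoint belonged to the legacy FCM API, which Google has since retired; with the HTTP v1 API each message is a separate request, and the saving comes from sending many of them over a shared HTTP/2 connection.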
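Whatever transport carries the batch, the worker first has to split the recipient list into fixed-size chunks. A minimal sketch, using the 500-token batch size quoted above; the function names are hypothetical.

```python
import math
from typing import Iterable, Iterator, List


def chunk_tokens(tokens: Iterable[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` tokens."""
    if size < 1:
        raise ValueError("size must be positive")
    batch: List[str] = []
    for token in tokens:
        batch.append(token)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def request_count(total_tokens: int, size: int = 500) -> int:
    """Number of batched requests needed to cover `total_tokens`."""
    return math.ceil(total_tokens / size)
```

At 50M notifications per day and full batches of 500, `request_count(50_000_000)` comes to 100,000 requests instead of 50 million single sends.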
Why put notifications into Kafka before sending, rather than send to FCM directly from the event handler?
Push Delivery and Reliability
APNs and FCM return response codes that drive downstream logic. HTTP **410** from APNs (reason `Unregistered`) or the `UNREGISTERED` error from FCM means the token is no longer valid: the app was uninstalled or reinstalled, or the device was reset. That token must be deleted from the DB immediately. Otherwise dead tokens pile up, every blast takes longer, and sending costs grow.
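The response handling can be sketched as a small classifier that maps each provider response to an action. The `Action` enum, the function names, and the in-memory set standing in for the token table are assumptions; the status codes and error strings are the ones APNs and the FCM v1 API return.

```python
from enum import Enum
from typing import Dict, List, Optional, Set


class Action(Enum):
    OK = "ok"
    DELETE_TOKEN = "delete_token"
    RETRY = "retry"
    DROP = "drop"  # permanent error unrelated to the token; log and move on


def classify_apns(status: int, reason: Optional[str] = None) -> Action:
    if status == 200:
        return Action.OK
    # 410 = token no longer active. BadDeviceToken can also mean a
    # sandbox/production mismatch, so deleting on it is an assumption.
    if status == 410 or (status == 400 and reason == "BadDeviceToken"):
        return Action.DELETE_TOKEN
    if status in (429, 500, 503):
        return Action.RETRY
    return Action.DROP


def classify_fcm(status: int, error_code: Optional[str] = None) -> Action:
    if status == 200:
        return Action.OK
    if error_code == "UNREGISTERED":
        return Action.DELETE_TOKEN
    if error_code in ("UNAVAILABLE", "INTERNAL", "QUOTA_EXCEEDED"):
        return Action.RETRY
    return Action.DROP


def prune(tokens: Set[str], results: Dict[str, Action]) -> List[str]:
    """Remove dead tokens in place; return the tokens that should be retried."""
    retry: List[str] = []
    for token, action in results.items():
        if action is Action.DELETE_TOKEN:
            tokens.discard(token)
        elif action is Action.RETRY:
            retry.append(token)
    return retry
```

Running the cleanup inline with every send, rather than as a separate periodic job, keeps the table from accumulating dead tokens in the first place. Retries would normally use exponential backoff, which is left out here.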
**Delivery receipt**: APNs and FCM do not guarantee delivery to the device, only to their own servers. For critical notifications (calls, payments) a second channel is used: the app confirms receipt via API on open, the server sets a timer, and if no confirmation arrives, falls back to SMS or email.
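The receipt-plus-timer logic can be sketched as a tracker of pending confirmations. The class name, the 30-second default timeout, and the in-memory dictionary are assumptions; a production version would persist deadlines and run `expired()` from a scheduler.

```python
import time
from typing import Callable, Dict, List


class ReceiptTracker:
    """Tracks notifications awaiting an app-level receipt."""

    def __init__(
        self,
        timeout_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout = timeout_s
        self._clock = clock
        self._deadlines: Dict[str, float] = {}

    def sent(self, notification_id: str) -> None:
        """Call after APNs/FCM accepted the push; starts the timer."""
        self._deadlines[notification_id] = self._clock() + self._timeout

    def confirmed(self, notification_id: str) -> None:
        """Call when the app reports receipt through the API."""
        self._deadlines.pop(notification_id, None)

    def expired(self) -> List[str]:
        """Return and forget the IDs whose timer ran out: fallback candidates."""
        now = self._clock()
        due = [nid for nid, deadline in self._deadlines.items() if deadline <= now]
        for nid in due:
            del self._deadlines[nid]
        return due
```

Each ID returned by `expired()` would be handed to the SMS or email sender. Removing the ID at that point ensures the fallback fires at most once per notification.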
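Duolingo reports that 30-40% of tokens in their production DB are invalid (devices reset, apps reinstalled). Monthly cleanup cuts sending cost by 35% and speeds up batch blasts. The cleanup is attributed to the APNs Feedback Service, which belonged to the legacy binary protocol; with the current HTTP/2 API the same signal arrives as a 410 response on each send.
Misconception: push notification = guaranteed message delivery to the user.
Reality: push notification = best-effort delivery. APNs/FCM confirm that they accepted the request, not that anything appeared on screen; an offline device, Do Not Disturb, an invalid token, or system limits can all prevent display.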
Critical notifications (bank transactions, emergency alerts) require application-level delivery confirmation + SMS fallback. That is a separate layer on top of push.
An app sends a critical payment notification. APNs returns 200 OK. Can we conclude that the user saw the notification?
Summary
- **APNs** (iOS) and **FCM** (Android/Web) are two mandatory channels; FCM proxies to APNs for iOS, letting you use a single API
- **Push pipeline**: event -> Kafka queue -> notification worker (enrichment) -> FCM/APNs -> device; the queue is critical for spikes
- **Delivery != display**: APNs 200 OK = accepted, not shown; critical notifications need an application-level receipt + SMS fallback; dead tokens = wasted money
Related Topics
Push notifications are part of a broader notification system:
- In-App Notifications — Push delivers when the app is closed; the in-app system shows notifications inside the app in real time
- Message Queues (Kafka) — Kafka buffers events before sending - protection against spikes during bulk blasts
- WebSocket — In-app real-time notifications go over WebSocket; push is the fallback when the app is closed
Questions for Reflection
- An app stores device tokens in PostgreSQL. After 3 months in production it has 2M tokens, ~40% of which are invalid. How do you build automated cleanup driven by APNs 410 responses (the replacement for the retired Feedback Service)?
- You need to send a marketing notification to 10M users 15 minutes before a campaign launch. How many workers do you need given an FCM limit of 600 requests/sec and a batch size of 500 tokens?
- An iOS user enabled Focus Mode. The notification is sent, APNs returns 200. The user does not see the notification. How should the app handle this situation?