Real-Time Backend

Cost Optimization

A startup with 5K users pays USD 800/month for its own WebSocket cluster instead of USD 50/month via a managed service. The USD 750/month gap is engineering time spent on ops instead of product.

Slack disconnects mobile clients after 3 minutes of inactivity and switches them to push notifications; at 30M DAU this saves thousands of servers because most mobile users are inactive 80%+ of the time
Discord moved from Elixir to Rust for voice WebSocket and removed 300 bytes of per-process overhead; at 5M concurrent voice connections that saved 1.5GB of RAM on a single cluster
Ably serves 250 billion messages per month; managed WebSocket is often cheaper below 50K connections once engineering time is counted

The real cost of a WebSocket connection

One idle WebSocket connection on Node.js consumes 10-50KB of heap (buffers, TLS state, HTTP parser state). With 10K connections per process, that is 100MB to 500MB just on connections. Add the application, the Node.js runtime, and the V8 heap, and 2GB of RAM fills up quickly.
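The arithmetic above can be checked with a short sketch. The function names and the 500MB baseline for the application and runtime are illustrative assumptions, not measured values; only the 10-50KB per connection and the 2GB figure come from the text.

```python
def connection_memory_mb(connections: int, per_conn_kb: float) -> float:
    """Heap held by idle connections, in MB (decimal units, 1 MB = 1000 KB)."""
    return connections * per_conn_kb / 1000


def max_connections(ram_mb: float, baseline_mb: float, per_conn_kb: float) -> int:
    """Connections that fit in RAM after reserving a baseline for app + runtime."""
    usable_kb = (ram_mb - baseline_mb) * 1000
    return max(0, int(usable_kb // per_conn_kb))


# 10K connections at the low and high ends of the 10-50KB range.
low_mb = connection_memory_mb(10_000, 10)
high_mb = connection_memory_mb(10_000, 50)

# Hypothetical pod: 2GB RAM, 500MB assumed baseline, worst-case 50KB per connection.
capacity = max_connections(ram_mb=2000, baseline_mb=500, per_conn_kb=50)
print(low_mb, high_mb, capacity)
```

Sizing against the worst-case per-connection figure leaves headroom for connections that are not idle.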

Memory is not the only cost. Every connection also occupies one file descriptor (fd). Linux typically defaults to a soft limit of 1024 open fds per process (`ulimit -n`). For a WebSocket service this needs to be raised to 100K+.
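As a rough sketch of raising the limit from inside a process, the standard-library `resource` module (Unix only) exposes the soft and hard fd limits. The helper names `plan_soft_limit` and `raise_fd_limit` are hypothetical. The soft limit can only go as high as the hard limit, which is normally set outside the process (for example by the service manager or container runtime).

```python
import resource  # Unix-only standard library module


def plan_soft_limit(desired: int, soft: int, hard: int) -> int:
    """Pick the soft fd limit to request: never lower than now, never above hard."""
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited:
        return soft  # already unlimited, nothing to raise
    target = max(desired, soft)
    if hard != unlimited:
        target = min(target, hard)
    return target


def raise_fd_limit(desired: int = 100_000) -> int:
    """Raise this process's soft fd limit toward `desired`; return the value requested."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = plan_soft_limit(desired, soft, hard)
    if target != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

If the returned value is below the desired 100K, the hard limit is the bottleneck and has to be raised in the environment that launches the process.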

In 2020 Discord documented the migration from Elixir to Rust for voice WebSocket: each Elixir process consumed 300 bytes of overhead plus a message queue. At 5 million concurrent voice connections, that was 1.5GB just on process overhead. Rust removed it entirely.

What are the main cost components of one idle WebSocket connection?

Idle Timeout: disconnect inactive clients

A mobile client backgrounds the app. The TCP connection stays alive (NAT has not closed it yet), but the user is neither reading messages nor interacting. The server holds the fd, buffers, and state for nothing.

Idle timeout + fallback strategy: after N minutes of inactivity the server notifies the client and closes the connection. The client falls back to polling every 30-60 seconds or waits for a push notification before reconnecting.

Slack disconnects mobile clients after 3 minutes of inactivity and switches them to push notifications. At 30 million daily active users that saves thousands of servers, since most mobile users are inactive 80%+ of the time. WebSocket runs only when the app is in the foreground.
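One way to sketch the timeout logic, independent of any WebSocket library: track the last activity time per connection and sweep periodically. The class `IdleTracker`, its method names, and the 30-second warning window are assumptions for illustration; the 180-second timeout mirrors the 3-minute figure above. Sending the warning frame and closing the socket are left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class IdleTracker:
    idle_timeout: float = 180.0    # seconds of inactivity before closing
    warning_window: float = 30.0   # warn this many seconds before closing
    last_seen: dict[str, float] = field(default_factory=dict)
    warned: set[str] = field(default_factory=set)

    def touch(self, conn_id: str, now: float) -> None:
        """Record activity (message, user interaction) for a connection."""
        self.last_seen[conn_id] = now
        self.warned.discard(conn_id)

    def sweep(self, now: float) -> tuple[list[str], list[str]]:
        """Return (connections to warn, connections to close) at time `now`."""
        to_warn: list[str] = []
        to_close: list[str] = []
        for conn_id, seen in list(self.last_seen.items()):
            idle = now - seen
            if idle >= self.idle_timeout:
                to_close.append(conn_id)
                del self.last_seen[conn_id]
                self.warned.discard(conn_id)
            elif idle >= self.idle_timeout - self.warning_window:
                if conn_id not in self.warned:
                    to_warn.append(conn_id)
                    self.warned.add(conn_id)
        return to_warn, to_close


tracker = IdleTracker()
tracker.touch("mobile-1", now=0.0)
print(tracker.sweep(now=150.0))  # warning stage for mobile-1
print(tracker.sweep(now=180.0))  # mobile-1 is due to be closed
```

Passing `now` explicitly keeps the logic testable; a server would call `sweep(time.monotonic())` on a timer, and clients that receive the warning can either send activity to stay connected or switch to polling or push.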

How do you implement idle timeout correctly for mobile WebSocket clients?

Serverless real-time: when WebSocket is too expensive

WebSocket requires always-on servers. Serverless functions (Lambda, Cloud Functions) do not hold state between invocations, so they cannot hold a WebSocket connection. But there are patterns that deliver real-time without WebSocket.

**Server-Sent Events (SSE)** via serverless: the client holds a long-lived HTTP connection and the server writes events. AWS API Gateway supports SSE natively through Lambda response streaming. But Lambda billing is based on function execution time. For 10K idle SSE connections that means 10K continuous Lambda invocations.
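To see why duration-based billing hurts for idle connections, here is a back-of-the-envelope sketch. The 128MB memory size and the price of USD 0.0000166667 per GB-second are assumptions for illustration (check current pricing for your region and architecture); request charges and per-invocation duration limits are ignored.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000 seconds in a 30-day month


def lambda_sse_monthly_cost(
    connections: int,
    memory_gb: float = 0.125,                 # assumed 128MB function size
    price_per_gb_second: float = 0.0000166667,  # assumed duration price in USD
) -> float:
    """Duration cost in USD of keeping one function running per open connection."""
    gb_seconds = connections * memory_gb * SECONDS_PER_MONTH
    return gb_seconds * price_per_gb_second


cost = lambda_sse_monthly_cost(10_000)
print(f"USD {cost:,.0f} per month")
```

Under these assumptions, 10K always-open connections cost on the order of tens of thousands of USD per month even when no events are sent.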

Ably (managed WebSocket platform) serves 250 billion messages per month. Pricing in this example: USD 0.00007 per message. At 1M messages/day (about 30M per month) that is USD 2100 per month. A self-hosted WebSocket cluster for the same volume on AWS EC2 (10 x c5.2xlarge) is USD 1400/month. The 1.5x gap is often worth it because the managed service carries the operational overhead.
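The comparison can be reproduced in a few lines. This sketch treats the USD 0.00007 rate as a per-message price, which is the reading under which the USD 2100 figure works out; the function name is hypothetical and a 30-day month is assumed.

```python
def managed_monthly_cost(
    messages_per_day: int, price_per_message: float, days: int = 30
) -> float:
    """Monthly managed-service bill in USD under simple per-message pricing."""
    return messages_per_day * days * price_per_message


managed = managed_monthly_cost(1_000_000, 0.00007)
self_hosted = 1400.0  # USD/month for the EC2 cluster, from the text
ratio = managed / self_hosted
print(round(managed), round(ratio, 2))
```

The ratio only covers infrastructure; engineering time for running the self-hosted cluster is not in the USD 1400 figure.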

Why is AWS Lambda not suitable for directly hosting WebSocket connections?

Cost comparison: self-hosted vs managed

A proper cost comparison of WebSocket platforms includes not just infrastructure cost but also engineering cost: setup, monitoring, scaling, failover, upgrades. Self-hosted is cheaper at scale; managed is cheaper at the start.

Hybrid approach: managed for dev/staging (no ongoing cost), self-hosted for production. Or: managed for real-time delivery (Ably/Pusher), your own business logic in microservices.

**< 1K connections** - managed wins; self-hosted overhead is not justified
**1K-50K connections** - depends on engineering capacity and ops cost
**> 50K connections** - self-hosted is usually cheaper; managed pricing is significant at that scale
**> 1M connections** - hybrid architecture or fully self-hosted with a dedicated ops team

Misconception: WebSocket infrastructure cost is just the price of the servers.

Reality: total cost includes compute, networking, engineering time for setup and ops, monitoring, failover, and scaling.

Engineering cost for self-hosted WebSocket at small scale often outweighs compute savings. 2-3 days of setup plus ongoing ops can cost more than a managed service for years.
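A quick sketch shows how setup time alone can equal years of managed fees. The USD 600 engineer day rate is a hypothetical assumption; the USD 50/month managed price is the figure from the opening example. Ongoing ops time would push the number higher.

```python
def setup_cost_in_managed_months(
    setup_days: float, day_rate_usd: float, managed_monthly_usd: float
) -> float:
    """How many months of managed service the one-off setup cost would pay for."""
    return setup_days * day_rate_usd / managed_monthly_usd


months = setup_cost_in_managed_months(
    setup_days=3, day_rate_usd=600, managed_monthly_usd=50
)
print(months)
```

With these inputs, three days of setup cost the same as three years of the managed plan, before any ongoing ops time is counted.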

At what scale does self-hosted WebSocket typically become cheaper than a managed service?

Summary

One idle connection: 10-50KB heap plus one file descriptor; at 10K connections that is 100-500MB just for buffers
Idle timeout with a 30-second warning saves resources for inactive mobile clients
Managed WebSocket (Ably, Pusher) wins below 50K connections once engineering cost is counted; self-hosted wins at greater scale
Hybrid approach: managed for delivery, your own business logic in microservices

Questions for reflection

How do you compute the maximum number of connections per pod given available RAM and per-connection overhead?
When should you choose Server-Sent Events over WebSocket from an infrastructure cost standpoint?
How does cost change moving from 10K to 100K concurrent connections, and what changes in architecture?

Related lessons

sd-03-scalability
