Real-Time Backend
Benchmarking
Discord runs millions of concurrent WebSocket connections. How do its engineers know the system will survive the next 10x of growth, and where will it break first?
- A k6 WebSocket test showed that p99 latency grew from 8 ms to 1.2 s at 5000 concurrent connections. The bottleneck was synchronous JSON.stringify on large payloads, which blocks the event loop
- TechEmpower Framework Benchmarks helped Discord pick Elixir/Phoenix Channels as the foundation for the realtime layer: p99 on multi-query was more stable than Node.js at the same resource level
- uWebSockets.js claims 15M req/s, but a flame graph from a real chat showed 60% of CPU going to business logic and DB, not the WS transport
- An Artillery.io scenario with Socket.io reproduced a production incident: at 2000 concurrent users, a memory leak in an event listener grew by 500 MB per hour. Without load testing, a leak like this surfaces only in production
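The cases above lean on p99 rather than average latency. A minimal sketch of why, assuming the nearest-rank definition of a percentile: the `percentile` helper and the sample data are hypothetical and only echo the 8 ms and 1.2 s figures from the first bullet. They are not real k6 output, and k6 reports its own percentiles.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile of an empty sample set is undefined");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank computation in integers.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Hypothetical data: 980 fast round trips (8 ms) and 20 slow ones (1200 ms).
const samples: number[] = [
  ...Array.from({ length: 980 }, () => 8),
  ...Array.from({ length: 20 }, () => 1200),
];

const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;

console.log(
  `mean=${mean.toFixed(2)}ms ` +
    `p50=${percentile(samples, 50)}ms ` +
    `p99=${percentile(samples, 99)}ms`
);
```

With only 2% of requests slow, the mean and median in this made-up data set still look healthy while p99 lands on the slow path. This is why a load-test report that shows only averages can hide this kind of bottleneck.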
C10K and C1M
C10K is the problem of serving 10,000 simultaneous connections on a single server. In 1999, Dan Kegel framed it as the engineering wall of the era: most servers of the time used a thread-per-connection model and collapsed at 1-2K connections. The shift to event-driven I/O (epoll, kqueue) removed that limit. Today C10K is considered solved.
C1M is the next bar: 1 million concurrent connections. uWebSockets.js claims 15M req/s on synthetic tests, but that is a throughput figure, not a connection count. Production systems (Slack, Discord) hold millions of WebSocket connections through horizontal scaling and sticky sessions. A single Node.js process realistically handles 100K-300K idle WS connections before running into memory limits: at ~32 KB per connection, that is roughly 3-10 GB.
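The memory ceiling follows from simple arithmetic. The sketch below assumes the ~32 KB per idle connection quoted above, read as 32 KiB. The function names and the 4 GiB budget are illustrative, and real per-connection cost depends on the library, buffers, and application state.

```typescript
const KIB = 1024;
const MIB = 1024 * KIB;
const GIB = 1024 * MIB;

// Per-connection cost taken from the text (~32 KB); treat it as a rough estimate.
const BYTES_PER_CONNECTION = 32 * KIB;

// Memory needed to hold a given number of idle connections, in MiB.
function memoryForConnectionsMiB(
  connections: number,
  bytesPerConnection: number = BYTES_PER_CONNECTION
): number {
  return (connections * bytesPerConnection) / MIB;
}

// How many idle connections fit into a memory budget given in bytes.
function connectionsForBudget(
  budgetBytes: number,
  bytesPerConnection: number = BYTES_PER_CONNECTION
): number {
  return Math.floor(budgetBytes / bytesPerConnection);
}

for (const connections of [100000, 300000, 1000000]) {
  console.log(`${connections} connections ~ ${memoryForConnectionsMiB(connections)} MiB`);
}

// Hypothetical budget: a process allowed to use 4 GiB for connection state.
console.log(`4 GiB budget fits ${connectionsForBudget(4 * GIB)} connections`);
```

Under this assumption, 100K connections need about 3 GiB and 300K about 9 GiB, while 1 million would need roughly 30 GiB in a single process. That is why C1M in production is usually reached by spreading connections across processes and machines.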
- C10K (1999): solved by an event loop plus epoll/kqueue; still a concern only for legacy blocking servers