Real-Time Backend

Ecosystem Overview: Action Cable, Phoenix Channels, SignalR, Centrifugo

In 2014, WhatsApp reportedly served 200 million users from about two dozen FreeBSD servers; at Discord, three engineers working in Elixir are said to sustain 26 million simultaneous WebSocket connections to voice channels; at Basecamp, DHH and a team of six push 100k concurrent connections through Action Cable. Figures like these suggest that framework choice determines not so much the scaling ceiling as the cost a team pays to reach it. Four mature realtime stacks - Action Cable, Phoenix Channels, SignalR, Centrifugo - represent four strategies: 'embed in the framework', 'build on a concurrency-first VM', 'put RPC over WebSocket', 'extract into a separate service'. Understanding their trade-offs turns the decision into a matter of reasoning rather than fashion.

  • **Basecamp HEY**: Action Cable at ~100k concurrent connections via subdomain sharding - DHH regularly publishes architectural posts
  • **Discord**: Phoenix Channels on Elixir for voice channels, 26+ million concurrent WebSockets; the engineering blog covers ETS and Phoenix.Presence in depth
  • **Microsoft Teams**: SignalR + Azure SignalR Service for presence and chat, plus gRPC streaming for video metadata
  • **Avito (CIS classifieds)**: Centrifugo for the notifications channel; a PHP backend publishes through the HTTP API while Centrifugo holds a million client connections

Action Cable: Ruby on Rails and Redis pub/sub

**Action Cable** shipped in Rails 5 (2016) as the standard way to give a Rails application a WebSocket channel without leaving the framework. Architecturally it is a thin layer: each Puma process runs an event loop based on nio4r (Ruby NIO), each WebSocket client is a slim `Connection` authorized via the Rails session cookie, and messages flow between processes through **Redis pub/sub**. Its main strength is tight integration with Active Record, Action View and Devise: the same `current_user` is available on a REST endpoint and inside a WebSocket channel. Its main weakness is Ruby's GVL (Global VM Lock), which lets only one thread per process execute Ruby code at a time: a single process tops out at around 3-4k concurrent connections, beyond which the application has to scale horizontally with more processes behind a load balancer.
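The cross-process flow described above can be sketched in a few lines. This is a minimal model, not Action Cable's implementation: the `Broker` class is an in-memory stand-in for Redis pub/sub, and `ServerProcess`, the process names and the `chat_42` channel are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for Redis pub/sub (hypothetical, for illustration)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subscribers[channel]):
            callback(message)


class ServerProcess:
    """Models one application process holding its own client connections."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.connections = defaultdict(list)  # channel -> client inboxes

    def connect(self, channel, inbox):
        # First local client on a channel: subscribe this process to the broker.
        if not self.connections[channel]:
            self.broker.subscribe(channel, lambda msg: self._fan_out(channel, msg))
        self.connections[channel].append(inbox)

    def _fan_out(self, channel, message):
        # Deliver to local connections only; an inbox stands in for a socket.
        for inbox in self.connections[channel]:
            inbox.append(message)

    def broadcast(self, channel, message):
        # A process never writes to remote sockets; it only publishes.
        self.broker.publish(channel, message)


broker = Broker()
proc_a = ServerProcess("puma-1", broker)
proc_b = ServerProcess("puma-2", broker)

alice, bob = [], []
proc_a.connect("chat_42", alice)
proc_b.connect("chat_42", bob)

proc_a.broadcast("chat_42", "hello")
print(alice, bob)  # ['hello'] ['hello']
```

The point of the design is that each process knows only about its own sockets; the broker is the single place where processes meet, which is also why it becomes the component to watch as the number of processes grows.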

Action Cable's pub/sub layer is a pluggable **subscription adapter**. Besides Redis, Rails ships a PostgreSQL adapter built on LISTEN/NOTIFY, which lets teams on a strict Postgres stack avoid Redis; NATS is not among the built-in adapters and requires third-party tooling. LISTEN/NOTIFY throughput is roughly 5-10x lower than Redis pub/sub, but for tens of thousands of connections it is acceptable. In practice DHH (creator of Rails) runs Action Cable in Basecamp at ~100k concurrent connections via subdomain sharding.
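A back-of-the-envelope calculation shows what horizontal scaling means at that size. The sketch below only combines two figures quoted in the text (about 3-4k connections per process, about 100k connections in total); `processes_needed` is a hypothetical helper, and real capacity planning would also leave headroom for deploys and traffic spikes.

```python
import math


def processes_needed(target_connections, per_process_capacity):
    """Smallest number of processes that can hold the target connection count."""
    return math.ceil(target_connections / per_process_capacity)


for capacity in (3_000, 4_000):
    print(capacity, processes_needed(100_000, capacity))
# 3000 34
# 4000 25
```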

Why does Action Cable scale relatively poorly on a single Ruby process?

Phoenix Channels: BEAM and a million connections

**Phoenix Channels** is built on a fundamentally different foundation - the **BEAM** virtual machine (Erlang VM) on which Elixir runs. BEAM was created at Ericsson for telephone switching and was designed to run hundreds of thousands of lightweight processes per node. Each WebSocket connection in Phoenix is a separate BEAM process (~2 KB overhead), and pub/sub is built into the platform through **`Phoenix.PubSub`**, which relies on Erlang distribution and process groups (`pg`, the successor to `pg2`): a message published on one node reaches every other node without an external broker. The same model is what allowed WhatsApp, running Erlang on FreeBSD, to sustain 2 million connections per server (2013 talk).
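The process-per-connection model can be approximated outside BEAM to show its shape. The sketch below uses one `asyncio` task and one queue per connection as an analogy, under clear assumptions: asyncio tasks are scheduled cooperatively on a single thread and share memory, whereas BEAM processes are scheduled preemptively across cores and have isolated heaps. The names `connection`, `main` and the message text are hypothetical.

```python
import asyncio


async def connection(conn_id, mailbox, delivered):
    """One task per client: it reacts only to messages in its own mailbox."""
    while True:
        message = await mailbox.get()
        if message is None:  # sentinel: the connection is closed
            break
        delivered.append((conn_id, message))  # stands in for a socket write


async def main(n_connections=1000):
    delivered = []
    mailboxes = [asyncio.Queue() for _ in range(n_connections)]
    tasks = [
        asyncio.create_task(connection(i, box, delivered))
        for i, box in enumerate(mailboxes)
    ]

    # "Broadcast": drop the message into every subscriber's mailbox.
    for box in mailboxes:
        box.put_nowait("room:lobby new_msg")

    # Close every connection and wait for the tasks to drain their mailboxes.
    for box in mailboxes:
        box.put_nowait(None)
    await asyncio.gather(*tasks)
    return delivered


delivered = asyncio.run(main())
print(len(delivered))  # 1000
```

The memory side of the argument is simple arithmetic: at roughly 2 KB per process, a million idle connections cost on the order of 2 GB of baseline process memory, before any per-connection application state.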

**LiveView**, a companion library that recent Phoenix releases include in newly generated projects, is an SSR framework: HTML is rendered server-side, and a Phoenix Channel delivers diff patches to the browser. It is an alternative to React/Vue for real-time UI with little or no hand-written JavaScript. Discord uses Elixir + Phoenix for voice channels: its servers reportedly handle 26 million WebSocket connections with single-digit-millisecond p99 latency.
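The idea of sending diff patches instead of full HTML can be illustrated with a deliberately simplified model. It assumes the template's dynamic parts are numbered slots and that only changed slots travel over the channel; this is a sketch of the principle, not LiveView's actual wire format, and `render`, `diff` and the assign names are hypothetical.

```python
def render(assigns):
    """Return the dynamic slots of a hypothetical template, keyed by position."""
    return {
        0: assigns["username"],
        1: str(assigns["unread"]),
        2: assigns["status"],
    }


def diff(previous, current):
    """Keep only the slots whose rendered value changed since the last render."""
    return {
        slot: value
        for slot, value in current.items()
        if previous.get(slot) != value
    }


before = render({"username": "ada", "unread": 3, "status": "online"})
after = render({"username": "ada", "unread": 4, "status": "online"})

patch = diff(before, after)
print(patch)  # {1: '4'}
```

Only the unread counter crosses the wire; the static markup and the unchanged slots are already in the browser.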

Which key BEAM capability gives Phoenix Channels its scalability?

SignalR: .NET, hubs and automatic fallback

**SignalR** is Microsoft's official realtime framework for .NET. Its main architectural feature is the concept of **Hubs**: instead of sending raw JSON messages, the client calls server-side methods (`hub.invoke('SendMessage', ...)`), and the server calls methods on the client (`Clients.All.SendAsync('ReceiveMessage', ...)`). It is RPC over WebSocket with transparent serialization. The second property is **transport fallback**: when WebSocket is unavailable (firewall, proxy), SignalR automatically downgrades to Server-Sent Events, then to Long Polling. The third is **Azure SignalR Service**: a managed offering on Azure that handles up to a million connections without managing nodes.
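Both ideas from this paragraph, method dispatch by name and ordered transport fallback, fit in a short sketch. It is written in Python for consistency with the other examples rather than in C#, and it is an illustration of the mechanism, not SignalR's implementation: `ChatHub`, `dispatch` and `negotiate` are hypothetical, and the frame is reduced to the `target` and `arguments` fields.

```python
import json


class ChatHub:
    """Hypothetical hub: its public methods form the RPC surface."""

    def __init__(self):
        self.outbox = []  # invocations the server would push to clients

    def SendMessage(self, user, text):
        # A server -> client call is just another invocation message.
        self.outbox.append({"target": "ReceiveMessage", "arguments": [user, text]})


def dispatch(hub, raw_frame):
    """Decode an invocation frame and call the matching hub method by name."""
    frame = json.loads(raw_frame)
    target = frame["target"]
    method = getattr(hub, target, None)
    # Reject private names and attributes that are not methods.
    if target.startswith("_") or not callable(method):
        raise ValueError(f"unknown hub method: {target}")
    return method(*frame["arguments"])


TRANSPORT_PREFERENCE = ["WebSockets", "ServerSentEvents", "LongPolling"]


def negotiate(client_supported, server_enabled):
    """Pick the first transport in preference order that both sides allow."""
    for transport in TRANSPORT_PREFERENCE:
        if transport in client_supported and transport in server_enabled:
            return transport
    raise RuntimeError("no common transport")


hub = ChatHub()
dispatch(hub, json.dumps({"target": "SendMessage", "arguments": ["ada", "hi"]}))
print(hub.outbox)
# [{'target': 'ReceiveMessage', 'arguments': ['ada', 'hi']}]

# A client behind a proxy that blocks WebSocket falls back to the next option.
print(negotiate({"ServerSentEvents", "LongPolling"}, set(TRANSPORT_PREFERENCE)))
# ServerSentEvents
```

Application code sees only method names and arguments; which transport carried the frame is decided once, during negotiation.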

Unlike Action Cable and Phoenix, SignalR was designed for enterprise use from day one: Azure AD integration, streaming hub methods, binary MessagePack serialization, and a Redis backplane for scaling out. .NET offers true multi-threaded parallelism via the CLR, so a single-node SignalR deployment is bound not by a global interpreter lock but by OS socket descriptor limits - typically 64-128k connections per node after tuning.
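Each open WebSocket occupies one file descriptor, so the per-process descriptor limit is usually the first OS ceiling a node runs into. The snippet below is a minimal sketch for inspecting it, assuming a Unix-like host (Python's `resource` module is not available on Windows); the `reserved` margin for log files, database sockets and similar descriptors is a hypothetical figure, not a recommendation.

```python
import resource


def descriptor_headroom(reserved=1024):
    """Return (soft_limit, descriptors left for client sockets) for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, None  # no per-process cap is configured
    return soft, max(soft - reserved, 0)


soft, usable = descriptor_headroom()
print(soft, usable)
```

On many default installations the soft limit is far below the 64-128k range, which is why the connection figures above come with the qualifier "after tuning".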

What is the practical value of the Hubs concept in SignalR compared to plain WebSocket messages?

Centrifugo: language-agnostic real-time service

**Centrifugo** (Alexander Emelin, 2014) takes a fundamentally different approach: not a library inside an application, but a standalone service in Go that clients connect to via WebSocket/SSE/HTTP-stream, while the backend application (in any language) publishes messages through HTTP API or gRPC. Centrifugo offloads WebSocket load from the main backend: 1 million connections per node is a standard scenario. It fits cases where the stack is already chosen (Django, Laravel, Express) and there is no desire to bolt on a full realtime engine. Centrifugo also ships with built-in features: presence, history with TTL, JWT authentication, recovery after reconnection.
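The split between "backend authenticates the user" and "Centrifugo holds the connection" comes down to two small artifacts: a signed token the client presents when it connects, and a JSON body the backend posts in order to publish. The sketch below builds both with the standard library only. It assumes HS256 tokens with `sub` and `exp` claims; the secret, user ID and channel name are hypothetical, and the endpoint URL and API-key header are deployment-specific and deliberately left out.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def connection_token(user_id: str, secret: str, ttl_seconds: int = 3600) -> str:
    """Build an HS256 JWT whose `sub` claim identifies the user."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": user_id, "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


def publish_body(channel: str, data: dict) -> bytes:
    """JSON body the backend would POST to the service's publish endpoint."""
    return json.dumps({"channel": channel, "data": data}).encode("utf-8")


# The backend hands the token to the browser; the browser presents it on connect.
token = connection_token("user-42", secret="dev-only-secret")

# Later, any backend process can publish without holding a single WebSocket.
body = publish_body("notifications", {"text": "order shipped"})
print(token.count("."), body)
```

In production a maintained JWT library would normally replace the hand-rolled signing; it is spelled out here only to show that nothing beyond a shared secret ties the two services together.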

Comparison: Action Cable is embedded in Rails - integration convenience versus a low scaling ceiling. Phoenix Channels offer the highest scalability but require an Elixir stack. SignalR delivers type-safe RPC and transport fallback, ideal for .NET applications. Centrifugo is a universal sidecar that works with any backend stack, at the cost of dual authentication (application -> Centrifugo via JWT). The choice is driven not by performance but by which stack is already in use and how critical scaling and built-in features are.

Misconception: realtime framework choice is determined by performance

The primary factor is the stack the team already runs and the expected scale. Framework performance differs significantly, but those differences rarely become the bottleneck below 100k concurrent connections.

All four frameworks handle typical SaaS load (1-50k connections). Phoenix is the obvious choice only when a million connections are planned. Below that threshold, the team's stack drives the choice: switching from Ruby/Rails to Elixir purely for realtime rarely pays off, while Centrifugo provides similar scalability without migration.

When is Centrifugo preferable to Action Cable / SignalR / Phoenix Channels?

Key ideas

  • **Action Cable** is embedded in Rails, with tight integration with Active Record/Devise, but the Ruby GVL caps a single process at ~3-4k connections; ideal when the stack is already Rails
  • **Phoenix Channels** runs on BEAM; lightweight isolated processes (~2 KB) plus built-in pg-pubsub make a million connections per node architecturally natural
  • **SignalR** offers the Hubs RPC abstraction over WebSocket with a typed interface, transport fallback (WS -> SSE -> Long Polling), and a managed Azure SignalR Service option
  • **Centrifugo** is a standalone Go service: language-agnostic, a sidecar for any backend stack, with built-in presence/history/JWT
  • **The dominant selection criterion is the team's stack, not raw performance**; below 100k connections all four options handle the load comfortably

Related topics

Realtime frameworks evolve at the intersection of several directions:

  • Scaling WebSocket — All four frameworks rely on a pub/sub backplane for horizontal scaling; Redis/NATS/pg-distribution are three typical solutions
  • Authentication — JWT, session cookies, OAuth - the choice depends on how close the framework sits to the existing authorization system (Action Cable natively to Rails sessions, Centrifugo to JWT)
  • Real-time architecture patterns — The 'realtime as a service' pattern via Centrifugo contrasts with an embedded framework in Rails/Phoenix; the choice defines the operational model

Questions for reflection

  • If Phoenix Channels support a million connections per node and Action Cable supports three to four thousand per process, why do most Rails projects still pick Action Cable rather than migrate to Elixir?
  • Centrifugo and Azure SignalR Service are both 'realtime as a service', yet structured very differently. What is the key difference in their operational models?
  • Which invariant helps decide between an embedded framework and a standalone service before a team hits scaling limits?

Related lessons

  • rt-12 — Previous RT ecosystems lesson
  • rt-14 — Framework choice opens the path to production deployment
  • net-15-tcp-basics — WebSocket builds on TCP - foundation for framework comparison
  • aie-08-streaming — LLM streaming via SSE uses the same RT push architecture
  • ds-01-intro — Distributed systems and RT ecosystems solve similar scaling problems
  • net-36-websocket