Real-Time Backend
WebSocket Security
A WebSocket connection can live for hours. In that time the JWT expires, permissions change, and attackers look for ways to open a session as someone else. Securing a short-lived HTTP request and securing a long-lived WebSocket connection are fundamentally different problems.
- **Discord** authenticates 10M+ concurrent WebSocket connections through the Identify payload; on token compromise the connection closes within 1 second via push revocation
- **Slack** uses one-time ticket auth for WebSocket: HTTP POST /api/rtm.connect returns a wss URL with a ticket valid for 30 seconds and bound to the IP
- **Cloudflare** in 2023 mitigated the HTTP/2 Rapid Reset attack (CVE-2023-44487), which abuses HTTP/2 stream cancellation rather than WebSocket itself; the peak hit 201 million RPS, the largest attack Cloudflare had recorded at that point
- **Binance WebSocket API** requires HMAC signatures for trading operations; each order is signed with the API secret key, so compromising the WebSocket without the key does not enable trading
WebSocket Authentication
A WebSocket connection starts with an HTTP Upgrade request, the only moment when standard HTTP authentication mechanisms are available. Discord authenticates millions of WebSocket connections through a token in the first message after connect (the Identify payload). Slack uses a per-connection token in a URL parameter during Upgrade. Once the connection is up, cookies and headers are no longer accessible.
Three WebSocket authentication patterns: (1) Token in URL: wss://api.example.com/ws?token=JWT; the URL ends up in server and proxy logs, so putting a long-lived JWT there is not recommended for production. (2) Authorization header during Upgrade: works only if the client supports it (the browser WebSocket API does not allow setting headers). (3) First-message auth: the connection is accepted, and the client must send an auth message within N seconds or it is disconnected. Discord uses option 3.
- **Ticket auth**: HTTP POST /ws-ticket -> receive a one-time token -> wss://host/ws?ticket=TOKEN; the ticket lives 30s and is bound to the IP
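- **Cookie auth**: the browser sends an HttpOnly cookie automatically on Upgrade; works for web clients, not for mobile, and must be paired with Origin validation because the cookie is also sent on cross-site connections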
- **Close code 4xxx**: codes 4000-4999 are reserved for private use by applications (RFC 6455); the meanings are up to you, for example 4001=unauthorized, 4003=forbidden, 4429=rate limit
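The ticket flow from the list above can be sketched as follows. This is a minimal in-memory illustration, not Slack's implementation: the `TicketStore` class, its method names, and the injectable clock are assumptions made for the example, and a multi-server deployment would need a shared store instead of a local dict.

```python
import secrets
import time
from typing import Optional


class TicketStore:
    """Hypothetical in-memory store for one-time WebSocket tickets."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._tickets = {}   # ticket -> (user_id, ip, expires_at)

    def issue(self, user_id: str, ip: str) -> str:
        """Called from an already authenticated HTTP endpoint such as POST /ws-ticket."""
        ticket = secrets.token_urlsafe(32)
        self._tickets[ticket] = (user_id, ip, self._clock() + self._ttl)
        return ticket

    def redeem(self, ticket: str, ip: str) -> Optional[str]:
        """Called during the Upgrade request. Returns the user id, or None on failure."""
        # pop() makes the ticket single-use: even a failed attempt burns it.
        entry = self._tickets.pop(ticket, None)
        if entry is None:
            return None
        user_id, bound_ip, expires_at = entry
        if self._clock() > expires_at or ip != bound_ip:
            return None
        return user_id
```

Because the ticket is removed on the first redemption attempt, a ticket copied from an access log is useless once the legitimate client has connected, and the IP binding narrows the window before that.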
A browser app needs to authenticate a WebSocket. The Authorization header is not available in the browser WebSocket API. Which pattern is safest?
WebSocket Authorization
Authorization in WebSocket is harder than in REST: the connection is persistent, but the user's permissions can change while they are connected. Discord manages 10M+ concurrent connections. When a user is kicked from a server, Discord must immediately close their WebSocket and remove them from every guild channel without waiting for the next message.
A capability-based approach beats role-based for real-time systems. Instead of checking roles on every op, issue a token with specific permissions (channel:read:123, channel:write:456) on connect. Permissions change only via reconnect with a new token. Slack uses this: when channel permissions change, Slack invalidates every member's WebSocket within 1 second.
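A capability check of this kind can be as small as a set lookup. The sketch below assumes the capabilities arrive as a space-separated string in a token that has already been verified; the `Capabilities` class and `parse_scopes` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """Immutable set of grants such as 'channel:read:123' (illustrative format)."""
    grants: frozenset

    def allows(self, resource: str, action: str, resource_id: str) -> bool:
        # Exact-match lookup: no role resolution and no database call per operation.
        return f"{resource}:{action}:{resource_id}" in self.grants


def parse_scopes(scope_claim: str) -> Capabilities:
    """Build capabilities from a space-separated scope string of a verified token."""
    return Capabilities(frozenset(scope_claim.split()))


caps = parse_scopes("channel:read:123 channel:write:456")
can_read = caps.allows("channel", "read", "123")    # granted
can_write = caps.allows("channel", "write", "123")  # not granted for channel 123
```

The object is frozen on purpose: under this model permissions never mutate on a live connection, they are replaced when the client reconnects with a new token.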
- **Per-message auth check**: too expensive at high message rates; cache permissions in memory with a 30-60s TTL
- **Push revocation**: when permissions change in the DB, publish an event to Redis so every WS server closes the affected user's connections
- **Subscription validation**: check permissions not only on subscribe but on each broadcast (the recipient may have lost rights)
- **Audit log**: log every WebSocket connection with userId, IP, timestamp for forensics
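Push revocation needs each WS server to know which open connections belong to which user. The sketch below shows only that server-side half, under stated assumptions: `ConnectionRegistry` and `FakeConnection` are hypothetical, the event is assumed to be a JSON string with a `user_id` field, and the Redis subscriber that would call `handle_revocation` is left out.

```python
import json
from collections import defaultdict


class FakeConnection:
    """Stand-in for a real WebSocket object; records the close it received."""

    def __init__(self):
        self.closed_with = None

    def close(self, code: int, reason: str) -> None:
        self.closed_with = (code, reason)


class ConnectionRegistry:
    """Tracks the open connections of each user on one WS server."""

    def __init__(self):
        self._by_user = defaultdict(set)

    def add(self, user_id: str, conn) -> None:
        self._by_user[user_id].add(conn)

    def remove(self, user_id: str, conn) -> None:
        self._by_user[user_id].discard(conn)
        if not self._by_user[user_id]:
            del self._by_user[user_id]

    def handle_revocation(self, raw_event: str) -> int:
        """Process one pub/sub message; returns how many connections were closed."""
        event = json.loads(raw_event)
        conns = self._by_user.pop(event["user_id"], set())
        for conn in conns:
            # 4003 follows the application close-code convention used in this lesson.
            conn.close(4003, event.get("reason", "permissions revoked"))
        return len(conns)
```

Every server runs the same handler, so the publisher does not need to know where the user's connections live; servers that hold none simply close zero connections.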
A user is subscribed to a private chat's WebSocket channel. An admin removed them from the chat. When do they stop receiving messages?
Token Refresh in WebSocket
A JWT has a TTL (usually 15-60 minutes). In a REST API the client just gets a 401 and refreshes. In WebSocket the connection is persistent, so you cannot just break it mid-stream. Discord refresh tokens expire after 7 days; Discord does not drop the connection but sends a special dispatch event telling the client to refresh without reconnecting.
Sliding expiration is an alternative: the token auto-extends on activity. If the user has not sent a message in 30 minutes, the connection closes. Convenient for chats but unacceptable for financial systems that need a strict TTL. Binance WebSocket API closes the connection at exactly 24 hours regardless of activity; the client must reconnect.
- **Proactive refresh**: the server warns 5 minutes before expiry; the client fetches a new access token via REST and sends it through WS
- **Grace period**: after the token expires give 30-60s for a refresh before closing the connection
- **Token binding**: bind the JWT to an IP or TLS fingerprint; an IP change invalidates the token immediately
- **TTL by sensitivity**: 15 min for a trading API, up to 7 days for chat; the more sensitive the operation, the shorter the token should live
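The proactive-refresh and grace-period rules above amount to a small state function the server can evaluate on a timer and on every incoming message. The sketch below is one possible reading: the `TokenState` names and the default thresholds (300s warning, 60s grace) are assumptions taken from the ranges in the list, and rejecting ordinary messages during the grace window is a policy choice, not something the source prescribes.

```python
from enum import Enum


class TokenState(Enum):
    VALID = "valid"  # process messages normally
    WARN = "warn"    # still valid; tell the client to fetch a new token
    GRACE = "grace"  # expired; accept only a refresh message
    CLOSE = "close"  # grace window over; close the connection


def token_state(now: float, expires_at: float,
                warn_before: float = 300.0, grace: float = 60.0) -> TokenState:
    """Classify a connection's token by comparing timestamps in seconds."""
    if now < expires_at - warn_before:
        return TokenState.VALID
    if now < expires_at:
        return TokenState.WARN
    if now < expires_at + grace:
        return TokenState.GRACE
    return TokenState.CLOSE
```

When the client sends a fresh token over the socket, the server verifies it and replaces `expires_at`, which moves the connection back to `VALID` without a reconnect.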
A WebSocket connection has been active for 2 hours. The JWT expired 10 minutes ago, the connection is not closed. What does the server do with incoming messages?
WebSocket Security Headers
A WebSocket Upgrade request is an HTTP request, so the same security headers apply. The HTTP/2 Rapid Reset attack (CVE-2023-44487) that Cloudflare mitigated in 2023 showed how much load a single HTTP/2 connection can generate by opening streams and cancelling them immediately. A WebSocket Upgrade endpoint faces a similar risk from floods of connection attempts, so Origin validation and rate limiting at the Upgrade endpoint are the first line of defense.
Cross-Site WebSocket Hijacking (CSWSH) is the WebSocket analog of CSRF: a malicious site opens a WebSocket to the target API on the victim's behalf using their cookies. Defense: validate the Origin header on the server, use a CSRF token in the first message, or use ticket-based auth. WSS (WebSocket Secure) is mandatory in production - without TLS every byte is visible to intermediaries.
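The Origin check is easy to get wrong with substring or suffix matching, so the sketch below normalizes the header and compares it against an exact allowlist. The allowlist value and function names are placeholders. Rejecting a missing Origin is an assumption that fits cookie-authenticated browser endpoints; non-browser clients often send no Origin and would need a different auth path.

```python
from typing import Optional
from urllib.parse import urlsplit

# Placeholder allowlist; a real service would load this from configuration.
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})


def normalize_origin(origin: str) -> str:
    """Reduce an Origin header to lowercase scheme://host[:port]."""
    parts = urlsplit(origin.strip())
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}"


def is_allowed_origin(origin_header: Optional[str],
                      allowed=ALLOWED_ORIGINS) -> bool:
    """Return True only for an exact match against the allowlist."""
    if not origin_header:
        return False  # policy choice: no Origin, no cookie-authenticated upgrade
    return normalize_origin(origin_header) in allowed
```

Exact matching means that https://app.example.com.evil.com and the opaque origin "null" are both rejected, which a check based on startswith or endswith would not guarantee.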
- **Origin validation**: check the Origin header on Upgrade; block unknown origins
- **WSS only**: refuse ws:// in production; serve the app over HTTPS with HSTS so browsers block plain ws:// connections as mixed content
- **Max connections per IP**: 10-50 concurrent; protection against connection flooding
- **Max payload**: cap one message size (64 KB - 1 MB); protection against memory exhaustion
- **Ping/pong timeout**: close connections without pong after 30-60s; protection against zombie connections
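- **CSP connect-src**: restrict which endpoints your own pages may open WebSocket connections to (for example connect-src 'self' wss://api.example.com); this limits what an injected script can reach but does not stop CSWSH from another site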
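The per-IP connection cap and the payload cap from the list can be enforced with a few lines of bookkeeping. The sketch below is illustrative only: `ConnectionLimiter` is a hypothetical class, the defaults are picked from the ranges above, and a deployment behind a proxy would first have to decide which address counts as the client IP.

```python
from collections import Counter


class ConnectionLimiter:
    """Tracks open connections per IP and checks message size."""

    def __init__(self, max_per_ip: int = 20, max_payload_bytes: int = 64 * 1024):
        self.max_per_ip = max_per_ip
        self.max_payload_bytes = max_payload_bytes
        self._open = Counter()

    def try_acquire(self, ip: str) -> bool:
        """Call on Upgrade; False means the handshake should be rejected."""
        if self._open[ip] >= self.max_per_ip:
            return False
        self._open[ip] += 1
        return True

    def release(self, ip: str) -> None:
        """Call when a connection closes, whatever the reason."""
        if self._open[ip] > 0:
            self._open[ip] -= 1
        if self._open[ip] == 0:
            del self._open[ip]  # Counter ignores deletion of missing keys

    def payload_ok(self, data: bytes) -> bool:
        """Call before parsing a message; oversized frames should close the socket."""
        return len(data) <= self.max_payload_bytes
```

Calling `release` from the connection's cleanup path matters as much as `try_acquire`: a missed release slowly locks legitimate users out of their own IP.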
Myth: WebSocket is automatically CSRF-safe because the browser does not let a page read cross-origin responses
Reality: the Same-Origin Policy does not restrict WebSocket connections, so WebSocket is vulnerable to CSWSH: a malicious page can open a connection and send commands on the victim's behalf
Explanation: CORS restricts reading cross-origin HTTP responses. WebSocket relies on a different mechanism, the Origin header, and the server has to check it itself. If the server does not check Origin, any site can open an authenticated WS to the target API using the user's cookies. That is CSWSH (Cross-Site WebSocket Hijacking), the WebSocket analog of CSRF
An attacker on evil.com opens a WebSocket to api.bank.com using the victim's cookies. Which header blocks this attack first?
Takeaways
- **Authentication at connect time**: ticket-based (one-time token obtained through REST) or first-message auth; long-lived tokens in URLs are unsafe because URLs end up in logs
- **Push revocation**: a permission change in the DB must immediately close active WebSocket connections through Redis pub/sub
- **Token refresh**: the server proactively warns 5 minutes before JWT expiry; 30-60s grace period; then forced close
- **CSWSH defense**: Origin header validation on Upgrade is the main defense against Cross-Site WebSocket Hijacking; WSS is mandatory in production
Related topics
WebSocket security builds on shared security and real-time mechanisms:
- **Real-time rate limiting**: rate limiting on the WebSocket Upgrade endpoint and on message frequency is mandatory protection against flooding and abuse
- **Financial trading**: trading WebSocket APIs (Binance, Coinbase) use HMAC signatures on top of TLS; compromising the connection without the key does not enable trading
- **IoT Real-Time**: MQTT over WebSocket (port 443) uses device TLS certificates; the same revocation patterns apply on device compromise
Questions for reflection
- A user logged in on their work computer and left an open tab with a WebSocket chat. 8 hours later they were fired and their AD access was revoked. What chain of events should close their WebSocket connection?
- A mobile app with WebSocket moves to the background on iOS. 3 minutes later the JWT expires. How do you properly handle the moment the app returns to foreground?
- An attacker intercepted a one-time WebSocket ticket (say, from an access log). The ticket is valid for 60 seconds. What additional measures reduce the risk of it being used?