Real-Time Backend

Low-Latency Live

Viewers abandon a stream once it lags real time by more than 30 seconds, because people posting spoilers in chat reach them before the video does. The gap between 2 seconds and 30 seconds is the difference between interactive live video and a recording.

**Twitch** moved from legacy HLS (20-30 s) to LL-HLS in 2022, and latency dropped to 2-5 s. For sports, a WebRTC path is enabled so that viewers see the action ahead of the delayed TV broadcast.
**TikTok LIVE** serves 100M+ concurrent viewers through WebRTC SFUs deployed in every region. Sub-200 ms latency is critical for the 'gifts' feature: viewers need to see the streamer's reaction before they send the next gift.
**YouTube Live** uses LL-HLS with a 2-second ultra-low-latency mode for live events. Standard mode (30 s) allows aggressive CDN caching and cuts distribution cost by about 3x.
**Cloudflare Stream** implemented WHIP/WHEP for WebRTC ingest and egress, delivering <500 ms latency from 200+ PoPs without operators having to manage their own SFU.

LL-HLS: partial segments

Classic HLS delivers video in multi-second segments, and the player buffers several of them, so end-to-end delay lands at 6-30 seconds, which is unacceptable for live events. Low-Latency HLS (LL-HLS), introduced by Apple in 2019, slices those segments into **partial segments** of 200-500 ms and ships them to the client before the full segment finishes. The original 2019 draft relied on HTTP/2 Push for this; the current specification uses Blocking Playlist Reload together with preload hints instead.
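To make the idea of partial segments concrete, here is a minimal sketch that renders the tail of a media playlist for one segment that is still being encoded. The tag names (`EXT-X-PART-INF`, `EXT-X-SERVER-CONTROL`, `EXT-X-PART`, `EXT-X-PRELOAD-HINT`) come from the LL-HLS specification; the function name `render_open_segment`, the URI naming scheme, and the choice of a hold-back of three part targets are assumptions made for illustration, and the output is a fragment rather than a complete production playlist.

```python
from typing import List


def render_open_segment(
    msn: int,
    part_durations: List[float],
    part_target: float,
    target_duration: int,
) -> str:
    """Render a playlist fragment for a segment whose parts are still arriving.

    msn             -- media sequence number of the in-progress segment
    part_durations  -- durations (seconds) of the parts published so far
    part_target     -- advertised maximum part duration (seconds)
    target_duration -- advertised maximum full-segment duration (seconds)
    """
    if any(d > part_target for d in part_durations):
        raise ValueError("a part must not be longer than PART-TARGET")

    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:9",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-PART-INF:PART-TARGET={part_target:.3f}",
        # Assumption: hold the player back by three part targets.
        "#EXT-X-SERVER-CONTROL:CAN-BLOCK-RELOAD=YES,"
        f"PART-HOLD-BACK={3 * part_target:.3f}",
        f"#EXT-X-MEDIA-SEQUENCE:{msn}",
    ]
    # Each published part is listed before the full segment exists.
    for index, duration in enumerate(part_durations):
        lines.append(
            f'#EXT-X-PART:DURATION={duration:.3f},URI="seg{msn}.part{index}.m4s"'
        )
    # Hint the next part so the client can request it ahead of time.
    next_index = len(part_durations)
    lines.append(
        f'#EXT-X-PRELOAD-HINT:TYPE=PART,URI="seg{msn}.part{next_index}.m4s"'
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_open_segment(273, [0.333, 0.333], 0.333, 4))
```

With a 4-second segment and 333 ms parts, a client can start fetching media roughly a third of a second after it was encoded instead of waiting for the whole segment to close.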

Twitch moved to LL-HLS in 2022: latency dropped from 20-30 s to 2-5 s while delivery stayed CDN-compatible. YouTube Live uses a similar approach and claims 2 seconds for events with ultra-low-latency mode enabled.

Blocking Playlist Reload is the key LL-HLS mechanism: the client requests the playlist and specifies the media sequence number (MSN) and part number it expects next, using the `_HLS_msn` and `_HLS_part` query parameters. The server holds the request and answers only when that part is ready. That removes polling and shrinks time-to-first-byte.
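The request side of Blocking Playlist Reload can be sketched as a small URL builder. The `_HLS_msn` and `_HLS_part` query parameters are the delivery directives defined by the LL-HLS specification; the function name `blocking_reload_url` and the example host are hypothetical, and a real player would also handle timeouts and playlists that do not advertise `CAN-BLOCK-RELOAD=YES`.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def blocking_reload_url(
    playlist_url: str, msn: int, part: Optional[int] = None
) -> str:
    """Return playlist_url with blocking-reload directives appended.

    msn  -- media sequence number the client is waiting for
    part -- optional part index inside that segment
    """
    if msn < 0 or (part is not None and part < 0):
        raise ValueError("msn and part must be non-negative")

    pieces = urlsplit(playlist_url)
    # Keep unrelated query parameters (for example auth tokens),
    # but drop stale directives left over from a previous request.
    query = [
        (key, value)
        for key, value in parse_qsl(pieces.query, keep_blank_values=True)
        if key not in ("_HLS_msn", "_HLS_part")
    ]
    query.append(("_HLS_msn", str(msn)))
    if part is not None:
        query.append(("_HLS_part", str(part)))
    return urlunsplit(pieces._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Ask for part 2 of segment 273; the server is expected to hold the
    # response until that part has been published.
    print(blocking_reload_url("https://example.com/live/video.m3u8", 273, 2))
```

Because every client waiting for the same part sends the same URL, a CDN can collapse those requests into a single origin fetch, which is what keeps this approach cache-friendly.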

WebRTC for streaming: WHIP/WHEP

Sub-second delivery: stack and trade-offs

Latency optimization: buffers, GOP, and ABR

Takeaways

Related topics

Questions for reflection

Related lessons