Cloud Computing

Load Balancing and CDN

In October 2021, Facebook went dark for about 6 hours after a faulty configuration change withdrew its BGP routes, at an estimated cost of USD 60 million in lost revenue. The lesson: one weak point in the traffic path can take down a global service. AWS offers a whole stack to guard against such failures: ALB, NLB, CloudFront, and Global Accelerator - each solving its own problem in the reliability pyramid.

  • **Netflix** shows what edge delivery can do: by some estimates about 15% of global downstream internet traffic is Netflix video, served from caches close to viewers. Netflix runs its own CDN (Open Connect) for this rather than CloudFront, but the principle is the one CloudFront applies
  • **Twitch** ingests live video over RTMP - the protocol runs over TCP and is not HTTP, so ALB does not fit; an L4 balancer such as NLB forwards these connections with very low added latency at high connection counts
  • **Salesforce CRM** routes enterprise traffic through Global Accelerator: clients in Australia connect to the nearest PoP, then traffic travels over the private AWS backbone to us-east-1, bypassing the unpredictable public internet

Application Load Balancer (ALB)

In one account of a 2016 incident at Monday.com, the project management site went down under sale-day load - 50 000 requests per second hitting a single EC2 instance. The root cause was not insufficient capacity but the absence of a load balancer to spread requests across instances. **Application Load Balancer (ALB)** operates at L7 (HTTP/HTTPS) and distributes requests across target groups based on URL path, host, HTTP headers, HTTP method, query string, and source IP; it does not inspect the request body. A service like Netflix can route `/api/recommendations` to one microservice cluster and `/api/streaming` to another - using a single DNS name.

ALB supports content-based routing: path-based (`/images/*` -> a dedicated target group of instances, IP addresses, or a Lambda function; S3 buckets cannot be ALB targets), host-based (`api.example.com` vs `www.example.com`), query-string routing, and weighted target groups for blue/green deployments. Sticky sessions are implemented via the load-balancer-generated AWSALB cookie with a configurable duration.
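
The rule evaluation described above can be sketched in a few lines of Python. This is a simplified model, not the ALB implementation: the `Rule` class, the target group names, and the use of `fnmatchcase` for wildcard matching are assumptions made for illustration (real ALB patterns support only `*` and `?`, while `fnmatchcase` also accepts bracket classes).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int                        # lower number is evaluated first
    target_group: str                    # hypothetical target group name
    path_pattern: Optional[str] = None   # e.g. "/api/v2/*" (case-sensitive)
    host: Optional[str] = None           # e.g. "api.example.com"


def pick_target_group(rules, host, path, default="tg-default"):
    """Return the target group of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        # Host names are compared case-insensitively.
        if rule.host is not None and not fnmatchcase(host.lower(), rule.host.lower()):
            continue
        if rule.path_pattern is not None and not fnmatchcase(path, rule.path_pattern):
            continue
        return rule.target_group
    return default  # plays the role of the listener's default action


RULES = [
    Rule(priority=10, target_group="tg-api-v2", path_pattern="/api/v2/*"),
    Rule(priority=20, target_group="tg-static", path_pattern="/static/*"),
    Rule(priority=30, target_group="tg-api-host", host="api.example.com"),
]

print(pick_target_group(RULES, "www.example.com", "/api/v2/users"))    # tg-api-v2
print(pick_target_group(RULES, "www.example.com", "/static/app.css"))  # tg-static
print(pick_target_group(RULES, "api.example.com", "/health"))          # tg-api-host
print(pick_target_group(RULES, "www.example.com", "/about"))           # tg-default
```

The first matching rule wins, so more specific rules should carry lower priority numbers than broad catch-alls.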

Which ALB routing type routes `/api/v2/*` requests to one target group and `/static/*` to another?

Network Load Balancer (NLB)

If ALB is a smart postman that reads every letter, **Network Load Balancer (NLB)** is a courier who only looks at the address on the envelope. Operating at L4 (TCP/UDP/TLS), NLB handles millions of requests per second with latencies in the hundreds of microseconds. NLB sits in front of stock-trading platforms and game servers where an extra 10 ms costs real money. NLB exposes a static IP address in each Availability Zone - critical for partners who whitelist specific IPs in their firewalls.

NLB can preserve the client source IP all the way to the backend (this is the default for instance targets), whereas ALB terminates the connection and passes the client address in the X-Forwarded-For header. An internet-facing NLB supports one Elastic IP per Availability Zone, giving partners a predictable static list to whitelist. Cross-zone load balancing is optional and affects cost (inter-AZ traffic is billed).
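
Because NLB works at L4, it picks a target per connection rather than per request. AWS documents this as a flow hash over the protocol, source and destination addresses and ports, and the TCP sequence number. The sketch below illustrates the idea only: the actual hash function is not public, so SHA-256 is a stand-in, and the target names and addresses are hypothetical.

```python
import hashlib


def pick_target(flow, targets):
    """Map a connection's flow tuple to one target, deterministically.

    flow: (protocol, src_ip, src_port, dst_ip, dst_port, tcp_seq)
    SHA-256 is an assumption used for illustration, not NLB's real hash.
    """
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


TARGETS = ["i-aaa", "i-bbb", "i-ccc"]  # hypothetical instance IDs
flow = ("tcp", "203.0.113.7", 51514, "198.51.100.10", 443, 123456)

# Every packet of the same connection carries the same tuple,
# so it always lands on the same target.
assert pick_target(flow, TARGETS) == pick_target(flow, TARGETS)
print(pick_target(flow, TARGETS))
```

A new connection from the same client gets a different source port and sequence number, so it may be hashed to a different target.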

A trading platform requires partners to add fixed load balancer IPs to their firewall whitelist. Which AWS load balancer is the best fit?

CloudFront CDN

Amazon Prime Video reaches an audience of some 200 million Prime members. Video load speed in Brazil must not depend on whether the origin server sits in us-east-1. **CloudFront** solves this through a network of over 600 Points of Presence (PoPs) in 90+ cities: users receive content from the nearest edge node, not from across the planet. The difference is physics: light travels through fiber at roughly 200 000 km/s, so a 12 000 km fiber route adds at least 60 ms of latency each way (about 120 ms per round trip), before any routing or queuing delay.
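
The latency figure follows from dividing distance by signal speed. A minimal sketch of that arithmetic, using the 200 000 km/s fiber speed and 12 000 km distance from the text (the function names are illustrative):

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in optical fiber


def one_way_delay_ms(distance_km, speed_km_per_s=FIBER_SPEED_KM_PER_S):
    """Lower bound on one-way propagation delay, in milliseconds."""
    return distance_km * 1000 / speed_km_per_s


def round_trip_ms(distance_km):
    """Lower bound on round-trip time: the signal travels there and back."""
    return 2 * one_way_delay_ms(distance_km)


print(one_way_delay_ms(12_000))  # 60.0
print(round_trip_ms(12_000))     # 120.0
print(round_trip_ms(100))        # 1.0, e.g. a PoP about 100 km away
```

Real paths are longer than the straight-line distance and add router and queuing delay, so these numbers are a floor, not an estimate.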

CloudFront decides how long to keep an object from the origin's Cache-Control headers and the TTL settings on the distribution. Even for dynamic requests, a PoP is still beneficial: the TCP and TLS connection terminates at the nearest edge node, and the request then travels over AWS's optimized backbone to the origin. Invalidation costs USD 0.005 per path (the first 1 000 paths each month are free). Origin Shield adds a regional cache tier between PoPs and the origin to reduce origin load.
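
To make the caching and invalidation rules concrete, here is a small simulation of a single edge cache. It is a sketch under simplifying assumptions, not CloudFront's behavior in full: only `max-age` is honored (no `s-maxage`, minimum or maximum TTL), `no-cache` is treated as "do not store", time is passed in explicitly, and the class and function names are invented for the example.

```python
import re


def max_age_seconds(cache_control, default_ttl=0):
    """Extract max-age from a Cache-Control value; 0 means do not cache."""
    if re.search(r"\b(no-store|no-cache|private)\b", cache_control, re.I):
        return 0
    match = re.search(r"\bmax-age=(\d+)", cache_control, re.I)
    return int(match.group(1)) if match else default_ttl


class EdgeCache:
    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: path -> (body, cache_control)
        self._entries = {}               # path -> (body, expires_at)
        self.origin_requests = 0

    def get(self, path, now):
        entry = self._entries.get(path)
        if entry is not None and now < entry[1]:
            return entry[0], "Hit"       # served from the edge, origin untouched
        body, cache_control = self._fetch(path)
        self.origin_requests += 1
        ttl = max_age_seconds(cache_control)
        if ttl > 0:
            self._entries[path] = (body, now + ttl)
        else:
            self._entries.pop(path, None)
        return body, "Miss"

    def invalidate(self, path):
        """Drop a cached object before its TTL expires."""
        self._entries.pop(path, None)


def invalidation_cost_usd(paths_this_month, free_paths=1000, price_per_path=0.005):
    """Monthly invalidation charge using the prices quoted in the text."""
    return max(0, paths_this_month - free_paths) * price_per_path


def origin(path):
    return f"body of {path}", "public, max-age=60"


cache = EdgeCache(origin)
print(cache.get("/logo.png", now=0)[1])   # Miss
print(cache.get("/logo.png", now=30)[1])  # Hit
print(cache.get("/logo.png", now=61)[1])  # Miss (TTL of 60 s expired)
print(cache.origin_requests)              # 2
```

Three requests reached the edge but only two reached the origin; at real traffic volumes that ratio is what reduces origin load.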

Which CloudFront feature reduces origin load by adding an intermediate cache tier between edge nodes and the origin?

AWS Global Accelerator

At first glance, Global Accelerator looks like CloudFront - both use AWS's global PoP infrastructure. The fundamental difference: CloudFront caches content, while **Global Accelerator** routes TCP/UDP traffic directly to backend servers over AWS's private backbone, bypassing the public internet. The public internet is unpredictable: BGP routes shift, packets drop. The AWS backbone is dedicated fiber with SLA guarantees. Salesforce uses Global Accelerator for enterprise CRM traffic precisely for this stability.

Global Accelerator provides two static anycast IP addresses that work globally. Traffic enters the AWS network at the nearest PoP and travels over the private backbone to the endpoint. It supports health checks and automatic failover across regions within 30-60 seconds. Pricing starts at a fixed USD 0.025 per accelerator per hour plus a data transfer fee - often more expensive than CloudFront, but it works with any TCP/UDP protocol.
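
The failover behavior can be modeled as "send traffic to the closest endpoint group that passes its health check". The sketch below is an illustration only: the regions, latency figures, and names are hypothetical, and features such as traffic dials and endpoint weights are left out. The fee helper uses the hourly rate quoted above and ignores data transfer.

```python
from dataclasses import dataclass


@dataclass
class EndpointGroup:
    region: str
    latency_ms: float     # hypothetical latency from the client's edge location
    healthy: bool = True  # result of the most recent health check


def route(groups):
    """Pick the lowest-latency healthy endpoint group."""
    healthy = [g for g in groups if g.healthy]
    if not healthy:
        raise RuntimeError("no healthy endpoint group")
    return min(healthy, key=lambda g: g.latency_ms).region


def fixed_fee_usd(hours, hourly_rate=0.025):
    """Fixed accelerator fee only; data transfer is billed separately."""
    return hours * hourly_rate


groups = [
    EndpointGroup("us-east-1", latency_ms=20),
    EndpointGroup("eu-west-1", latency_ms=90),
]

print(route(groups))         # us-east-1
groups[0].healthy = False    # simulate a regional outage
print(route(groups))         # eu-west-1
```

Clients keep using the same two anycast IPs throughout; only the routing decision inside the AWS network changes, which is why no DNS update is needed during failover.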

Misconception: Global Accelerator and CloudFront are interchangeable - both accelerate global traffic.

Reality: CloudFront accelerates via HTTP content caching; Global Accelerator accelerates via private-network routing without any caching - they address different use cases.

Confusion arises because both services use AWS PoP infrastructure. The acceleration mechanism is fundamentally different: cache hits vs. a better network path.

What is the fundamental difference between AWS Global Accelerator and CloudFront CDN?

Key Ideas

  • **ALB (L7)** - smart HTTP balancer: path/host/header routing, weighted target groups for canary deployments, gRPC and WebSocket out of the box
  • **NLB (L4)** - fast pass-through balancer: sub-millisecond latency, static Elastic IPs per AZ, transparent client source IP preservation
  • **CloudFront** accelerates HTTP via caching at 600+ PoPs; **Global Accelerator** accelerates any TCP/UDP via private-network routing - different tools for different problems

Related Topics

Load balancing and CDN sit on top of AWS's core network infrastructure:

  • VPC and Network Isolation - ALB and NLB are deployed inside a VPC; subnets and security groups govern their accessibility
  • DNS and Route 53 - Route 53 points traffic to ALB/NLB via alias records; Global Accelerator provides anycast IPs instead of DNS names

Questions for Reflection

  • An application uses WebSockets for real-time notifications and REST for data. What combination of load balancers would you choose and why?
  • Why does a CDN speed up even non-cacheable API requests - and is the acceleration worth the added cost?
  • How would failover proceed if an entire AWS region failed, given Global Accelerator is in front of two regions?

Related Lessons

  • net-40-cdn