Backend Transport

API Gateway and Service Mesh

Netflix runs 500+ microservices, and a mobile app cannot know the URL of each one. Who validates the JWT on every request? Who absorbs a DDoS attack? Who logs every request? Who routes 2% of users to a canary release? An API Gateway is the answer to all of these questions.

  • **Kong** (formerly Mashape) is the most popular open-source API Gateway: 100M+ downloads, used at Adobe, Microsoft, and Honeywell. Plugin architecture in Lua on top of nginx.
  • **AWS API Gateway** processes hundreds of billions of requests per month for thousands of companies. Integrates with Lambda, ECS, and Cognito out of the box.
  • **Istio** is used in Google Cloud, IBM Cloud, and Red Hat OpenShift. It manages mTLS for service-to-service communication without any code changes.

API Gateway Pattern

An API Gateway is a single entry point for all clients (mobile, web, partners). Instead of clients calling 50 microservices directly, the Gateway routes requests, aggregates responses, and applies cross-cutting concerns such as auth, rate limiting, and logging.
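The routing and cross-cutting steps described above can be sketched in a few lines of Python. This is a minimal illustration, not how Kong or any other gateway is implemented: the route table, the internal hostnames, and the `handle_request` function are hypothetical, and the auth check only tests for the presence of a bearer token instead of validating a JWT.

```python
from typing import Optional

# Hypothetical route table: public path prefix -> internal service base URL.
ROUTES = {
    "/users": "http://users.internal:8080",
    "/orders": "http://orders.internal:8080",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Return the upstream base URL for the longest matching path prefix."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return None
    return ROUTES[max(matches, key=len)]


def handle_request(path: str, headers: dict) -> tuple:
    """Apply cross-cutting checks, then decide where to forward the request.

    Returns (status_code, target_url); target_url is None when rejected.
    """
    # Cross-cutting concern: authentication (placeholder, no JWT validation).
    if not headers.get("Authorization", "").startswith("Bearer "):
        return 401, None
    upstream = resolve_upstream(path)
    if upstream is None:
        return 404, None
    # A real gateway would proxy the request here; the sketch only
    # returns the URL it would forward to.
    return 200, upstream + path
```

The client only ever sees the public paths; the mapping to internal services lives in one place and can change without touching any client.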

Netflix built Zuul Gateway to handle 1B+ requests per day. Amazon API Gateway processes hundreds of billions of requests per month.

What is the main advantage of API Gateway for microservice architecture?

Envoy Proxy

Envoy is a high-performance L4/L7 proxy developed at Lyft and open-sourced in 2016. Written in C++, it handles HTTP/1.1, HTTP/2, gRPC, and raw TCP. It is the foundation of most modern service mesh solutions (Istio, AWS App Mesh, Consul Connect). It is configured dynamically through the xDS APIs, a family of discovery services through which a control plane pushes listeners, routes, clusters, and endpoints.

Envoy as a sidecar: in Kubernetes, an Envoy container runs inside each pod, next to the application container. It intercepts all incoming and outgoing pod traffic. The application is unaware of Envoy; the proxy is transparent to it.

Why is Envoy used as a sidecar in Kubernetes rather than as a separate service?

Service Mesh and Istio

Service Mesh is an infrastructure layer for managing service-to-service communication. Istio (Google, IBM, Lyft) is the most popular service mesh: it deploys an Envoy sidecar into every pod and manages configuration through a control plane.

Alternatives to Istio: Linkerd (simpler, less overhead), Consul Connect (HashiCorp), AWS App Mesh. As a rough estimate, Istio adds about 5 ms of latency per request and about 50 MB of RAM per sidecar. For 1000 pods that is 1000 × 50 MB = 50 GB of RAM for sidecars alone.

What is the main advantage of Service Mesh over library-based approaches (Hystrix, Resilience4j)?

Rate Limiting Algorithms

Rate Limiting protects APIs from abuse and overload. Algorithms differ in accuracy and in how they treat burst traffic. Token Bucket (used by Stripe and AWS) is the most flexible: it allows bursts up to the bucket size, then enforces the steady refill rate.
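A minimal in-process sketch of Token Bucket, assuming a single process and a caller-supplied clock. The class name `TokenBucket` and its parameters are illustrative and not taken from any particular gateway.

```python
class TokenBucket:
    """Token Bucket limiter: bursts up to `capacity`, then `refill_rate` per second."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # the bucket starts full
        self.last_refill = now

    def allow(self, now: float) -> bool:
        """Return True and consume one token if the request may pass."""
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The caller passes the current time explicitly, which keeps the sketch deterministic and easy to test; a production limiter would more likely read a monotonic clock itself.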

Redis with a Lua script is the standard implementation for distributed rate limiting. Redis runs each Lua script atomically, so the read-modify-write of the limiter state cannot race with other clients. Alternative: the redis-cell module (GCRA algorithm), which exposes a single command, CL.THROTTLE.
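One possible shape of the Redis + Lua approach, sketched with the redis-py client. The key prefix `ratelimit:`, the hash fields `tokens` and `ts`, and the `make_limiter` helper are assumptions made for illustration; the script also trusts the caller's clock, which is a simplification.

```python
import time

# Lua source that Redis executes atomically. KEYS[1] is the bucket key;
# ARGV = capacity, refill rate (tokens/s), current time in seconds.
TOKEN_BUCKET_LUA = """
local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local state = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1]) or capacity
local ts = tonumber(state[2]) or now
tokens = math.min(capacity, tokens + math.max(0, now - ts) * rate)
local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end
redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
redis.call('EXPIRE', KEYS[1], math.ceil(capacity / rate) * 2)
return allowed
"""


def make_limiter(client, capacity: int, rate: float):
    """Build allow(key) on top of a redis-py client such as redis.Redis()."""
    script = client.register_script(TOKEN_BUCKET_LUA)

    def allow(key: str) -> bool:
        # Assumption: gateway instances have roughly synchronized clocks.
        result = script(keys=[f"ratelimit:{key}"], args=[capacity, rate, time.time()])
        return result == 1

    return allow
```

Because the whole read, refill, and decrement sequence runs inside one script, two gateway instances checking the same key cannot both spend the last token.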

A client sends 200 requests in 1 second, limit is 100 req/s with Token Bucket (capacity=100). What happens?

Observability at the Gateway

API Gateway is the ideal place for collecting observability data: all requests pass through it. The three pillars: Metrics (Prometheus), Logs (structured JSON), Traces (OpenTelemetry, Jaeger). The Gateway adds a request ID for correlation.

Kong, AWS API Gateway, and NGINX can generate a request ID and propagate it to upstream services in a header such as X-Request-ID, although in some of them this has to be enabled through a plugin or configuration. The ID makes it possible to follow one request through the entire service chain in the logs.
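The correlation mechanism can be sketched as follows. The function names and log fields are hypothetical, and the sketch assumes header names arrive with exactly this capitalization, which real HTTP stacks do not guarantee.

```python
import json
import uuid


def ensure_request_id(headers: dict) -> dict:
    """Return a copy of the headers that is guaranteed to carry X-Request-ID."""
    # Reuse an ID supplied by the caller, otherwise generate a new one.
    request_id = headers.get("X-Request-ID") or uuid.uuid4().hex
    return {**headers, "X-Request-ID": request_id}


def access_log_line(headers: dict, method: str, path: str,
                    status: int, duration_ms: float) -> str:
    """Build one structured JSON log line keyed by the request ID."""
    return json.dumps({
        "request_id": headers["X-Request-ID"],
        "method": method,
        "path": path,
        "status": status,
        "duration_ms": duration_ms,
    })
```

If every downstream service logs the same `request_id` field, a single search for that value returns the whole path of the request.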

Myth: API Gateway and Service Mesh solve the same problem, so you must choose one

In practice, an API Gateway handles north-south traffic (client -> cluster) and a Service Mesh handles east-west traffic (service -> service). They are often used together.

API Gateway manages external clients: auth, rate limiting, public endpoints. Service Mesh manages internal communication: mTLS, circuit breaking, internal observability. Istio + Kong is a typical production combination.

Why is P99 latency more important than average response time for monitoring API Gateway?
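The gap between the average and P99 is easy to see on a small synthetic sample. The latency values below are made up for illustration, and `percentile` uses the simple nearest-rank definition; monitoring systems often compute percentiles from histograms instead.

```python
import math
import statistics


def percentile(samples: list, p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in ms: 98 fast requests and 2 slow ones.
latencies_ms = [10] * 98 + [1000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
```

Here the mean is about 30 ms and looks healthy, while P99 is 1000 ms: 2% of requests took a full second, and the average hides them.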

Summary

  • **API Gateway** - single entry point: centralizes auth, rate limiting, logging. Clients do not know the internal service topology.
  • **Envoy + Istio** - service mesh for east-west traffic: mTLS between services, circuit breaking, canary deployments without code changes.
  • **Rate Limiting** - Token Bucket for burst-tolerant APIs (Stripe), Leaky Bucket for constant rate. Redis + Lua for distributed implementation.

Related Topics

API Gateway and Service Mesh work together with other infrastructure layers:

  • Distributed Tracing and Observability: the Gateway generates the request ID and the first span, which starts the distributed trace through all downstream services
  • Transport Security: the Gateway is the place for SSL termination, JWT validation, and WAF; the Service Mesh adds mTLS for internal communication

Questions for Reflection

  • When does a Service Mesh become excessive overhead, so that library-based solutions (Resilience4j, Hystrix) are the simpler choice?
  • How does an API Gateway affect latency? At what P99 latency of the Gateway itself should alternatives be considered?
  • How would you implement a canary deployment via API Gateway for 1% of users deterministically, so that the same user always sees the new version?

Related Lessons

  • net-64-api-gateway: API Gateway and Service Mesh
