Backend Transport

Backpressure and Flow Control

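Consider a Node.js server that receives 10K requests per second and places each one on an in-memory queue, while the consumer processes only 1K per second. The queue grows by 9K messages every second, so after 10 seconds 90K messages sit in memory. Eventually the OOM killer fires, the service crashes, restarts, and runs out of memory again: the classic backpressure death spiral. The fix does not require extra servers, just bounded queues.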
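The arithmetic behind the scenario can be checked with a short sketch. The function names, the message size, and the memory limit are illustrative assumptions, not figures from the lesson.

```python
from typing import Optional


def backlog(arrival_rate: int, service_rate: int, seconds: int) -> int:
    """Messages waiting in an unbounded queue after `seconds` of steady load."""
    # The queue grows by the difference between arrivals and completions.
    growth_per_second = max(arrival_rate - service_rate, 0)
    return growth_per_second * seconds


def seconds_until_full(arrival_rate: int, service_rate: int,
                       bytes_per_message: int, memory_limit_bytes: int) -> Optional[float]:
    """Time until the backlog alone exceeds the limit, or None if it never grows."""
    growth_per_second = max(arrival_rate - service_rate, 0)
    if growth_per_second == 0:
        return None
    return memory_limit_bytes / (growth_per_second * bytes_per_message)


print(backlog(10_000, 1_000, 10))  # 90000, matching the scenario above
```

With a hypothetical 1 KiB per message, the same rates fill memory at roughly 9 MiB per second, so the time to exhaustion depends only on how much headroom the process has.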

**Kafka Consumer** uses explicit polling as implicit backpressure: the consumer controls consumption rate. If processing is slow, it polls less frequently.
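**gRPC** implements flow control per stream and per connection via HTTP/2 WINDOW_UPDATE frames. A slow receiver delays sending WINDOW_UPDATE, the sender's window runs out, and the sender pauses until more credit arrives.
**Node.js streams** implement backpressure through the return value of write() and the 'drain' event on writable streams, and through pause/resume on readable streams. pipe() propagates backpressure automatically across multiple pipeline stages.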

What is Backpressure

Backpressure is a mechanism where the consumer signals the producer to slow down. Without backpressure a fast producer overflows the buffers of a slow consumer: memory is exhausted and the system crashes.
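A minimal sketch of this definition, assuming an in-process producer and consumer connected by Python's `asyncio.Queue`. The capacity of 10 and the item count are arbitrary illustration values. When the queue is full, `put()` suspends the producer, which is the "slow down" signal.

```python
import asyncio


async def producer(queue: asyncio.Queue, n: int) -> None:
    for i in range(n):
        # put() suspends while the queue is full: this is the backpressure signal.
        await queue.put(i)
    await queue.put(None)  # sentinel: no more items


async def consumer(queue: asyncio.Queue, high_water: list) -> int:
    processed = 0
    while True:
        # Record the deepest backlog observed, for demonstration only.
        high_water[0] = max(high_water[0], queue.qsize())
        item = await queue.get()
        if item is None:
            return processed
        await asyncio.sleep(0)  # stand-in for slow processing
        processed += 1


async def run(n: int = 1000, capacity: int = 10):
    queue = asyncio.Queue(maxsize=capacity)
    high_water = [0]
    _, processed = await asyncio.gather(
        producer(queue, n), consumer(queue, high_water)
    )
    return processed, high_water[0]


processed, max_depth = asyncio.run(run())
print(processed, max_depth)
```

No matter how fast the producer is, the backlog never exceeds the queue capacity, so memory use stays bounded.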

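Backpressure for asynchronous streams was standardized by the Reactive Streams initiative, started in late 2013 by engineers at Netflix, Pivotal, and Typesafe (now Lightbend). The problem appears everywhere: Kafka consumer lag, Node.js writable streams, gRPC flow control, TCP receiver window.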

A system with an unbounded queue starts consuming all available memory. What is this called?

TCP Flow Control

TCP has built-in backpressure via the receiver window (rwnd). The receiver advertises how many bytes it can still accept, and the sender may not have more than rwnd unacknowledged bytes in flight. When the receive buffer fills, rwnd drops to zero and the sender stops until the receiver advertises a non-zero window again.
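The window mechanism can be sketched as a toy model. This is a deliberate simplification: it ignores acknowledgements, the congestion window, and zero-window probes, and the class and function names are hypothetical.

```python
class Receiver:
    def __init__(self, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        self.buffered = 0  # bytes received but not yet read by the application

    @property
    def rwnd(self) -> int:
        # Advertised window: free space left in the receive buffer.
        return self.buffer_size - self.buffered

    def receive(self, nbytes: int) -> None:
        if nbytes > self.rwnd:
            raise ValueError("sender violated the advertised window")
        self.buffered += nbytes

    def app_read(self, nbytes: int) -> int:
        # The application drains the buffer, which reopens the window.
        taken = min(nbytes, self.buffered)
        self.buffered -= taken
        return taken


def send(receiver: Receiver, pending: int) -> int:
    """Send as much of `pending` as the window allows; return bytes sent."""
    allowed = min(pending, receiver.rwnd)
    receiver.receive(allowed)
    return allowed


rx = Receiver(buffer_size=4096)
first = send(rx, 10_000)    # fills the buffer: 4096 bytes go out
stalled = send(rx, 5_904)   # rwnd is 0, nothing can be sent
rx.app_read(1024)           # the application consumes 1024 bytes
resumed = send(rx, 5_904)   # the window reopened by exactly 1024 bytes
print(first, stalled, resumed)
```

The sender's progress is tied to how fast the receiving application reads, not to how fast the network can carry bytes.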

A common mistake: large TCP buffers at the OS level (raised via net.core.rmem_max) can hide backpressure problems. The sending application sees its writes succeed, but the data accumulates in kernel buffers instead of being processed.

The receiver sets rwnd=0. What happens to the TCP sender?

Reactive Streams

Reactive Streams is a specification (initiated in 2013, version 1.0 released in 2015) for asynchronous stream processing with non-blocking backpressure. It is implemented by RxJava, Project Reactor (Spring), and Akka Streams. The subscriber controls the data rate through request(n).

Project Reactor is the foundation of WebFlux (Spring 5+). The backpressure mechanism is request(n), which the subscriber calls on its Subscription: it asks for N elements, and the publisher may emit at most N until more are requested.
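The demand-driven contract can be illustrated with a small sketch. It is a synchronous Python approximation, not the Reactive Streams API itself. The real specification is asynchronous and also defines onSubscribe, onError, onComplete, and cancellation. The `Subscription` class here is a hypothetical stand-in.

```python
from typing import Callable, Iterator


class Subscription:
    """Hands out items only against demand signalled via request(n)."""

    def __init__(self, source: Iterator[int], on_next: Callable[[int], None]) -> None:
        self._source = source
        self._on_next = on_next
        self.completed = False

    def request(self, n: int) -> None:
        if n <= 0:
            raise ValueError("demand must be positive")
        # Emit at most n items, then wait for the next request.
        for _ in range(n):
            try:
                item = next(self._source)
            except StopIteration:
                self.completed = True
                return
            self._on_next(item)


received = []
sub = Subscription(iter(range(10)), received.append)
sub.request(3)
after_first = list(received)  # only the three requested items arrived
sub.request(2)
print(after_first, received)
```

The publisher side never pushes ahead of demand: between calls to `request`, the source iterator is simply not advanced.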

How does the Reactive Streams subscriber signal readiness for more data?

Bounded Queues and Rejection Policies

A bounded queue has a fixed capacity. When it is full, one of four policies is typically applied: Block (wait for space), Drop Oldest, Drop Newest, or CallerRuns (execute in the calling thread - natural backpressure).
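The two drop policies can be compared in a minimal sketch. The `BoundedQueue` class and the policy names are hypothetical. The Block policy is not shown because Python's standard `queue.Queue(maxsize=...)` already provides it through a blocking `put()`.

```python
from collections import deque


class BoundedQueue:
    def __init__(self, capacity: int, policy: str) -> None:
        if policy not in ("drop_oldest", "drop_newest"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0

    def offer(self, item) -> bool:
        """Enqueue without blocking; return False if the new item was rejected."""
        if len(self.items) < self.capacity:
            self.items.append(item)
            return True
        self.dropped += 1
        if self.policy == "drop_oldest":
            # Evict the head to make room: favors fresh data.
            self.items.popleft()
            self.items.append(item)
            return True
        # drop_newest: keep what is already queued, reject the arrival.
        return False


oldest = BoundedQueue(3, "drop_oldest")
newest = BoundedQueue(3, "drop_newest")
for i in range(5):
    oldest.offer(i)
    newest.offer(i)
print(list(oldest.items), list(newest.items))
```

Drop Oldest tends to suit telemetry or price ticks where stale items lose value, while Drop Newest preserves work that has already waited in line.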

CallerRunsPolicy in Java ThreadPoolExecutor is a clean form of backpressure: the calling thread executes the task directly, naturally slowing down the producer. No extra threads, no OOM.
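The idea can be emulated outside Java. The following is a Python approximation, not the Java implementation. `CallerRunsExecutor` is a hypothetical wrapper around `concurrent.futures.ThreadPoolExecutor` that uses a semaphore to model the bounded queue and runs the task on the submitting thread when no permit is free.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class CallerRunsExecutor:
    def __init__(self, workers: int, queue_capacity: int) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        # One permit per task that may be running or queued in the pool.
        self._permits = threading.BoundedSemaphore(workers + queue_capacity)

    def submit(self, fn, *args) -> Future:
        if self._permits.acquire(blocking=False):
            future = self._pool.submit(fn, *args)
            future.add_done_callback(lambda _: self._permits.release())
            return future
        # Saturated: the submitting thread does the work itself, so it
        # cannot submit anything else until this task has finished.
        future = Future()
        try:
            future.set_result(fn(*args))
        except Exception as exc:
            future.set_exception(exc)
        return future

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


def demo():
    def blocked(gate):
        gate.wait()
        return threading.current_thread().name

    def report():
        return threading.current_thread().name

    gate = threading.Event()
    executor = CallerRunsExecutor(workers=1, queue_capacity=1)
    a = executor.submit(blocked, gate)  # occupies the single worker
    b = executor.submit(blocked, gate)  # occupies the single queue slot
    c = executor.submit(report)         # no permit left: runs in the caller
    gate.set()
    executor.shutdown()
    return a.result(), b.result(), c.result()
```

In `demo()`, the third task reports the name of the submitting thread, while the first two report a pool worker. The producer pays for overload with its own time, which is what throttles it.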

Why is CallerRunsPolicy in ThreadPoolExecutor a form of backpressure?

Load Shedding

Load Shedding is deliberately dropping requests when the system is overloaded rather than letting all requests degrade slowly. Better to serve 80% of users well than 100% poorly.

HTTP 503 Service Unavailable with Retry-After header is the correct signal for load shedding. Clients should implement backoff and retry after the specified interval.
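A minimal admission-control sketch of this signal, assuming a single-threaded server loop (no locking) and an in-flight request limit as the overload criterion. `LoadShedder`, its parameters, and the tuple-shaped response are illustrative assumptions, not tied to any framework.

```python
from typing import Dict, Optional, Tuple


class LoadShedder:
    def __init__(self, max_in_flight: int, retry_after_seconds: int) -> None:
        self.max_in_flight = max_in_flight
        self.retry_after_seconds = retry_after_seconds
        self.in_flight = 0

    def try_admit(self) -> Optional[Tuple[int, Dict[str, str]]]:
        """Return None if the request is admitted, else a (status, headers) rejection."""
        if self.in_flight >= self.max_in_flight:
            # Reject immediately and tell the client when to come back.
            return 503, {"Retry-After": str(self.retry_after_seconds)}
        self.in_flight += 1
        return None

    def release(self) -> None:
        # Call when an admitted request finishes, successfully or not.
        self.in_flight -= 1


shedder = LoadShedder(max_in_flight=2, retry_after_seconds=5)
results = [shedder.try_admit() for _ in range(3)]
print(results)
```

The rejection is cheap to produce, which is the point: capacity is spent on requests that can still be served well.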

Common misconception: backpressure and rate limiting are the same thing

Backpressure is a signal from consumer to producer to slow down. Rate limiting rejects requests that exceed a defined threshold.

Backpressure is cooperative: the producer adapts to capacity. Rate limiting is a hard boundary. Both are needed in production.
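For contrast with the cooperative mechanisms above, rate limiting can be sketched as a token bucket. This is one common technique, chosen here for illustration. The lesson does not name a specific algorithm, and the explicit `now` argument is an assumption made so the example is deterministic.

```python
class TokenBucket:
    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Over the threshold: reject regardless of how busy the consumer is.
        return False


bucket = TokenBucket(rate=1, capacity=2)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0),
             bucket.allow(1.0), bucket.allow(1.0)]
print(decisions)
```

The limiter never looks at consumer state. It enforces a fixed threshold, whereas backpressure adapts the producer to whatever capacity the consumer currently has.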

Why is load shedding preferable to serving all requests when the system is overloaded?

Summary

**Backpressure** is a signal from consumer to producer: slow down. Without it - buffer bloat and OOM on the consumer side.
**TCP flow control** via rwnd is backpressure at the network level. HTTP/2 and gRPC add per-stream flow control on top.
**Bounded queues + rejection policies** - CallerRunsPolicy for natural backpressure, Drop for load shedding. Unbounded queues are an anti-pattern.

Questions for Reflection

How does a Kafka consumer group handle a situation where one partition has 1M messages but the others are empty?
When should load shedding occur at the API Gateway level rather than inside the service?
Reactive Streams request(n) allows requesting exactly N elements. What is the optimal N for a database-to-Kafka migration pipeline?

Related Lessons

alg-20-greedy
