RabbitMQ
TTL: message, queue, and per-publish expiration
A startup builds a 'send-an-email-in-30-minutes' feature using TTL+DLX. The retry queue is classic, with `x-message-ttl: 1800000`. Tests pass. In production, a marketing batch publishes 50k messages with per-message `expiration='5000'` for an urgent flash sale into the same queue. None of the urgent emails go out for 30 minutes; they sit behind the slow ones, because expiry is only checked at the head of the queue. Whose fault is it: the developer, the docs, or the lazy TTL? Trick question: the fix is to know all three flavours of TTL and how each queue type evaluates them.
- **RabbitMQ delayed message exchange plugin** was open-sourced by the core team because the TTL+DLX workaround was so widely used and so easy to misconfigure.
- **MassTransit** (a popular .NET messaging framework) builds its scheduling subsystem on TTL+DLX as the default, and on the delayed-message plugin as an opt-in upgrade.
- **Postmark** publishes engineering posts on using x-expires for per-tenant queues that auto-clean as customers churn out, avoiding a manual cleanup cron.
Prerequisites
- DLX wiring and dead-letter routing
- Difference between queue arguments and message properties
- Awareness of queue types (classic, quorum)
TTL grew out of AMQP, the delayed plugin out of pain
AMQP 0-9-1 specified per-message `expiration` but no queue-level cap. RabbitMQ added `x-message-ttl` and `x-expires` as proprietary extensions early on (around 2.0) to support the common 'expire after N seconds' pattern without forcing publishers to stamp every message. The TTL+DLX delayed-message trick was discovered by users, not designed by the team; the official delayed-message exchange plugin landed in 3.5 as an explicit response to the many community blog posts explaining the workaround.
Three flavours of TTL
RabbitMQ exposes time-to-live at three different scopes, and mixing them up is one of the top sources of weird production behaviour. `x-message-ttl` is a queue argument: every message published into the queue dies after that many milliseconds. The `expiration` property is per-message: the publisher overrides TTL for a single payload, in milliseconds, encoded as a string. `x-expires` is the **queue's own TTL**: a queue that goes unused for that duration (no consumers, no redeclaration, no `basic.get`) is deleted automatically. Publishing into the queue does not count as use.
| Mechanism | Scope | Where set | Failure mode if misused |
|---|---|---|---|
| x-message-ttl | Every message in this queue | Queue declaration arguments | Mixed TTLs blocked at head on classic queues |
| expiration property | Single message | Publisher BasicProperties (string ms) | Forgetting the string type breaks the publish or the expiry, depending on the client |
| x-expires | The queue object itself | Queue declaration arguments | Queue vanishes if no consumer is attached for that long (for example, during a long deploy or outage) |
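The `expiration` property must be a **string** of milliseconds, because AMQP 0-9-1 defines the field as a short string. Passing an integer is a common pika/amqplib mistake, and what happens next depends on the client: some reject the value when encoding the properties, others may coerce it. Do not rely on either behaviour. Always `str(ms)` before assigning, and make sure the string parses as a non-negative integer.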
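A minimal sketch of that conversion, assuming a pika-style publish call. `expiration_ms` is a hypothetical helper name, not part of any client library, and the queue name in the comment is illustrative.

```python
def expiration_ms(delay_ms: int) -> str:
    """Return a per-message TTL in the string form the `expiration` property expects."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(delay_ms, bool) or not isinstance(delay_ms, int):
        raise TypeError("delay_ms must be an int number of milliseconds")
    if delay_ms < 0:
        raise ValueError("delay_ms must be non-negative")
    return str(delay_ms)


# Usage with pika (not executed here; it needs a connected channel):
#   channel.basic_publish(
#       exchange="",
#       routing_key="retry.5s",
#       body=b"payload",
#       properties=pika.BasicProperties(expiration=expiration_ms(5000)),
#   )
print(expiration_ms(5000))  # prints 5000
```

Routing every publish through one helper like this keeps the integer-versus-string decision in a single place instead of at each call site.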
A queue is declared with `x-message-ttl: 30000`. A publisher sends a message with `expiration='60000'`. When does the message expire?
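When both values are set, RabbitMQ applies the lower of the two. A minimal sketch of that rule, with `effective_ttl_ms` as a hypothetical helper name used only for illustration:

```python
from typing import Optional


def effective_ttl_ms(queue_ttl_ms: Optional[int],
                     message_expiration: Optional[str]) -> Optional[int]:
    """Return the TTL the broker applies, or None if neither TTL is set."""
    candidates = []
    if queue_ttl_ms is not None:
        candidates.append(queue_ttl_ms)
    if message_expiration is not None:
        # The per-message value travels as a string of milliseconds.
        candidates.append(int(message_expiration))
    return min(candidates) if candidates else None


print(effective_ttl_ms(30000, "60000"))  # prints 30000
```

For the question above, the queue-level 30 seconds wins: a publisher can shorten a message's life below the queue TTL but cannot extend it.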
Lazy on classic, eager on quorum
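Classic queues check TTL only when a message reaches the head of the queue (lazy expiry), so an expired message deep in the backlog stays put until everything ahead of it is consumed or expires. Quorum queues are treated here as evaluating TTL eagerly, dead-lettering expired entries even if they sit deep in the backlog. Confirm this on your broker version before depending on it, because the RabbitMQ TTL documentation states that quorum queues also dead-letter expired messages when they reach the head of the queue. The choice matters whenever per-message TTLs vary inside a single queue or when timely DLX routing of stale work is critical.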
The lazy-vs-eager distinction is the single biggest reason teams get bitten when migrating retry topologies from classic to quorum (or vice versa). A retry queue that 'worked' on classic because every message had the same TTL behaves very differently when one bad publisher starts sending short-TTL pokes.
If you must keep classic queues but need eager-looking expiry, declare **one queue per TTL tier** (`retry.1s`, `retry.5s`, `retry.30s`). Each queue has uniform TTL, so lazy evaluation does not cause head-of-line blocking.
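A minimal sketch of the tiered layout, assuming the three queue names above and dead-lettering back to the work queue through the default exchange. The escalation policy (attempt 1 uses the shortest tier, later attempts move up) and the `orders.work` name are assumptions for illustration.

```python
from typing import Dict

# One queue per TTL tier, shortest delay first (dicts keep insertion order).
RETRY_TIERS_MS: Dict[str, int] = {
    "retry.1s": 1000,
    "retry.5s": 5000,
    "retry.30s": 30000,
}


def tier_arguments(ttl_ms: int, work_queue: str) -> dict:
    """Queue arguments for one tier: uniform TTL, dead-letter back to the work queue."""
    return {
        "x-message-ttl": ttl_ms,
        "x-dead-letter-exchange": "",             # default exchange
        "x-dead-letter-routing-key": work_queue,  # default exchange routes by queue name
    }


def retry_queue_for_attempt(attempt: int) -> str:
    """Assumed policy: attempt 1 -> first tier, capped at the last tier."""
    names = list(RETRY_TIERS_MS)
    index = min(max(attempt, 1), len(names)) - 1
    return names[index]


for attempt in (1, 2, 3, 7):
    print(attempt, retry_queue_for_attempt(attempt))
print(tier_arguments(RETRY_TIERS_MS["retry.5s"], "orders.work"))
```

Because publishers choose a queue instead of stamping `expiration`, every message in a given tier ages at the same rate and the head of the queue is always the next one to expire.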
A classic retry queue has `x-message-ttl: 10000`. Messages publish in this order: A (no override), B (`expiration='2000'`), C (`expiration='1000'`). When does C dead-letter?
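Head-of-line blocking can be reasoned about with a small model. The sketch below is a simulation, not broker code, and it assumes all messages are published at time zero, no consumer is attached, and expiry is only checked at the head of the queue.

```python
from typing import Dict, List, Optional, Tuple


def head_only_dead_letter_times(
    queue_ttl_ms: int,
    messages: List[Tuple[str, Optional[str]]],
) -> Dict[str, int]:
    """Return the time (ms after publish) at which each message is dead-lettered."""
    times: Dict[str, int] = {}
    head_free_at = 0  # moment at which everything ahead has left the queue
    for name, expiration in messages:
        # Effective TTL is the lower of the queue TTL and the per-message value.
        ttl = queue_ttl_ms if expiration is None else min(queue_ttl_ms, int(expiration))
        # A message cannot leave before it expires or before it reaches the head.
        head_free_at = max(head_free_at, ttl)
        times[name] = head_free_at
    return times


print(head_only_dead_letter_times(10000, [("A", None), ("B", "2000"), ("C", "1000")]))
# {'A': 10000, 'B': 10000, 'C': 10000}
```

Under these assumptions C waits the full 10 seconds behind A, even though its own TTL is one second. Publishing in the order C, B, A would instead release them at 1, 2 and 10 seconds.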
TTL + DLX = delayed messages
The most popular use of TTL is not actually message lifecycle management; it is delayed delivery. Publish a payload into a buffer queue whose only job is to hold it for the TTL; when the TTL fires, the broker dead-letters it into the real work queue. The receiver sees the message once the delay has elapsed, never before it, and later only if something ahead of it in the buffer queue has a longer TTL. This is how to schedule a reminder or back off a retry without an external scheduler.
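A minimal sketch of the buffer-queue wiring, assuming a pika-style channel whose `queue_declare` accepts `queue`, `durable` and `arguments`. The queue names and the `RecordingChannel` stand-in are hypothetical; the stand-in exists only so the example runs without a broker.

```python
def declare_delay_topology(channel, work_queue: str, delay_queue: str, delay_ms: int) -> dict:
    """Declare a work queue plus a buffer queue that dead-letters into it after delay_ms."""
    channel.queue_declare(queue=work_queue, durable=True)
    delay_args = {
        "x-message-ttl": delay_ms,                # hold every message this long
        "x-dead-letter-exchange": "",             # default exchange
        "x-dead-letter-routing-key": work_queue,  # default exchange routes by queue name
    }
    # No consumer ever attaches to the buffer queue; expiry is the only way out.
    channel.queue_declare(queue=delay_queue, durable=True, arguments=delay_args)
    return delay_args


class RecordingChannel:
    """Stand-in for a real channel: records declarations instead of sending them."""

    def __init__(self):
        self.calls = []

    def queue_declare(self, **kwargs):
        self.calls.append(kwargs)


channel = RecordingChannel()
args = declare_delay_topology(channel, "email.send", "email.delay.30m", 30 * 60 * 1000)
print(args["x-message-ttl"])  # prints 1800000
```

With a real connection, the same function can be passed the channel from `pika.BlockingConnection(...).channel()`. Publishers then send to `email.delay.30m` and consumers read only from `email.send`.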
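For arbitrary per-message delays, install the official `rabbitmq_delayed_message_exchange` plugin: a new exchange type `x-delayed-message` holds each payload until the delay given in its `x-delay` header (milliseconds) has elapsed, then routes it as usual. It avoids the lazy-TTL trap entirely because every message carries its own timer. It has limits of its own: delayed messages are stored on a single node, and the plugin's README warns that it is not a good fit for very large numbers of pending messages (hundreds of thousands or more).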
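A minimal sketch of what the plugin expects from a client: the exchange is declared with type `x-delayed-message` and an `x-delayed-type` argument naming the underlying routing behaviour, and each message carries its delay in an `x-delay` header. The helper names and the `jobs.delayed` exchange name are hypothetical.

```python
def delayed_exchange_kwargs(name: str, routing_type: str = "direct") -> dict:
    """Keyword arguments for declaring a delayed-message exchange."""
    return {
        "exchange": name,
        "exchange_type": "x-delayed-message",
        "durable": True,
        # Tells the plugin how to route once the delay has elapsed.
        "arguments": {"x-delayed-type": routing_type},
    }


def delay_headers(delay_ms: int) -> dict:
    """Message headers asking the plugin to hold the message for delay_ms."""
    if delay_ms < 0:
        raise ValueError("delay_ms must be non-negative")
    return {"x-delay": delay_ms}  # an integer header, unlike the string `expiration`


# Usage with pika (not executed here; needs a broker with the plugin enabled):
#   channel.exchange_declare(**delayed_exchange_kwargs("jobs.delayed"))
#   channel.basic_publish(
#       exchange="jobs.delayed",
#       routing_key="email.send",
#       body=b"payload",
#       properties=pika.BasicProperties(headers=delay_headers(5000)),
#   )
print(delayed_exchange_kwargs("jobs.delayed"))
print(delay_headers(5000))
```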
Delayed-message via TTL+DLX is a transport-level trick, not a real scheduler. It cannot survive broker upgrades that change queue type, it ignores wall-clock skews on restart, and it cannot cancel a scheduled message. For human-facing reminders with strict timing, layer an idempotent scheduler service on top.
You build a `retry.5s` queue with `x-message-ttl: 5000` and DLX back to the work queue. A producer publishes a message with `expiration='1000'` into `retry.5s`. The retry queue is classic. After how long does the message reach the work queue?
x-expires: the queue lifecycle TTL
`x-expires` is a separate concept: how long the **queue object** lives without being used (no consumers, no redeclaration, no `basic.get`). It is meant for short-lived per-session queues: RPC reply queues, per-WebSocket fan-outs, scratch queues for ephemeral subscribers. It exists to stop your broker from drowning in orphaned per-user queues over time. It does not affect any individual message's TTL.
| Use case | x-expires sensible? | Why |
|---|---|---|
| RPC reply queue, per request | Yes (15-60s) | Lifetime is bounded by the RPC timeout |
| Per-session UI subscriber | Yes (5-15m) | Cleans up after user closes the tab |
| Main work queue (orders, payments) | Never | A consumer outage longer than the expiry would silently delete the queue and its bindings |
| DLQ | Maybe (14-30d) | Cold storage; pair with archival job before expiry |
If you do set `x-expires` on a long-lived queue, make sure `queue_declare` runs on every connection: redeclaring with identical arguments is idempotent and renews the expiry lease. Otherwise a slow Sunday with consumers offline plus a producer that does not redeclare equals a Monday morning of messages dropped as unroutable, because the queue and its bindings are gone.
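A minimal sketch of redeclaring on every connection, assuming a pika-style channel. The `orders.dlq` name, the 14-day value and the `RecordingChannel` stand-in are illustrative assumptions; the stand-in only lets the example run without a broker.

```python
FOURTEEN_DAYS_MS = 14 * 24 * 60 * 60 * 1000


def declare_expiring_queue(channel, queue: str, expires_ms: int) -> None:
    """Declare (or redeclare) a queue with x-expires; call this on every new connection."""
    if expires_ms <= 0:
        raise ValueError("x-expires must be a positive number of milliseconds")
    # The arguments must match the existing queue exactly, otherwise the broker
    # rejects the declaration with PRECONDITION_FAILED.
    channel.queue_declare(queue=queue, durable=True, arguments={"x-expires": expires_ms})


class RecordingChannel:
    """Stand-in for a real channel: records declarations instead of sending them."""

    def __init__(self):
        self.calls = []

    def queue_declare(self, **kwargs):
        self.calls.append(kwargs)


channel = RecordingChannel()
declare_expiring_queue(channel, "orders.dlq", FOURTEEN_DAYS_MS)  # first connection
declare_expiring_queue(channel, "orders.dlq", FOURTEEN_DAYS_MS)  # after a reconnect
print(channel.calls[0] == channel.calls[1])  # prints True
```

Keeping the arguments in one function matters more than it looks: if two services declare the same queue with different `x-expires` values, the second declaration fails instead of renewing the lease.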
Misconception: `x-message-ttl` and `x-expires` are basically the same thing at different scales.
Reality: they are completely independent concerns. `x-message-ttl` ages messages out of a long-lived queue; `x-expires` ages the queue object itself out of the broker.
Confusing them is how production loses durable queues over a long holiday. Always document which one you mean in your queue declarations, and never set x-expires on infrastructure queues that must always exist.
A team sets `x-expires: 86400000` (24h) on every queue, including the main `orders.work` queue, to 'keep the broker tidy'. What happens on a quiet weekend?
Take it home
- Three TTLs, three scopes: `x-message-ttl` (queue-level), `expiration` (per-message string-ms), `x-expires` (queue lifetime).
- Per-message + queue-level TTL combine as `min(per_message, queue_ttl)`. Smaller value wins.
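- Classic queues evaluate TTL lazily at the head, so mixing TTLs in one classic queue causes head-of-line blocking. Confirm quorum-queue expiry behaviour on your broker version before relying on it for mixed-TTL designs.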
- TTL+DLX is the canonical delayed-delivery hack. Install the delayed-message plugin for arbitrary per-message timing, within its documented volume limits.
- Never set `x-expires` on an infrastructure queue; a weekend with no consumers attached will delete it.
Where this leads
TTL is a foundational primitive for retry, scheduling, and per-session topology. The next lesson revisits all of it through the lens of queue types.
- Quorum vs classic vs streams — Queue type changes whether TTL is lazy or eager, and whether DLX even exists
- Acknowledgements: ack, nack, requeue, multiple — TTL-driven dead-letter is the broker-side equivalent of nack(requeue=false)
Related lessons
- db-19-redis — Redis EXPIRE provides the same per-key TTL model, useful for comparison.
- sd-07-caching — TTL is a basic cache invalidation strategy, the pattern carries over to queues.
- rmq-10-dlx-dlq — TTL is often combined with DLX to build delayed retry and parking lot patterns.