PostgreSQL
Autovacuum Tuning
Instagram 2013: the database unexpectedly halted on a Sunday at 3 AM. Transaction ID wraparound. Autovacuum had been throttled 'for performance'. 2.1 billion transactions, and PostgreSQL went into read-only emergency mode. 3 hours of downtime, emergency VACUUM FREEZE. After that: autovacuum tuned aggressively, txid age monitoring every hour. Proper autovacuum is not load, it is insurance.
- **Instagram** txid wraparound incident 2013: taught the whole industry to monitor age(relfrozenxid). They now run per-table autovacuum settings for 1000+ tables
- **Cloudflare** per-table autovacuum: large events tables (1B+ rows) with scale_factor=0.001 and cost_delay=0. config tables (100 rows): default settings
- **Aiven** managed PostgreSQL: automatically applies aggressive per-table settings to tables > 10M rows, alerts on wraparound at age > 500M
Key autovacuum Parameters
**Autovacuum** is a background process that cleans dead tuples after UPDATE/DELETE and refreshes statistics for the planner. Without autovacuum, tables bloat, statistics go stale, and txid wraparound becomes possible. Correct autovacuum tuning is the foundation of a stable production DB.
**Never turn off autovacuum** even temporarily in production. Without it tables accumulate dead tuples, and txid wraparound stops the entire cluster. If autovacuum creates load, tune cost-based delay. Do not disable.
autovacuum was disabled on production for 2 weeks 'to reduce load'. Which risk is most critical?
Scale Factor: The Big Table Problem
**autovacuum_vacuum_scale_factor = 0.2** means: run VACUUM when dead tuples = 20% of the table + autovacuum_vacuum_threshold (default 50). For a 100-row table: 50 + 0.2 x 100 = 70 dead tuples. For a 100M-row table: 50 + 0.2 x 100M = 20M dead tuples! Big tables wait a very long time for VACUUM.
**General rule:** for tables > 10M rows, set autovacuum_vacuum_scale_factor = 0.01 or lower. For tables > 100M rows, 0.001 or an absolute threshold. The goal: VACUUM should complete before significant bloat accumulates.
A 200M-row table, scale_factor = 0.2. 50K UPDATE/hour. After how many hours does autovacuum fire?
Cost-Based Delay: Do Not Disturb Production
**Cost-based delay** caps autovacuum I/O load. After every autovacuum_vacuum_cost_limit 'units of work', autovacuum sleeps for autovacuum_vacuum_cost_delay ms. Reading a page = 1 unit (from cache) or 10 units (from disk). Writing = 20 units.
**On NVMe SSD** cost_delay = 2 ms is excessive. NVMe handles 500K+ IOPS. Even with delay=0 autovacuum will not saturate the disk. For SSD recommended: autovacuum_vacuum_cost_delay = 0 or 1 ms, cost_limit = 1000-2000.
Autovacuum cannot keep up with a rapidly growing table (n_dead_tup keeps rising). What gives the fastest result?
Per-Table Settings: Precision Tuning
**Different tables have different access patterns** and need different autovacuum settings. Small reference tables (rarely changed) do not need frequent vacuum. Large tables with high UPDATE rate need aggressive vacuum. Per-table settings via `ALTER TABLE ... SET (storage_parameter)`.
**Automate via a script:** monitor pg_stat_user_tables, find tables with high n_dead_tup or n_dead_tup / n_live_tup > 10%, and apply per-table settings automatically. Much better than one global setting for every table.
The `audit_log` table is INSERT-only, no UPDATE/DELETE. What autovacuum setting is optimal?
Autovacuum Monitoring
**Autovacuum monitoring** is the key to catching problems before they hit production. Main metrics: per-table bloat, time since last autovacuum, txid age per table, dead tuple count, running autovacuum workers.
**txid wraparound = emergency cluster halt.** When age(relfrozenxid) > autovacuum_freeze_max_age (default 200M transactions), PostgreSQL forces a VACUUM. When age > 2 billion, the DB enters read-only emergency mode. An alert on age > 500M is mandatory.
Autovacuum interferes with production load, better to limit it
Autovacuum is a required component. Limiting via cost_delay and scale_factor lets you balance load. Full limiting leads to bloat, stale statistics, and txid wraparound
MVCC creates dead tuples on every UPDATE and DELETE. Without cleanup, tables grow forever, seq scan slows down, indexes bloat. PostgreSQL is specifically designed around a working autovacuum. Proper cost_delay and scale_factor are tools to balance, not to throttle into uselessness
pg_class shows age(relfrozenxid) = 1.8 billion for the `transactions` table. What do you do immediately?
Key Ideas
- **autovacuum = always on**. Tune cost_delay/scale_factor, do not disable
- **scale_factor = 0.2** is catastrophic for big tables (200M rows = waiting for 40M dead tuples). For tables > 10M rows: 0.01-0.001
- **cost_delay** caps I/O. On NVMe: delay=0 or 1ms. On HDD: 2-5ms. Per-table for problem tables
- **Txid wraparound monitoring**: age(relfrozenxid) > 500M = alert. > 1.5B = VACUUM FREEZE immediately
- **Per-table settings** are the correct approach. Different tables need different settings via ALTER TABLE SET (storage_parameter)
Related Topics
Autovacuum ties tightly to several internal mechanisms:
- MVCC — Autovacuum cleans dead tuples created by MVCC on UPDATE/DELETE. Without autovacuum, MVCC accumulates garbage forever
- VACUUM and bloat — Autovacuum runs VACUUM and ANALYZE automatically. Understanding VACUUM is understanding what autovacuum does
- Txid wraparound — Autovacuum is the only defense against wraparound via FREEZE. Monitoring wraparound risk is critical
Вопросы для размышления
- An `orders` table with 100M rows, 200K UPDATE/day. autovacuum_vacuum_scale_factor = 0.2. Calculate when autovacuum will run. What scale_factor is needed for autovacuum to run every 6 hours?
- pg_stat_user_tables shows: n_dead_tup = 5M, n_live_tup = 10M, last_autovacuum = 3 days ago. Autovacuum is running (visible in pg_stat_activity). Why is it not cleaning this table? How do you diagnose?
- After a deploy with 1000 migrations (mass UPDATEs), pg_stat_user_tables shows dead_pct = 80% on the main tables. How do you recover quickly without VACUUM FULL (it blocks)? How do you prevent this in the future?