Databases
Locks and Deadlocks
2012. Knight Capital Group. A new trading algorithm was deployed to 8 servers, but the old version was left on the 7th. In 45 minutes the system executed millions of incorrect trades. USD 440 million lost. One contributing factor: concurrent updates without correct locking led to an inconsistent state. Locks are not academic. They are USD 440M in 45 minutes.
- Stripe uses advisory locks on top of PostgreSQL for idempotent payment operations: one transaction per idempotency key at any given moment
- GitHub stores pull requests in PostgreSQL; advisory locks prevent race conditions on concurrent merges - without a distributed lock manager
- Shopify found a deadlock pattern on Black Friday 2019: simultaneous discount applications conflicted. The fix: ORDER BY id for a deterministic locking order
Lock Types: Shared vs Exclusive
Locks are the coordination mechanism for concurrent transactions. Without locks: two transactions read a balance (USD 1000), both subtract USD 500, both write USD 500. Result: USD 500 instead of USD 0. This is a Lost Update - one of the classic concurrency anomalies.
Shared Lock (S-lock, reader): multiple transactions can hold an S-lock on the same row simultaneously. Nobody is modifying, everyone is reading - no conflict. Exclusive Lock (X-lock, writer): one transaction holds an X-lock, no other transaction can either read (S) or write (X). Compatibility matrix:
Intent locks are an optimization for hierarchical locking. Without intent locks: to place an X-lock on a row, all table locks must be scanned. With intent locks: an IX-lock on the table signals 'there are X-locks inside', check is O(1). PostgreSQL implements this through lwlocks (lightweight locks) inside shared memory.
Two-Phase Locking (2PL) - the classic protocol: growing phase (acquire locks), shrinking phase (release locks). After the first release, no new locks can be acquired. Strict 2PL: hold all locks until end of transaction. This guarantees serializability but maximizes contention.
Can two transactions hold an S-lock on the same row at the same time?
Row Locks and Table Locks: Granularity
PostgreSQL supports eight table lock modes. The most important: ACCESS SHARE (SELECT), ROW EXCLUSIVE (INSERT/UPDATE/DELETE), SHARE UPDATE EXCLUSIVE (VACUUM, CREATE INDEX CONCURRENTLY), ACCESS EXCLUSIVE (ALTER TABLE, DROP TABLE). ALTER TABLE blocks everything - even reads.
SKIP LOCKED is the standard pattern for concurrent job queues. Without it: N workers SELECT FOR UPDATE on the same row -> N-1 wait while the first processes. With SKIP LOCKED: each worker grabs the next available task. This is how a BullMQ with a PostgreSQL backend implements queues without Redis.
Row-level locking in InnoDB (MySQL): locks are stored in a lock table in memory, not in the row itself. Gap locks block a range that does not exist yet (prevents phantom reads at REPEATABLE READ). Next-key locks = row lock + gap lock. This is the source of unexpected deadlocks in MySQL.
Lock escalation - automatic promotion of granularity. SQL Server: if a transaction holds more than 5000 row locks, SQL Server may escalate to a table lock. This reduces lock manager overhead but increases contention. PostgreSQL does not perform lock escalation automatically.
What is SKIP LOCKED used for in SELECT FOR UPDATE?
Advisory Locks: Locks for Business Logic
Advisory Locks are locks managed by the application, not automatically by the database. Used for coordination outside transactions: distributed mutex, ensure-only-one-instance, serialization by business key.
hashtext() converts a string key to an integer for advisory locks. Problem: collisions. hashtext('user:1') may collide with hashtext('order:3827'). For production: use bigint composed of two int32 values (acting as a namespace). The two-argument form pg_advisory_lock(1, 42) uses the first integer as a namespace.
Advisory locks in PostgreSQL are stored in the shared memory lock table. Limit: max_locks_per_transaction * (max_connections + max_prepared_transactions). Defaults: roughly 6400 simultaneous locks. Under high concurrency this can be exhausted. Monitor via: pg_locks WHERE locktype = 'advisory'.
How do advisory locks differ from row locks?
Deadlock: the Circular Wait
Deadlock: transaction A holds a lock on row 1 and waits for row 2. Transaction B holds a lock on row 2 and waits for row 1. Both wait forever. PostgreSQL detects deadlocks through a wait-for graph: builds a graph of 'who is waiting for whom', searches for a cycle. By default checks every deadlock_timeout = 1 second.
deadlock_timeout in PostgreSQL defaults to 1 second. During that time, a transaction first waits for lock_timeout (infinite by default). Only after deadlock_timeout does the deadlock detector run. For OLTP: reduce deadlock_timeout to 100ms, set lock_timeout = 5s. For batch jobs: deadlock_timeout can remain at 1s.
Which pattern best prevents deadlocks?
Lock Monitoring in Production
Lock waits in production are a silent performance killer. A query hangs, a connection is consumed, the connection pool exhausts, and the application degrades. Without monitoring, lock waits are discovered only when p99 latency has already reached 10x.
Prometheus + postgres_exporter exports pg_locks counts as metrics. Alert: pg_locks{mode='ExclusiveLock',granted='false'} > 10 - more than 10 X-locks waiting. This signals a problem before users feel it. A Grafana dashboard with lock waits, deadlock rate, and average lock wait time is the production monitoring standard.
auto_explain with log_min_duration_statement logs slow queries. But lock waits are invisible in EXPLAIN - the query itself is fast, it is just waiting. pg_stat_activity + track_activity_query_size increased to 4096 is needed. CloudWatch Insights or Datadog for analyzing lock patterns in historical logs.
A high isolation level (SERIALIZABLE) automatically prevents deadlocks
SERIALIZABLE can increase deadlock frequency because more conflicts lead to serialization failures
Under SERIALIZABLE, PostgreSQL uses Serializable Snapshot Isolation (SSI) with predicate locks. When SSI detects a potential conflict, one transaction is rolled back with a serialization failure. This is semantically similar to a deadlock but occurs more often. READ COMMITTED + explicit SELECT FOR UPDATE + consistent ordering is often the better choice for OLTP
Which PostgreSQL system view shows currently waiting locks?
Related Topics
Locks tie into transactions, isolation, and distributed coordination:
- Transactions and ACID — Isolation levels are implemented through locks
- MVCC — MVCC reduces lock contention for reads
- Distributed Transactions — 2PC requires distributed lock management
Key Ideas
- S-locks are compatible; X-locks require exclusion. Intent locks enable hierarchical locking in O(1)
- SELECT FOR UPDATE SKIP LOCKED - the pattern for concurrent job queues without an external broker
- Advisory locks for business coordination outside transactions: idempotency, distributed mutex
- Deadlock: always lock in the same order. Monitoring: pg_locks + pg_stat_activity
Вопросы для размышления
- When is an advisory lock better than a row lock for business serialization, and when is it the opposite?
- How does deadlock_timeout affect performance and why should it not be set too low?
- Why is SKIP LOCKED better than FOR UPDATE for job queues, even if it means some tasks may starve?
Связанные уроки
- db-13-transactions — Transactions are the context for locks
- db-14-mvcc — MVCC reduces the need for locks
- db-16-distributed-tx — Distributed locks are the next level
- db-03-acid — ACID isolation is implemented through locks
- os-04-scheduling
- os-05-sync