Databases
Transactions and Isolation Levels
2014. Bitcoin exchange Mt. Gox announces bankruptcy, losing 850,000 BTC. Audits showed: transaction malleability + a race condition in withdrawal handling. When users pressed 'withdraw', the system recorded the transaction, but under a specific concurrency interleaving the same balance was debited twice. That is not a blockchain bug - it is classical application-level write skew. The database isolation level is worth billions, and most developers learn about it by reading a post-mortem.
- **Stripe and fintech**: payment transactions run at Serializable - retry up to 5 times with jitter; the cost of a 'could not serialize' error <<<<< the cost of losing $0.01 on a customer account
- **Slack, GitHub and ticket systems**: handling concurrent operations on the same issue/comment is classic write skew; solved via SELECT FOR UPDATE on issue_id or SERIALIZABLE
- **Airbnb and booking systems**: double-booking solved via row-level locks on calendar entries + Repeatable Read; before this approach - a few well-known post-mortems
Isolation Levels
The SQL-92 standard defines 4 isolation levels in order of increasing strictness: Read Uncommitted, Read Committed, Repeatable Read, Serializable. Each level permits or forbids specific anomalies: dirty read (reading uncommitted data), non-repeatable read (the same read returns a different result inside one transaction), phantom read (new rows appear in a repeated range query). Serializable forbids all three - the result of concurrent execution must be equivalent to some serial order of the transactions.
Important nuance: real DBMSes deviate from the standard. PostgreSQL has no Read Uncommitted (it's treated as Read Committed). PostgreSQL's Repeatable Read is actually snapshot isolation - stronger than SQL-92 requires, but still not Serializable because of write skew. Oracle implements neither Read Uncommitted nor Repeatable Read - only Read Committed and Serializable (effectively snapshot). MySQL InnoDB Repeatable Read is special: it uses gap locks to block phantoms but does not prevent write skew.
A team launches PostgreSQL with the default isolation level. Which anomalies are possible?
Dirty and Phantom Reads
**Dirty read** - transaction T1 reads a row modified by T2 that hasn't yet committed. If T2 rolls back, T1 operated on phantom data. Most modern DBMSes (PostgreSQL, Oracle) forbid dirty reads even at the weakest level. **Phantom read** is more insidious: T1 runs 'SELECT COUNT(*) WHERE x > 100', between queries T2 inserts a new row with x=150, and the second COUNT returns a different number. SQL-92 Read Committed and Repeatable Read do not block this.
Inside the DBMS phantoms are handled by different mechanisms: Lock-based (2PL + range locks): MySQL InnoDB places gap locks on the read range - INSERT into that range blocks until COMMIT. MVCC (snapshot isolation): PostgreSQL gives a transaction a consistent snapshot at BEGIN - new INSERTs are invisible. Predicate locking: PostgreSQL Serializable tracks query predicates and detects potential conflicts at COMMIT (SSI - Serializable Snapshot Isolation, invented by Cahill in 2008).
A reporting query computes daily revenue via 3 consecutive SELECTs. Which isolation level guarantees that all 3 queries see the same data?
Snapshot Isolation
PostgreSQL Repeatable Read is **snapshot isolation**. On BEGIN the transaction takes a virtual database snapshot: it sees only data committed before that moment. This is implemented through MVCC (Multi-Version Concurrency Control): every UPDATE/DELETE does not overwrite the row but creates a new version with two markers - xmin (who created it) and xmax (who deleted it). A transaction sees versions with xmin < snapshot_id and (xmax = 0 OR xmax > snapshot_id). Old versions are removed by VACUUM.
Strengths of snapshot isolation: (1) readers don't block writers and vice versa - ideal for analytical queries on top of OLTP load; (2) read consistency within one transaction; (3) high throughput. Weaknesses: (1) storage overhead for versions until VACUUM (table bloat); (2) write skew - see next section - is not prevented; (3) concurrent UPDATEs of the same row cause one transaction to receive 'could not serialize' and to retry.
How does PostgreSQL provide snapshot isolation at the Repeatable Read level?
Serializable and SSI
Serializable is the strictest level: the result of any set of concurrent transactions must be equivalent to some serial order of their execution. The classical 2PL (Two-Phase Locking) implementation with read and write locks carries huge overhead. In 2008 Cahill et al. proposed **SSI (Serializable Snapshot Isolation)**: an extension of snapshot isolation that tracks rw-conflicts between transactions (one reads what another writes) and aborts transactions on COMMIT if a 'dangerous cycle' is detected.
Comparison of approaches: 2PL Serializable (DB2, SQL Server lock-based) - readers block writers, low throughput; SSI Serializable (PostgreSQL since 9.1) - optimistic approach on top of snapshot isolation, minimal locking but transactions may abort with 'could not serialize access due to read/write dependencies'. Applications must retry on error 40001. CockroachDB and FoundationDB use SSI-style approaches.
An application uses Serializable in PostgreSQL. On COMMIT it receives 'could not serialize access'. What should be done?
Write Skew
**Write skew** is an anomaly where two transactions read an overlapping set of rows, decide based on what they read, and write to different rows - breaking a business invariant. Classic example: a hospital requires at least one doctor on-call. T1 and T2 read that Alice and Bob are on-call (invariant holds with two), and concurrently take leave. Each transaction thinks 'one will remain', both commit - and nobody is left. Snapshot isolation does not catch this because both transactions wrote to different rows.
Write skew is the hardest anomaly to track down in production: typical 'check then act' code looks correct in single-threaded tests but breaks invariants under load. Known cases: banking systems (double withdrawal), booking systems (double-booking a slot via places + reservations split), rate-limit counters. If business logic is built on 'I just read it and now I'll write' - think about write skew. Solutions: SERIALIZABLE, SELECT FOR UPDATE on read rows, or materialise the conflict.
If the DBMS supports ACID, my transactions are automatically serializable
ACID guarantees the atomicity of each transaction but not the serializability of their concurrent execution. Isolation level is an explicit choice
Most DBMSes default to Read Committed or Snapshot Isolation - both allow write skew and/or phantom reads. Serializable is available everywhere but pays in performance and requires retry logic in the application. ACID is not autopilot - it is a contract that must be understood.
A booking system shows comma-separated timeslots free at 14:00. Two users see 'free' and book simultaneously - both get the slot. This is:
Key Ideas
- **SQL-92 isolation levels** (Read Uncommitted -> Serializable) define a spectrum of trade-offs between performance and correctness; real DBMSes deviate from the standard
- **MVCC and snapshot isolation** (PostgreSQL Repeatable Read) provide consistent reads without locking via row versioning - ideal for read-heavy workloads
- **SSI (Serializable Snapshot Isolation)** optimistically detects rw-conflicts at COMMIT and aborts transactions with 40001 - the application must implement retry logic
- **Write skew** is a non-obvious anomaly not prevented by snapshot isolation: transactions read an overlap, write to different rows, break a business invariant. Resolved by SERIALIZABLE or explicit locks
Related Topics
Mt. Gox from the opening lost billions to write skew - and this anomaly became the canonical example of where isolation-level theory meets production. The transaction model is the foundation underneath indexes, journals and distributed DBMSes:
- MVCC and Indexes — Snapshot isolation is only possible with MVCC; indexes must support version visibility via xmin/xmax - this shapes the design of PostgreSQL index entries
- Distributed Transactions — In a distributed setting Serializable requires consensus protocols (Spanner, FoundationDB) or 2PC; the cost is orders of magnitude higher than in a single DBMS
- Query Optimization — Isolation level affects the query plan: at RR/Serializable the planner may pick a different index because only versions before the snapshot are visible
Вопросы для размышления
- PostgreSQL defaults to Read Committed, MySQL defaults to Repeatable Read. What historical and architectural reasons led to different defaults, and how does this affect application portability?
- Mt. Gox lost billions to write skew. Had they used SERIALIZABLE with retry logic, what would they have paid for it? Would the price of preserving funds have been worth it?
- Snapshot isolation requires periodic VACUUM to clean obsolete versions. What happens if VACUUM cannot keep up with the UPDATE load? Which failure mode should be expected?
Связанные уроки
- db-03-acid — ACID - theoretical base for understanding isolation
- db-14-mvcc — MVCC - the mechanism that implements Snapshot Isolation
- db-15-locks — Locks implement Serializable without MVCC
- dist-07-transactions — Distributed transactions - 2PC on top of the same isolation levels