Databases

System Design: Inventory Management

Amazon Prime Day 2023: in the first 30 seconds, 200,000 purchase requests arrived for a PlayStation 5 listing with 5,000 units in stock. A naive SELECT + check + UPDATE sequence would have oversold by thousands. The atomic UPDATE WHERE stock > 0 pattern ensures exactly 5,000 succeed and the remaining 195,000 receive 'out of stock' - with no race conditions and no manual locking.

  • **Shopify**: optimistic locking with a version column for standard traffic, pessimistic locking (SELECT FOR UPDATE) during flash sales. The switch prevents retry storms when thousands of requests contend for the same row simultaneously.
  • **Ticketmaster**: 8-minute TTL reservation for popular concert tickets. The visible countdown timer creates urgency while the server-side TTL automatically releases abandoned reservations to the queue.

The Race Condition Problem

Inventory management must prevent overselling: the count of items sold must never exceed the count available. A naive SELECT + check + UPDATE sequence is not atomic: two concurrent transactions can both read the same available count and both proceed to decrement, resulting in negative inventory.

Amazon Prime Day 2023: in the first 30 seconds, 200,000 purchase requests arrived for a single PS5 listing with 5,000 units. The atomic WHERE guard (UPDATE ... WHERE stock > 0) ensures exactly 5,000 succeed and the rest receive 'out of stock' - regardless of concurrent throughput.

Why is the sequence SELECT stock -> check -> UPDATE not safe for inventory decrement?

Optimistic vs Pessimistic Locking

Optimistic locking detects conflicts at commit time using a version column: read the current version, update the row, check that version still matches. If another transaction updated the version first, retry. Pessimistic locking acquires an exclusive lock before reading, preventing concurrent modifications at the cost of throughput.

Shopify uses optimistic locking with a version column for standard inventory updates and switches to pessimistic locking (SELECT FOR UPDATE) for flash sales events. During a flash sale, contention is so high that optimistic locking generates massive retry storms. Pessimistic locking serializes requests but prevents wasted retry computation.

In which scenario is pessimistic locking better than optimistic?

Sharded Counters for High-Throughput Inventory

For extremely high write rates on a single counter (millions of decrements per second), even atomic operations on one row become a bottleneck due to lock contention. Sharded counters distribute the count across N shards; each write goes to a random shard; the total count is the sum of all shards.

Amazon's product inventory counters use a sharded counter approach internally. For Kindle books with millions of purchases, a single database row would be a write bottleneck. Sharding the counter distributes write pressure while maintaining accurate total counts through aggregation.

Why does a sharded counter write to a random shard instead of always the same one?

Reservation Pattern with TTL

The reservation (hold) pattern temporarily reserves inventory during a checkout window. When a user adds items to a cart and begins checkout, the item is reserved for N minutes. If payment fails or the user abandons checkout, the reservation expires and the item returns to the available pool. This prevents showing unavailable items while ensuring abandoned carts release inventory.

Ticketmaster uses a reservation pattern with an 8-minute TTL for popular concert tickets. When a user clicks 'Buy', seats are reserved immediately and a countdown timer is displayed. If payment is not completed within 8 minutes, the seats are released back to the general queue. This prevents simultaneous purchases by multiple users while allowing hesitant buyers to complete checkout.

A user adds an item to a cart and begins checkout. The item is reserved with a 10-minute TTL. The user abandons checkout without completing payment. What happens?

Key Ideas

  • **Atomic UPDATE WHERE**: UPDATE products SET stock = stock - 1 WHERE id = X AND stock > 0 is a single atomic operation. If 0 rows affected, the item is out of stock. Never use SELECT + check + UPDATE.
  • **Optimistic locking**: version column checked on UPDATE. Safe for low contention. Under high contention (flash sales), generates retry storms - switch to pessimistic.
  • **Pessimistic locking**: SELECT FOR UPDATE locks the row before reading. Serializes access, prevents retries, reduces throughput. Best for high-contention flash sales.
  • **Sharded counters**: distribute inventory count across N shards. Random writes achieve N-times throughput. Total count requires summing all shards.
  • **Reservation with TTL**: reserve inventory for N minutes during checkout. TTL automatically releases abandoned reservations. Required for Ticketmaster-style seat holds.

Related Topics

Inventory management applies core transactional database patterns:

  • ACID Transactions — Inventory decrement correctness requires atomicity. The atomic UPDATE WHERE pattern relies on the database's read-modify-write atomicity guarantee.
  • Redis Data Structures — Redis DECR and Lua scripts provide atomic inventory operations for high-throughput scenarios. Redis sorted sets track reservation expirations.
  • Payment System Design — Inventory reservation and payment are often a two-phase process. The reservation holds inventory while payment is processed; confirmed payment converts the reservation to a purchase.

Вопросы для размышления

  • A product has 1,000 units. A flash sale generates 100,000 concurrent requests. With optimistic locking (version column), how many retries does the average request make before either succeeding or failing?
  • Ticketmaster's 8-minute TTL: a user completes payment at minute 7:58. The reservation expires 2 seconds later before the payment confirmation is processed. What happens, and how should the system handle this race condition?
  • Design the inventory system for a streaming service (Netflix): 'first 1,000 subscribers get a free month'. What pattern from this lesson applies, and how does the digital goods context change the implementation?

Связанные уроки

  • dist-07-transactions
System Design: Inventory Management

0

1

Sign In