Algorithms
Amortized Analysis: Worst Case Is a Lie
Цели урока
- Understand when worst-case analysis gives an inaccurate picture
- Master three methods of amortized analysis
- Apply the aggregate method to ArrayList
- Know amortized complexities of popular structures
Предварительные знания
- Big O notation
- Space complexity
Python list, Java ArrayList, C++ vector ship in every codebase on the planet. PyTorch builds tensors on top of them. Redis lists use them internally. Nginx connection queues lean on them. All three have O(n) worst-case insertion. The paradox resolves in one word: amortization.
- **PyTorch/TensorFlow** - dynamic arrays for model parameters, resize on batch expansion
- **Redis lists** - O(1) amortized RPUSH/LPUSH under heavy load
- **Splay trees** - Cisco network routers use them for LRU route caches
- **Union-Find** - Kruskal MST for spanning tree construction in datacenter networks
Tarjan and the birth of amortization
In 1985, Robert Tarjan published "Amortized Computational Complexity" in the SIAM Journal on Algebraic and Discrete Methods. Tarjan had already invented Splay Trees (1985, with Sleator) and pushed Union-Find to near-O(1). Both structures made worst-case analysis useless and average-case analysis too weak. Amortized analysis filled the gap with a guaranteed bound on sequences instead of single operations.
When Worst-Case Lies
**Python list, Java ArrayList, C++ vector** - the three most-used data structures in the industry. All three have O(n) insertion in the worst case. PyTorch stores model parameters in them. TensorFlow builds computation graphs on them. No production engineer considers this a problem.
If insertion is O(n), then n insertions should cost O(n²). In practice the total is **linear**. Not a trick - just math that worst-case hides.
| Operation # | Size | What happens | Cost |
|---|---|---|---|
| 1 | 1->2 | Resize + copy 1 | 2 |
| 2 | 2->4 | Resize + copy 2 | 3 |
| 3-4 | 4 | Simple insert | 2 |
| 5 | 4->8 | Resize + copy 4 | 5 |
| 6-8 | 8 | Simple insert | 3 |
| 9 | 8->16 | Resize + copy 8 | 9 |
**Key observation:** expensive operations (resize) happen rarely. Most insertions are cheap O(1). Worst-case looks at one operation. Amortization looks at the sequence.
After 16 insertions into an empty ArrayList, how many times did resize occur (initial size 1, growth x2)?
The Idea of Amortization
**Amortized analysis** spreads operation cost across a sequence. One step can be expensive; the average stays cheap. PyTorch's Adam optimizer updates billions of parameters one O(1) step at a time. Gradient clipping fires occasionally and costs more. Total training cost lives in the amortized number, not the worst step.
**Amortized complexity != average case!** This is a guaranteed upper bound for a sequence of operations. No probabilities. No assumptions about input distribution.
**Three methods** of amortized analysis:
- **Aggregate** - count total cost of n operations, divide by n
- **Accounting** - each operation pays a fixed price
- **Potential** - introduce a potential function for the data structure
Amortized complexity O(1) means:
Aggregate Method
**Aggregate method** is the simplest: sum the cost of n operations, divide by n. Production telemetry measures p50/p99 latency the same way - sum all request times, divide. A single timeout does not condemn a system when hundreds of thousands of requests come back fast.
**Geometric series:** 1 + 2 + 4 + ... + n = 2n - 1. This is the key to O(1) amortization!
If ArrayList grows by 2x, after 1024 insertions how many elements total were copied?
Accounting Method
**Accounting method** assigns each operation a flat charge above its real cost. Surplus banks as credit for later. Stripe layers a small margin on every transaction to build reserves for rare chargebacks. Same playbook.
**Invariant:** accumulated credit >= 0 at any point. If this holds, the amortized estimate is correct.
In the accounting method, if an operation is cheaper than its amortized price:
Potential Method
**Potential method** attaches a function Phi to the data structure. Amortized cost = real cost + delta_Phi. Think of potential as stored energy: cheap operations charge it up, expensive ones drain it.
**Potential is a formalization of credit** from the accounting method. Phi >= 0 guarantees that total amortized cost >= real cost.
The potential of a data structure must be:
Real-World Examples
Amortized analysis fits anywhere a structure prepays for future work:
| Structure | Operation | Worst-case | Amortized |
|---|---|---|---|
| ArrayList | append | O(n) | O(1) |
| Hash Table | insert | O(n) | O(1) |
| Splay Tree | find/insert | O(n) | O(log n) |
| Union-Find | union/find | O(log n) | O(alpha(n)) ~= O(1) |
| Fibonacci Heap | decrease-key | O(n) | O(1) |
| Deque (two stacks) | pop_front | O(n) | O(1) |
**Union-Find** with path compression and union by rank has amortized complexity O(alpha(n)), where alpha is the inverse Ackermann function. For all practical n, alpha(n) <= 4.
**Caution:** amortized analysis says nothing about any single operation. For real-time systems (games, trading platforms) a stray O(n) call is unacceptable - one resize during order matching costs real money.
When is amortized analysis NOT appropriate?
Amortized Analysis
- Amortization = average cost over n operations, not average case
- Aggregate: sum total, divide by n
- Accounting: pay a fixed price, accumulate credit
- Potential: Phi of structure, amortized = real + delta_Phi
- ArrayList append: O(1) amortized despite O(n) resize
- Real-time systems need worst-case guarantees - amortization doesn't apply
Where This Applies
Amortized analysis helps understand the performance of many data structures.
- Dynamic Programming — Shares reasoning about total cost over sequences
- Hash Tables — O(1) amortized insert
- Optimization — Amortized analysis as a tool for algorithm evaluation
Вопросы для размышления
- A team is building a task queue for a trading platform where every order must be processed within 1 millisecond. A colleague proposes using an ArrayList with its amortized O(1) append. What exactly is wrong with this choice - and what data structure should be used instead, and why?
Связанные уроки
- alg-01-big-o — Big O is the language amortized bounds are written in
- ds-06-hash-intro — Hash table resize is the canonical O(1) amortized example
- alg-02-space — Space-time tradeoff: resize buys speed with memory
- alg-21-dp — DP and amortization both reason about total cost, not per-step cost
- par-01 — Parallel structures change amortization - resize under a lock is expensive
- alg-04-basic-sorts — Timsort uses amortized run-merging for near-sorted inputs
- opt-01 — Algorithm optimization builds on understanding amortized cost
- ds-14-heaps-intro