Data Structures

Heap: O(1) Extremum, O(log n) Everything Else

Предварительные знания

Trees and tree terminology

Intro to trees

Every GPS route runs Dijkstra. Every Dijkstra step pops from a min-heap. On a city-scale graph without a heap, that pop costs O(n) - routes load in 50 seconds. With a heap: O(log n) - 50ms. Beam search in GPT, A* in game AI, Prim's MST, Huffman compression: all depend on one primitive - O(1) minimum access, O(log n) insert.

Dijkstra shortest path - heap extracts minimum-distance node at every step
LLM beam search - heap of candidate sequences sorted by log-probability
OS task schedulers - priority queue selects highest-priority runnable thread
Huffman compression - heap builds optimal prefix codes from character frequencies
A* pathfinding in games - heap ordered by f = g + h heuristic

The heap was invented for sorting

In 1964 J. W. J. Williams introduced the binary heap as the engine for a new sorting method, heapsort. The same year Robert Floyd published the O(n) bottom-up build-heap procedure, the heapify that turns an arbitrary array into a heap in linear time rather than n separate O(log n) inserts. Two decades later, in 1984, Michael Fredman and Robert Tarjan designed Fibonacci heaps, which push decrease-key down to amortized O(1) and improve the theoretical bound of Dijkstra and Prim, though their constant factors keep binary heaps dominant in practice.

The Priority Queue Problem

Beam search - the decoding algorithm used by every production LLM during inference - maintains a priority queue of candidate token sequences ranked by log-probability. At each decoding step it pops the highest-scoring sequence, extends it by one token, and pushes the new candidates back. The heap is touched once per token per beam. For GPT-4 with beam width 4, that is millions of heap operations per second.

The core tension: a sorted array gives O(1) minimum but O(n) insert. An unsorted array gives O(1) insert but O(n) minimum. A BST balances both at O(log n) but requires pointer-chasing and rebalancing bookkeeping. A heap is the specialized answer for this exact trade-off.

The heap wins on the metric that matters most: peek (find min) is O(1). Insert and delete are O(log n). This is the optimal for a comparison-based priority queue - proven by information-theoretic lower bounds.

The main advantage of a heap over a sorted array for a priority queue:

The Heap Property: A Relaxed Total Order

A min-heap enforces one invariant: every node is less than or equal to its children. That is it. Siblings have no relationship to each other. This weaker-than-sorted constraint is why insert costs O(log n) instead of O(n) - only one path from the insertion point to the root needs to be checked.

A heap is always a *complete* binary tree - every level is fully filled except possibly the last, which fills left to right. This structural constraint is what makes array storage possible and cache-friendly.

Python's `heapq` is a min-heap. For a max-heap: negate values on push, un-negate on pop. This `(-value)` pattern appears in virtually every competitive programming solution and production scheduler written in Python.

In a min-heap, which statement is always true?

Array Storage: No Pointers Needed

A complete binary tree maps perfectly onto an array. Node at index i has its left child at 2i+1 and right child at 2i+2. Parent is at (i-1)//2. No pointers. No heap allocations per node. The entire tree lives in a contiguous memory block.

The contiguous layout is not just elegant - it is the reason heaps dominate in practice. CPU caches prefetch sequential memory lines. A heap traversal from root to leaf touches O(log n) cache lines in a predictable pattern. A pointer-based BST with the same n elements may scatter nodes across memory, triggering O(log n) cache misses per path traversal.

PyTorch's internal priority scheduler for CUDA kernel dispatch uses exactly this array-based heap layout - the same representation that appears in Python's `heapq` module, but in C++.

In the array [5, 10, 15, 20, 25], what is the index of the parent of the element at index 3?

Push and Pop

Push (insert) - sift up

Push places the new element at the end, which keeps the complete-tree shape, then bubbles it upward with sift-up: swap with the parent until the heap property holds again. Each swap climbs one level, and the tree is at most log n levels deep, so the cost is O(log n).

Pop (remove min) - sift down

Pop is the subtler operation. Removing the root would leave a hole in a complete tree. The fix: move the last element to the root to preserve the shape, then sift it down to its correct position. This asymmetry, sift up on push and sift down on pop, is the core of the heap's O(log n) guarantee.

When popping from a min-heap, the last element is moved to the root because:

Heapify: Building a Heap in O(n)

The task: turn an arbitrary array into a valid heap.

The naive route is n individual pushes at O(log n) each, which is O(n log n).

The smart route, heapify, sifts down from the last internal node toward the root and finishes in O(n). Nodes near the leaves travel short distances; only the few nodes near the root sift far. Summing height times node-count, n times the sum of h / 2^h, converges to O(n) through the geometric series identity.

Python's `heapq.heapify(arr)` runs this same O(n) in-place build.

Heapify is O(n log n) because it calls sift_down n/2 times and each sift_down is O(log n).

Heapify is O(n). Nodes at height h number at most n/2^(h+1) and sift_down costs O(h) for each. Summing n * sum(h / 2^h) = O(n) because the series converges. Most nodes are leaves and sift zero levels.

Multiplying worst-case cost per node by total node count double-counts: leaf-level sift_down costs O(1), not O(log n). The tight analysis tracks height distribution.

The complexity of heapify (building a heap from an array):

Next Topics

The heap primitives - push, pop, heapify - are building blocks. The next lesson assembles them: Top-K elements in O(n log k), merging K sorted streams, and streaming median in O(log n) per element.

Heap Applications — Top-K, merge-K, streaming median
Dijkstra — Canonical graph application of min-heap

Heap Operations

Min-heap: parent <= children at every node. Root = global minimum. O(1) peek.
Array storage: left child at 2i+1, right at 2i+2, parent at (i-1)//2. Cache-friendly.
Push: append + sift-up. O(log n). Pop: last-to-root + sift-down. O(log n).
Heapify: O(n) - sift-down from n//2-1 to 0. Most nodes are leaves, sift zero levels.
Python: heapq (min-heap). Max-heap: negate values on push/pop.

Вопросы для размышления

The heap enforces a partial order - weaker than full sort. What would it cost to upgrade a heap to a fully sorted array? And is there a structure weaker than a heap that still gives O(1) minimum?
Heapify is O(n) even though it calls sift_down n/2 times. The intuition: most nodes are leaves. Can the same leaf-majority argument apply to other tree operations to get tighter bounds?
Python's heapq has no decrease-key operation. Dijkstra needs it (update a node's priority). How do production implementations work around this?

Связанные уроки

alg-07-heap-sort — Heap sort is a direct application of heapify + pop in a loop
alg-14-dijkstra — Dijkstra's priority queue is a min-heap over (distance, node) pairs
ds-15-heap-applications — Top-K, streaming median, and merge-K-lists all build on heap primitives
alg-01-big-o — The O(n) heapify result is non-obvious without amortized analysis intuition
alg-17-mst — Prim's MST algorithm uses a min-heap to extract the cheapest edge
ds-12-balanced-trees
os-04-scheduling
alg-05-merge-sort