Algorithms
O(n^2) Sorts: Simple but Slow
Цели урока
- Understand and implement Bubble, Selection, and Insertion Sort
- Know when O(n^2) is acceptable and why Timsort embeds Insertion Sort
- Distinguish stable from unstable sorting
- Pick the right algorithm based on data characteristics
Предварительные знания
- Big O notation
- Basic programming
2003. Tim Peters writes Timsort for Python 2.3 and buries Insertion Sort inside an O(n log n) algorithm. Java 7 copies the approach. Why? Because on real-world data, O(n^2) with tiny constants smokes O(n log n) with recursion overhead. Knowing 'slow' algorithms means knowing exactly when they stop being slow.
- **Timsort** (Python, Java) uses Insertion Sort for subarrays under 64 elements
- **Embedded systems** with tight memory budgets - in-place O(1) sorts are mandatory
- **Algorithm visualizations** - Bubble Sort makes loop invariants visually obvious
- **Coding interviews** - all three algorithms show up regularly on coding screens
Historical context
John von Neumann invented Merge Sort in 1945, but before fast O(n log n) algorithms arrived, programmers in the 1950s and 1960s lived on these simple O(n^2) approaches. Bubble Sort first appeared in Iverson's 1962 textbook. Selection Sort and Insertion Sort were the workhorses of punch-card sorting - physical media made minimizing swaps a concrete engineering concern, not just a theoretical one.
Why Learn Slow Algorithms?
2003. Python 2.3 ships with a brand-new sort algorithm - Timsort. Tim Peters built it around Insertion Sort for small subarrays. The sort() call gets 1.5-2x faster on real-world data. The lesson: an O(n^2) algorithm living inside an O(n log n) one is not a design flaw - it is a deliberate optimization.
- **Simplicity** - minimal code, maximum loop-invariant clarity
- **Patterns** - nested loops, swap, sentinel, early exit
- **Small data** - on n < 32 elements O(n^2) wins due to cache locality
- **Interviews** - Insertion Sort shows up in every second coding screen
| Algorithm | Best | Average | Worst | Memory |
|---|---|---|---|---|
| Bubble Sort | O(n) | O(n^2) | O(n^2) | O(1) |
| Selection Sort | O(n^2) | O(n^2) | O(n^2) | O(1) |
| Insertion Sort | O(n) | O(n^2) | O(n^2) | O(1) |
Python Timsort and Java Arrays.sort both use Insertion Sort for subarrays under 64 elements - it beats merge-based approaches there because of low overhead and cache-friendly sequential access.
Timsort (Python, Java) uses Insertion Sort for small subarrays. Why not apply O(n log n) throughout?
Bubble Sort: Heavy Elements Rise
Picture a line of students ordered by height - each pass compares neighbors and swaps if the left one is taller. Tall students bubble rightward. After one pass the tallest is last. After k passes, the k tallest are locked in place.
Bubble Sort is the slowest of the O(n^2) family: it racks up swaps aggressively and has poor cache locality. Educational use only - never ship it.
The swapped-flag optimization improves which case?
Selection Sort: Pick the Minimum
Strategy: scan the unsorted portion, grab the minimum, drop it into position. Exactly n-1 swaps total - that is the defining property. When each swap is costly (large objects, flash storage writes), Selection Sort wins.
Selection Sort's strength: exactly n-1 swaps regardless of input order. When swaps are expensive - large objects in memory, writes to flash - this beats Bubble Sort's O(n^2) swap count.
Selection Sort performs how many swaps on an array of 10 elements?
Insertion Sort: Slot Each Element In
Card game intuition: pick a card from the deck and slide it into the right spot in the already-sorted hand. Insertion Sort does exactly that - each new element shifts larger ones rightward and drops into place. On data with k inversions it runs in O(n + k).
Insertion Sort is the only online algorithm of the three: it sorts data as it arrives without needing the full array upfront. Bubble and Selection require the entire input.
On nearly-sorted arrays (log n inversions), Insertion Sort delivers O(n log n) total work - matching Quick Sort, but without recursion overhead.
Insertion Sort runs in O(n) when:
Comparison: When to Use Which
| Criterion | Bubble | Selection | Insertion |
|---|---|---|---|
| Code simplicity | ★★★ | ★★★ | ★★☆ |
| Best case | O(n) | O(n^2) | O(n) |
| Swap count | O(n^2) | O(n) | O(n^2) |
| Stable | Yes | No | Yes |
| Online | No | No | Yes |
| Nearly sorted | Good | Poor | Excellent |
Stability - whether the algorithm preserves the relative order of equal elements. Critical in multi-key sort pipelines.
In real projects, reach for the built-in: sorted() in Python, Arrays.sort() in Java, std::sort() in C++. They are benchmarked, production-hardened, and O(n log n).
Need to sort 1000 events that are mostly in chronological order, with a handful out of sequence. Best pick?
O(n^2) Sorts
- Bubble: swap neighbors, large values bubble up; best case O(n) with swapped flag
- Selection: find min, place it; always exactly n-1 swaps
- Insertion: slot into sorted prefix; O(n + k) on k-inversion data
- All O(n^2) worst case, O(1) extra memory
- Insertion is best for nearly-sorted data - Timsort uses it internally
- In production, use built-in sorted() / Arrays.sort()
Next Steps
O(n^2) breaks down on large data. The next level: O(n log n) sorts.
- Merge Sort — Divide and Conquer, O(n log n), stable
- Quick Sort — Fastest in practice, O(n log n) average
- Heap Sort — In-place O(n log n), O(1) memory
Связанные уроки
- alg-01-big-o — Big O notation is essential for understanding the O(n^2) vs O(n log n) trade-off
- alg-05-merge-sort — After O(n^2), Merge Sort's O(n log n) feels like a genuine revelation
- alg-06-quick-sort — Quick Sort's partition idea directly contrasts with bubble's pairwise swaps
- alg-03-amortized — Amortized analysis explains why Insertion Sort on nearly-sorted data approaches O(n)
- alg-38-radix-sort — Radix Sort sidesteps the comparison lower bound entirely - direct contrast to O(n^2)
- ds-01-arrays