Algorithms

Counting Sort: O(n) Without Comparisons

Цели урока

Understand the O(n log n) lower bound for comparison sorts
Implement Counting Sort (stable version)
Understand Radix Sort and when to use it
Know Bucket Sort and its limitations

Предварительные знания

O(n²) sorts
Big O

O(n²) Sorts

O(n log n) is the limit for comparison sorts. But if we know something about the data, we can sort in O(n)! Counting, Radix, Bucket - three ways to beat the theoretical limit.

Sorting by age (0-150) - Counting Sort
Sorting phone numbers - Radix Sort
Sorting coordinates (uniform distribution) - Bucket Sort
Databases - indices on numeric fields

Harold Seward and counting sort at MIT

Counting sort was described by Harold Seward in 1954 in his master's thesis at MIT's Lincoln Laboratory. The same work also gave the first machine-oriented formulation of radix sort. This was an era when memory was measured in thousands of words and sorting was one of the central jobs of the earliest computers. Seward's idea looked almost heretical next to comparison sorts: don't compare elements at all, just count how many times each value occurs and rebuild the output straight from the counts. For integers in a narrow range that gave linear time and stepped around the O(n log n) bound entirely. Donald Knuth later worked through both counting and radix in volume 3 of The Art of Computer Programming (Sorting and Searching, 1973), cementing their place in the canon. Distribution counting, as Knuth called it, became the foundation for radix sort, which had already been used for decades to sort punched cards on electromechanical tabulating machines before electronic computers existed.

The Limit of Comparison Sorts

**Theorem:** any comparison-based sorting algorithm cannot be faster than O(n log n) in the worst case.

**log(n!) = Θ(n log n)** by Stirling's formula. This is the theoretical limit for comparison-based sorts.

**But!** If we know something about the data (value range, type), we can sort faster WITHOUT comparing elements.

Merge Sort, Quick Sort, Heap Sort - all O(n log n). This is because:

Counting Sort Idea

**Counting Sort** works when elements are integers in a known range [0, k].

Count how many times each value appears
Compute positions via prefix sum
Place elements at their correct positions

Counting Sort uses extra memory:

Counting Sort Implementation

**Simplified version** (not stable, but easier to understand):

**Stability matters** for Radix Sort! Without it, Radix Sort won't work correctly.

Why do we iterate right to left in the stable version?

When to Use Counting Sort

Characteristic	Counting Sort
Time	O(n + k)
Space	O(n + k)
Stability	Yes
In-place	No
Requirements	Non-negative integers 0..k

**When to use:**

k = O(n) - range is comparable to number of elements
Elements are non-negative integers (or can be shifted)
Stability needed
Memory is not critical

**When NOT to use:**

k >> n - e.g., sorting 100 numbers in range 0..10⁹
Elements are floats, strings, objects
Memory is limited

Sorting student grades (0-100) for 10,000 students. Best algorithm?

Radix Sort: Sorting by Digits

**Radix Sort** - sort by digits, from least significant (LSD) to most significant (MSD) or vice versa.

**Radix Sort complexity:** O(d × (n + k)) where d = number of digits, k = base (usually 10 or 256). When d = O(log n) we get O(n log n), but with better constants.

Radix Sort requires a stable sort on each pass because:

Bucket Sort: Distributing into Buckets

**Bucket Sort** - distribute elements into "buckets", sort each bucket, combine.

**Expected O(n) time** if elements are uniformly distributed. With poor distribution - O(n²) (all in one bucket).

Algorithm	Time	Space	Use Case
Counting Sort	O(n+k)	O(n+k)	Integers 0..k, k=O(n)
Radix Sort	O(d(n+k))	O(n+k)	Integers, fixed-length strings
Bucket Sort	O(n) avg	O(n)	Uniformly distributed floats

Bucket Sort has expected O(n) if:

Non-Comparison Sorts

Comparison sorts: lower bound O(n log n)
Counting Sort: O(n+k), for integers 0..k
Radix Sort: O(d(n+k)), sort by digits
Bucket Sort: O(n) avg, for uniform distribution
All require knowledge about the data
Trade-off: time vs memory vs universality

Вопросы для размышления

Under what conditions does counting sort become impractical despite its O(n) complexity?
How is counting sort used as a subroutine in radix sort?
What data processing tasks benefit from knowing the value range in advance?

Связанные уроки

alg-05-merge-sort — Merge Sort is O(n log n) and universal, Counting Sort is O(n+k) but requires integers in a known range - different trade-offs
alg-38-radix-sort — Counting Sort is the building block of Radix Sort: a stable pass over a single digit
ds-06-hash-intro — Bucket Sort uses the same idea of slot distribution as a hash table: an element lands in a bucket based on its value
alg-01-big-o — The O(n log n) lower bound for comparison sorts is proved via a decision tree - a great example of using Big O to prove an impossibility
alg-04-basic-sorts — O(n^2) sorts are universal and in-place, Counting Sort is faster but only for specific kinds of data
prob-02-combinatorics