Computational Complexity
Approximation and Inapproximability
UPS saves millions of dollars daily using approximation algorithms for routing. The exact TSP solution for 21 million packages is physically impossible to compute, but Christofides guarantees a route no worse than 1.5x the optimum. Approximation algorithms are not a compromise. They are the science of how much can be achieved in polynomial time.
- **UPS ORION** (On-Road Integrated Optimization and Navigation) uses approximation algorithms to route 55 000 drivers - saving 10 million gallons of fuel per year
- **VLSI chip design** uses approximation algorithms for placement and routing: exact solutions are NP-hard, but approximations with quality guarantees are sufficient for production
- **Scheduling in cluster computing** (distributing jobs across servers) is solved by PTAS algorithms with guaranteed deviation from the optimal makespan
Approximation Ratio
UPS routes 21 million packages every day. The optimal route is the Travelling Salesman Problem - NP-hard. No exact solution exists, but a good approximation is sufficient: the Christofides algorithm finds a route no more than 1.5 times longer than the optimum. This is the **approximation ratio alpha**: the ratio of the found solution's cost to the optimal cost. For minimization, alpha >= 1 (found no worse than alpha * OPT). For maximization, alpha <= 1 (found no worse than alpha * OPT).
An approximation algorithm guarantees its ratio on every input, not just on average. Classic results: Vertex Cover has a 2-approximation (greedy); Set Cover has an O(log n)-approximation (also greedy); metric TSP has a 1.5-approximation (Christofides, 1976). In 2021, Karlin, Klein, and Gharan improved Christofides to 1.5-epsilon for metric TSP.
An algorithm finds a solution to a minimization problem with cost 150, while the optimum is 100. What is the approximation ratio?
PTAS: Polynomial-Time Approximation Scheme
A fixed ratio of 1.5 or 2 is often good enough. But sometimes arbitrarily accurate approximation is needed. A **PTAS (Polynomial-Time Approximation Scheme)** is a family of algorithms A_epsilon that, for any epsilon > 0, finds a solution with ratio (1+epsilon) in polynomial time (in n). Formally: for each fixed epsilon the algorithm runs in poly(n), but the polynomial degree may depend on epsilon. The Knapsack problem has a PTAS with time O(n^2 / epsilon) - for epsilon=0.01, the result is within 1% of optimal.
PTAS results exist for many geometric problems: Euclidean TSP (Arora, 1998 - O(n log n) for any epsilon), Euclidean Steiner Tree, k-means clustering. For combinatorial problems PTAS is rarer: Knapsack and Bin Packing have PTAS. Metric TSP has only a 1.5-approximation (not PTAS) - no PTAS exists if P != NP by Papadimitriou-Vempala.
A PTAS with epsilon=0.1 solves the problem in 1 second. Why can't the exact solution be obtained simply by letting epsilon approach zero?
FPTAS: Fully Polynomial-Time Approximation Scheme
A PTAS with time O(n^(1/epsilon)) or O(2^(1/epsilon) * n) is practically useless at epsilon = 0.001 - the exponent grows exponentially. **FPTAS** eliminates this problem: running time is polynomial in both n and 1/epsilon. Formally: O(poly(n, 1/epsilon)). Knapsack has an FPTAS with time O(n^2 / epsilon) - at n=1000 and epsilon=0.001 that is 10^12 operations, already borderline practical. Subset Sum also has an FPTAS. FPTAS is a strictly stronger notion than PTAS.
FPTAS implies PTAS, but not vice versa. If a problem is strongly NP-hard, then FPTAS does not exist if P != NP. TSP, 3-Colorability are strongly NP-hard and have no FPTAS. Knapsack is weakly NP-hard (NP-hardness disappears under unary encoding), so FPTAS is possible. This is a mathematically elegant distinction: weakly NP-hard = FPTAS may exist.
Why can strongly NP-hard problems not have an FPTAS (assuming P != NP)?
Unique Games Conjecture and Approximation Limits
In 2002, Subhash Khot formulated the **Unique Games Conjecture (UGC)**: the Unique Label Cover problem is NP-hard even to approximate. This is a conjecture - not a theorem - but it reshaped approximation theory. From UGC follow precise lower bounds: Max-Cut cannot be approximated better than 0.878... (exactly what the Goemans-Williamson algorithm achieves!); Vertex Cover cannot be approximated better than 2-epsilon. In other words: decades-old 'best known' algorithms turned out to be optimal under UGC.
Khot, Kindler, Mossel, O'Donnell (2007) proved that GW-optimality for Max-Cut follows from UGC. Prasad Raghavendra (2008) showed that for a broad class of CSP problems, SDP-approximation is optimal under UGC. UGC is equivalent to saying that SDP-relaxation is the right tool for all 'symmetric' optimization problems. UGC remains an open problem: Subhash Khot received the Nevanlinna Prize (the CS analog of the Fields Medal) in 2014.
If a problem is NP-hard, it can always be approximated arbitrarily well given enough time
Many NP-hard problems are inapproximable: it is proven (conditionally or unconditionally) that even a (1+epsilon)-approximation is polynomially impossible
Inapproximability is proved via the PCP theorem and gap-preserving reductions. For example, independently of P vs NP: Max-Clique cannot be approximated within n^(1-epsilon) for any epsilon > 0 - this is an unconditional result from PCP theory.
What does it mean that the Goemans-Williamson algorithm for Max-Cut is optimal under UGC?
Key Ideas
- **Approximation ratio** = ALG/OPT (minimization) or OPT/ALG (maximization); guaranteed on all inputs, not just on average
- **PTAS** gives (1+epsilon)-approximation in poly(n) for any fixed epsilon; **FPTAS** in poly(n, 1/epsilon); strongly NP-hard problems cannot have FPTAS
- **UGC** establishes precise lower bounds on approximability: the Goemans-Williamson algorithm (0.878 for Max-Cut) is optimal under UGC
Related Topics
Approximation theory builds on NP-theory and connects to randomized algorithms:
- NP and NP-Completeness — Approximation algorithms are needed precisely for NP-hard problems; inapproximability is proved through NP-hardness of gap versions
- Parameterized Complexity — An alternative approach to NP-hard problems: FPT algorithms give exact solutions in f(k)*poly(n) time rather than an approximation
Вопросы для размышления
- Vertex Cover has a 2-approximation, but UGC predicts that better than 2-epsilon is impossible. Does this mean the greedy algorithm for Vertex Cover is the 'right' solution to the problem?
- The k-means clustering problem is NP-hard, but a PTAS exists. Why do practitioners use Lloyd's algorithm without theoretical guarantees - and how justified is this?
- The PCP theorem shows that NP-hardness and inapproximability are closely connected. What does this say about the nature of computational complexity - is approximation complexity a 'separate' theory or part of the same framework?