Computational Complexity
Classic NP-Complete Problems
Цели урока
- Understand 3-SAT and its role as the compiler of reductions
- Read the 3-SAT <=_p CLIQUE reduction via the literal graph
- Distinguish Hamiltonian and Eulerian cycles - and why one is in P, the other NP-complete
- Explain the pseudo-polynomiality of DP for Subset Sum
Предварительные знания
Amazon, 2005. The routing team: 1,000 drivers, 50,000 addresses daily. Each driver covers an assigned list - that is Hamilton Path. The optimal solution is NP-hard. Amazon deployed the Lin-Kernighan heuristic and cut mileage by 15%, saving tens of millions of liters of fuel per year. NP-completeness did not stop the business - it dictated which algorithm shipped.
- **Logistics and routing** - TSP and Vehicle Routing are used by Amazon, UPS, and FedEx every day
- **Compilers** - Register Allocation = Graph Coloring (NP-complete), solved via Chaitin heuristics
- **Cryptography** - Subset Sum underlies some encryption schemes (Merkle-Hellman knapsack)
Cook, Levin, and the Birth of NP-Completeness
In 1971, Stephen Cook of the University of Toronto published 'The complexity of theorem-proving procedures', proving SAT is NP-complete and introducing the concept of NP-completeness itself. Independently in the USSR, Leonid Levin reached the same conclusions in 1973 - the result lives on as the Cook-Levin theorem. Cook won the Turing Award in 1982. In 1972, Richard Karp pinned NP-completeness on 21 specific problems - CLIQUE, Hamiltonian Cycle, Subset Sum among them - laying the foundation for everything covered in this lesson.
3-SAT
**3-SAT** is a restricted version of SAT where every clause holds **exactly 3 literals**. The restriction was supposed to make life easier. It did not - 3-SAT stays NP-complete. Yet **2-SAT** (two literals per clause) is in P. The complexity boundary lands precisely between 2 and 3.
3-SAT: formula phi = C1 AND C2 AND ... AND Cm, where each Ci = (l1 OR l2 OR l3). NP-completeness is proved by reduction SAT <=_p 3-SAT: any clause with k > 3 literals can be split into 3-literal clauses using auxiliary variables.
3-SAT is the workhorse of reductions. Most NP-completeness proofs route through a reduction from 3-SAT, because three literals per clause is a convenient structure for encoding constraints. 3-SAT is the "compiler" from logic to combinatorics.
| SAT variant | Constraint | Complexity |
|---|---|---|
| 1-SAT | 1 literal per clause | O(n) - trivial |
| 2-SAT | 2 literals per clause | O(n+m) - in P (SCC) |
| 3-SAT | 3 literals per clause | NP-complete |
| MAX-2-SAT | Maximize number of satisfied 2-clauses | NP-complete! |
| Horn-SAT | At most 1 positive literal per clause | O(n*m) - in P |
| XOR-SAT | XOR instead of OR | In P (system of linear equations mod 2) |
Why is 2-SAT in P, but 3-SAT NP-complete?
The Clique Problem (CLIQUE)
**CLIQUE**: given a graph G and a number k, does G contain a **complete subgraph** (clique) of size k - meaning k vertices with every pair connected by an edge? Translation for social networks: is there a group of k people who all know each other?
CLIQUE is NP-complete. The proof goes through 3-SAT <=_p CLIQUE: for a formula with m clauses, build a graph whose vertices are the literals in the clauses, with edges between compatible literals from different clauses. A clique of size m corresponds to a satisfying assignment.
Related problems: **Independent Set** (maximum subset with no edges) is the complement of CLIQUE. **Vertex Cover** (minimum subset covering every edge) is the complement of Independent Set. All three are NP-complete and chained together by simple reductions via the graph complement.
In a graph with 6 vertices and 15 edges (complete graph K6), what is the maximum clique?
Hamiltonian Path
**Hamiltonian Path/Cycle**: does the graph contain a path (cycle) that visits **every vertex exactly once**? Do not confuse it with the Eulerian path, which traverses every **edge** once - an Eulerian path is found in O(E), while Hamiltonian path is NP-complete. Same word "path," wildly different complexity.
Hamiltonian Cycle is NP-complete. Verification: given a sequence of vertices, check that it forms a cycle through all vertices - O(V). Finding one: brute force - O(V!) permutations. Key: Euler Cycle in P (all degrees even is necessary and sufficient), Hamilton Cycle NP-complete.
Practical impact: the Traveling Salesman Problem (TSP) hunts the shortest Hamiltonian cycle. TSP is one of the most studied NP-complete problems on the planet. Approximation algorithms exist: Christofides (1.5-approximation for metric TSP), the Lin-Kernighan heuristic. In practice, instances with tens of thousands of cities get solved.
| Problem | Type | Complexity | Practical approach |
|---|---|---|---|
| Euler Path | Every edge once | O(E) - in P | Hierholzer's algorithm |
| Hamilton Path | Every vertex once | NP-complete | Backtracking, DP (Held-Karp O(2^n*n^2)) |
| TSP | Shortest Hamilton Cycle | NP-hard | Christofides, LKH, branch-and-bound |
| Shortest Path | Shortest path A->B | O(V^2) - in P | Dijkstra, Bellman-Ford |
How does a Hamiltonian cycle differ from an Eulerian cycle?
Subset Sum
**Subset Sum**: given a set of numbers S = {s1, ..., sn} and a target t, does any subset of S add up to t? It looks deceptively simple, yet it is NP-complete. The interesting twist: dynamic programming delivers a **pseudo-polynomial** algorithm.
Subset Sum is NP-complete (via reduction from 3-SAT). Brute force: O(2^n) subsets. DP solution: O(n*t), where t is the target sum. This is **pseudo-polynomial** time: polynomial in the value of t, but exponential in the bit-length of t (log t bits).
Pseudo-polynomiality is subtle. The O(n*t) algorithm looks polynomial, but the input size is O(n*log(max_s)), and t can blow up exponentially (t = 2^n). When numbers stay small (t = poly(n)), DP works well. For large numbers, the problem stays hard.
| Problem | Type | NP-complete? | DP solution |
|---|---|---|---|
| Subset Sum | Numeric | Yes | O(n*t) pseudo-polynomial |
| Partition | Split into 2 equal parts | Yes | O(n*S/2) |
| Knapsack 0/1 | Maximize value | Yes | O(n*W) |
| Unbounded Knapsack | Unlimited items | Yes | O(n*W) |
| Coin Change | Minimum coins for a sum | In P (fixed coin set) | O(n*t) |
NP-complete problems show up everywhere: logistics, scheduling, compilers, bioinformatics. No fast exact algorithm exists for any of them (assuming P != NP). Approximation algorithms, heuristics, SAT solvers, and parameterized complexity carry the load instead. NP-completeness is not the end of the road - it is the start of the hunt for practical solutions.
NP-complete problems are impossible to solve in practice
For small inputs solvable by brute force, for medium inputs via DP, heuristics, and approximation algorithms. SAT solvers handle instances with millions of variables.
NP-completeness is a worst-case result. In practice, hard instances are rare. SAT solvers (MiniSat, CaDiCaL) use CDCL with clause learning and solve industrial instances. For TSP, metaheuristics find solutions within 1% of optimal. Subset Sum and Knapsack are solved by DP for numerically reasonable inputs.
The DP algorithm for Subset Sum runs in O(n*t). Why does this not prove P = NP?
Key Takeaways
- **3-SAT**: clauses with 3 literals - NP-complete, even though 2-SAT in P (a complexity phase transition)
- **CLIQUE**: finding a complete subgraph of size k - NP-complete via reduction from 3-SAT
- **Hamiltonian Cycle**: a path through every vertex - NP-complete; not to be confused with Eulerian (P)
- **Subset Sum**: a subset with a given total - NP-complete, but admits DP in O(n*t)
Related Topics
Classic NP-complete problems bridge theory and practice:
- NP-Completeness — Definitions of NP-completeness and reductions - the theoretical foundation
- The P and NP Classes — The base complexity classes - P, NP, verification
Вопросы для размышления
- Why does the complexity boundary fall precisely between 2-SAT and 3-SAT? What does the third literal add?
- Subset Sum has a DP solution in O(n*t). Are there analogous pseudo-polynomial algorithms for CLIQUE or Hamiltonian Cycle?
- SAT solvers handle formulas with millions of variables. Does that mean P = NP in practice?
Связанные уроки
- cplx-02 — NP-completeness and reductions - the theoretical foundation of this lesson
- cplx-01 — Base complexity classes P and NP, polynomial-time verification
- cplx-04 — Approximation algorithms are built on understanding NP-complete problems
- alg-05 — Dynamic programming is the core of pseudo-polynomial solutions for Subset Sum and Knapsack
- comp-04 — Reductions between languages - the same tool used for both computability and complexity
- alg-07 — Greedy algorithms and heuristics - the practical answer to NP-completeness
- ds-01 — Graph problems: many NP-complete problems are graph problems at heart
- alg-01-big-o