Computability
Rice's Theorem
1951. Henry Gordon Rice, a graduate student at Syracuse University, publishes a two-page result. The conclusion: no program can check any non-trivial property of another program. Not a single one. Not whether it halts, not what it computes, not whether it has a bug. Seventy years later, GitHub Copilot generates code but cannot verify its own output. GPT-4 writes functions but cannot guarantee their correctness. Claude analyzes programs but cannot give a precise answer to a semantic question. Not because these models are poorly trained. Because Rice's theorem 1951 + Turing's theorem 1936 = a mathematical prohibition.
- GitHub Copilot cannot verify its own output - undecidability in production: any attempt at precise semantic checking runs into Rice's theorem
- Kaspersky antivirus: heuristics = intentional incompleteness forced by Rice's theorem; precisely determining 'is this program malicious' is undecidable
- LLVM constant folding works because it is a syntactic (structural) property, not semantic - Rice's theorem does not apply
- Formal verification (TLA+, Coq): works only for finite state spaces or with explicit termination proofs from the programmer
Semantic Properties of Programs
1953. Rice's paper appears in the Transactions of the American Mathematical Society. First step: formalize what 'a property of a program' means. A syntactic property is visible in the text: code length, presence of a while loop, number of functions. A semantic property depends on what the program computes: does it halt on all inputs, does it return only even numbers, is it equivalent to another program. The boundary is fundamental. Syntactic properties are decidable - a parser handles them. Semantic ones are not.
**Formal definition via index sets.** Let $\varphi_i$ be the program with index $i$ (under a Godel numbering). Property P defines the set $\mathcal{C} = \{i \mid \varphi_i \text{ has property P}\}$. P is **semantic** if $\mathcal{C}$ is an index set: $\varphi_i = \varphi_j \Rightarrow (i \in \mathcal{C} \Leftrightarrow j \in \mathcal{C})$. Two programs computing the same function either both have the property or neither does.
Quick test for whether a property is semantic: ask whether two programs with identical behavior on all inputs necessarily agree on the property. If yes - semantic. If no (one uses a loop, the other recursion, same result) - syntactic.
Property P: 'the program contains at least one recursive function'. Syntactic or semantic?
Non-Triviality and the Proof of Rice's Theorem
Second step: non-triviality. Property P is **trivial** if it holds for all programs or for none. Trivial properties are decidable - the answer is always the same, the algorithm is: 'always return true' or 'always return false'. A non-trivial property: at least one program has it, at least one does not. **Rice's theorem**: if P is a non-trivial semantic property of programs, then the problem 'does program M have property P' is undecidable.
The proof assumes P(empty_function) = false. If P(empty_function) = true, the construction is symmetric: take M_no with P(M_no) = false, build G_Mx that behaves like the empty function if M(x) halts and like M_no otherwise. Both cases reduce to HALT.
Property P: 'the program halts on input 42'. In Rice's proof we build G_Mx. If M(x) halts, G_Mx behaves like M_yes. Why does it follow that P(G_Mx) = P(M_yes)?
Extensions: Rice-Shapiro and Godel's Incompleteness
Rice's theorem is a beginning, not an end. Rice-Shapiro (1959) refines it: for semi-decidable (RE) properties of partial recursive functions, semi-decidability holds if and only if the property is monotone and compact. Intuition: one can verify what a function does on a finite number of inputs - not on all inputs. In parallel: Godel's incompleteness theorems (1931) and Rice's theorem (1951) are two manifestations of the same limitation. Godel: in a formal system there are truths without proof. Rice: in computation there are properties without a checking algorithm.
**Rice's theorem for oracles.** Adding an oracle for the Halting Problem H, Rice's theorem still holds for properties of H-computable functions: every non-trivial property of H-computable functions is undecidable relative to H. Post's hierarchy builds an infinite ladder: each level has undecidable properties not solvable by the oracle of the previous level.
Property P: 'the function returns 0 for at least one input'. Why is this property semi-decidable (by Rice-Shapiro) even though Rice's theorem states undecidability?
Applications: Antiviruses, Verifiers, Neural Networks
Rice's theorem is not an academic artifact - it explains architectural decisions in real systems. An antivirus cannot precisely detect maliciousness - instead it uses heuristics and signatures (false negatives are unavoidable). The LLVM static analyzer finds ~60% of real bugs at the cost of false positives: soundness is chosen over completeness. Type checkers (TypeScript, mypy) are sound: they reject some correct programs but accept only safe ones. None of these tools violates Rice's theorem - all operate with a restricted class of properties or with approximations.
**LLVM and constant folding.** Constant folding (`2 + 3 -> 5` at compile time) works despite Rice's theorem - because it is a syntactic transformation. The compiler sees two numbers and an operator in the AST - that is structure, not behavior. No semantics needed. Rice's theorem forbids only semantic checks; structural transformations remain decidable.
Practical diagnostic: does the tool check something that cannot be seen without running the program? If yes - it is either incomplete or not always precise. This is not a bug in the tool - it is a mathematical consequence of Rice's theorem.
Rice's theorem means programs cannot be verified. Formal verification, type checking, and static analysis are useless.
Rice's theorem forbids exact solutions for arbitrary programs. Verification works by restricting scope: finite state spaces (TLA+), restricted type systems (Hindley-Milner), bounded model checking, sound-but-incomplete approximations.
A ban on a universal precise algorithm does not ban algorithms for sub-problems. Coq proves properties of programs - but requires explicit termination proofs from the programmer. TLA+ verifies protocols - but only with a finite number of states. Rice's theorem says: a complete and precise verifier for arbitrary programs is impossible. Partial, bounded, approximating ones - are possible and useful.
TypeScript's type checker rejects some correct programs (type false positives). How does this relate to Rice's theorem?
Key Ideas
- **Semantic property** - depends on what a program computes (the function), not on its syntactic structure
- **Non-triviality** - property P is non-trivial if at least one program has P and at least one does not
- **Rice's theorem** - every non-trivial semantic property of programs is undecidable; proved by reduction from the Halting Problem
- **Practical takeaway** - static analyzers, antiviruses, and compilers work with approximations: soundness vs completeness is a tradeoff forced by mathematics
Вопросы для размышления
- An IDE highlights type errors - does this violate Rice's theorem? Why is type checking decidable when semantic checking is not?
- LLVM finds dead code in simple cases. How does this relate to Rice's theorem - which says this is impossible?
- Formal verification (Coq, TLA+) works and proves properties of programs. Where exactly is the boundary between what verification can and cannot do?