Computability
Turing Machine: Formal Definition
Цели урока
- Name the seven components of a formal TM definition and explain the role of each
- Read and write configurations (IDs) and trace a computation chain
- Reproduce the proof sketch for the undecidability of the Halting Problem
- Distinguish decidable, recursively enumerable, and unrecognizable languages
Предварительные знания
Alan Turing, 1936. One paper. Twelve pages. Before computers existed, Turing proved that certain problems cannot be solved algorithmically - at all. The LLVM compiler, ChatGPT inference engine, Python interpreter - all implement the same Turing machine. And none of them can solve the Halting Problem.
- **LLVM / GCC compilers** - every compiler implements a TM: reads source code (tape), applies grammar rules (transitions), outputs machine code (result)
- **Python interpreter** - the bytecode dispatch loop directly corresponds to the transition function delta: current opcode + stack = next state
- **Static code analysis** - tools like mypy and tsc are bounded by the Halting Problem: they cannot prove correctness of an arbitrary program
- **Formal verification** - Coq, Lean, Isabelle use TM configurations to prove algorithm correctness
- **Quantum computers** - a quantum TM is faster for certain problems but does not solve undecidable problems - the limits of computability are identical
Turing 1936 - the paper that changed mathematics
In 1936, Alan Turing published 'On Computable Numbers, with an Application to the Entscheidungsproblem'. He was 24. The paper resolved Hilbert's decision problem - the answer was 'no'. Alonzo Church independently proved the same result via lambda calculus. Together they formulated the Church-Turing thesis: all reasonable formalizations of computation are equivalent. Computers did not yet exist - Turing invented their concept in order to prove a theorem about the limits of mathematics.
Formal Definition of a TM
**Alan Turing, 1936. Twelve pages. Before electronic computers existed.** Turing described a device with an infinite tape and a read/write head - and proved it computes everything that is computable in principle. LLVM compiles C++ to assembly. The Python interpreter executes bytecode. Both implement the same abstraction. Seven components that cannot be simplified without losing power.
**A Turing machine** is a 7-tuple M = (Q, Sigma, Gamma, delta, q0, q_accept, q_reject), where: Q is a finite set of states, Sigma is the input alphabet (excluding blank), Gamma is the tape alphabet (Sigma subset of Gamma, blank in Gamma), delta: Q x Gamma -> Q x Gamma x {L, R} is the transition function, q0 is the start state, q_accept is the accept state, q_reject is the reject state.
The minimalism is intentional. Finite memory (states) + infinite tape (data) + head (addressing) + transition table (program). Remove any component and the model weakens. Add a second head - power does not increase. Turing found the minimum.
| TM Component | Analog in a Real Computer | Key Difference |
|---|---|---|
| States Q | CPU registers / instruction pointer | Finite set |
| Tape Gamma | RAM + disk | Infinite in both directions |
| Head | Memory addressing | Single position, shifts by 1 |
| delta (transitions) | Program (microcode) | Lookup table, not arbitrary code |
| q_accept / q_reject | exit(0) / exit(1) | Two terminal states |
What happens when there is no entry in delta for the current (state, symbol) pair?
Configurations and Computation History
A snapshot of a running machine at time t is a **configuration** (instantaneous description, ID). Three elements: tape contents, head position, current state. A computation is a chain of configurations connected by the yields relation. This is not metaphor - this is how TM correctness is proved in Coq and Lean.
A configuration is written as the string uqv, where u is the tape content to the left of the head, q is the current state, v is the symbol under the head and everything to its right. For example: 101 q3 10 means state q3, head on the first character of '10', with '101' to the left.
An accepting computation is a chain ending in a configuration with q_accept. A rejecting one ends in q_reject. A looping computation is an infinite chain. This distinction is critical: the third case is precisely why the Halting Problem is undecidable.
In the configuration 'ab q5 cd', the head is currently positioned on the symbol:
The Halting Problem - First Proof of Uncomputability
**1936. Turing proves: no program can determine whether an arbitrary program on an arbitrary input will halt.** Not because computers are bad. Not because the problem is hard. For fundamental logical reasons. Static analyzers, debuggers, verifiers - all of them run into this wall.
**Halting Problem**: given a TM M and input w, does M halt on w? Formally: HALT = {<M, w> | TM M halts on input w}. Turing proved: HALT is undecidable - no TM H exists that always correctly answers yes or no.
This is not a programming trick - it is the logical structure of Cantor's diagonalization. The same method Cantor used to prove the uncountability of real numbers. Turing applied it to computation. The practical consequence: no static analyzer can guarantee termination of an arbitrary program.
| Problem | Question | Reduces to |
|---|---|---|
| Halting Problem | Does M halt on w? | Base undecidable |
| Equivalence Problem | Are M1 and M2 equivalent? | Halting |
| Rice's Theorem | Non-trivial property of L(M)? | Halting |
| Post Correspondence | Does a matching sequence exist? | Halting |
Why does D in the proof receive its own description <D> as input?
The Language of a Turing Machine
Every TM **defines a language** - the set of strings it accepts. This is the bridge between computability theory and formal language theory. The same bridge a compiler's lexer and parser use to determine program correctness.
**The language of machine M** is L(M) = {w in Sigma* | M(w) = accept}. The set of all strings on which M halts in the accept state. Strings on which M rejects or loops are not in L(M).
A critical distinction: **reject and loop** differ from the observer's perspective but are identical from the language perspective (the string is not in L(M)). This asymmetry is what generates the difference between decidable and recursively enumerable languages - the topic of the next lesson.
| Language Class | Recognizer | Halting Guarantee |
|---|---|---|
| Regular | DFA/NFA | Always halts |
| Context-free | PDA | Always halts |
| Decidable (recursive) | TM (decider) | Always halts |
| Recursively enumerable | TM (recognizer) | May loop on 'no' inputs |
| Unrecognizable | No TM exists | - |
A TM is the simplest device (tape + head + rules) whose language power exceeds all finite and pushdown automata. Yet even a TM cannot recognize **all** languages - unrecognizable languages exist, as demonstrated in the next lesson.
A Turing machine is an abstract artifact with no practical relevance for programmers
A TM defines the fundamental limits of computability for ALL computers. Any laptop, supercomputer, or quantum computer computes no more than a TM (Church-Turing thesis)
If a problem is undecidable on a TM, it is undecidable everywhere. This is not a limitation of the model - it is a property of mathematical reality. No hardware improvement or new programming language can cross this boundary.
TM M has been running for 10 minutes without halting. What can be concluded?
Key Ideas
- **TM = (Q, Sigma, Gamma, delta, q0, q_accept, q_reject)** - seven components fully define the machine's behavior
- **A configuration (ID)** captures the complete snapshot: tape, head position, current state. A computation is a chain of configurations
- **The Halting Problem is undecidable** - proved via self-application and diagonalization: H(<D, <D>>) leads to contradiction
- **L(M)** is the set of strings on which the TM halts in q_accept
- **Reject vs loop** - different outcomes for the observer, but both mean 'not in the language'
- The power comes from minimalism: adding a second head or tape does not expand the class of computable functions
Related Topics
The Turing machine is the central concept linking computability, formal languages, and complexity:
- Computable and Uncomputable Functions — TM is the formal model through which computable functions are defined
- Decidable and RE Languages — Language classification based on TM behavior (halts / loops)
- P and NP Classes — Complexity is measured in TM steps - direct extension of this model
Вопросы для размышления
- Why did Turing choose a tape and head rather than another model? What tape properties make the model sufficiently powerful?
- If a TM were given a second tape or a second head, would it become more powerful - could it solve more problems?
- How are TM limits related to Godel's incompleteness theorem? Why did both results appear at nearly the same time?
Связанные уроки
- comp-01 — Computable function concept is defined in the previous lesson
- comp-03 — Language classification (RE, recursive) builds on TM behavior
- cplx-01 — Algorithm complexity is measured in TM steps
- aut-01-fsm — FSM is a TM restricted to finite tape
- alg-01-big-o — Big-O counts steps in a TM computation
- fl-18-turing-machine