Automata and Cognition
Finite State Machines - The Language of States
Цели урока
- Understand that FSM = states + inputs + transitions, and where it applies
- Know the formal DFA definition (Q, Σ, δ, q₀, F) and NFA/DFA equivalence
- See the link between regular expressions and FSMs via Kleene's theorem
- Recognize dangerous regex patterns (ReDoS) and how to defend against them
Предварительные знания
- Basic understanding of sets and functions
- Basic programming experience (any language)
- Helpful: intuition about regular expressions
TCP connections are FSMs. Regular expressions are FSMs. NPCs in Doom are FSMs. Elevators are FSMs. One concept from 1943 (McCulloch-Pitts) still underlies the digital world.
- **TCP/IP stack (RFC 793, 1981)** - 11 states per connection, the diagram hasn't changed in 44 years
- **Cloudflare ReDoS (2019)** - one regex with nested quantifier took 1.1.1.1 down for 27 minutes
- **Doom (1993)** - all enemies run on FSMs with patrol/chase/attack/flee states
- **Stack Overflow (2016)** - 34 minutes downtime due to regex with exponential backtracking
- **React state machines (XState)** - modern library for UI components based on FSM
McCulloch and Pitts - The First Model of an Artificial Neuron
In 1943 neurophysiologist McCulloch and logician Pitts published "A Logical Calculus of Ideas Immanent in Nervous Activity" - the first formal model of a neuron as a logical element. From this paper grew two branches: 1. formal grammars and automata theory through Kleene and Chomsky 2. artificial neural networks through Rosenblatt (perceptron 1958) and Hinton (deep learning 2006). One paper - the root of both FSMs and modern AI.
What Is a Finite State Machine
**Every time a webpage opens, a TCP handshake happens. And every TCP connection is a finite state machine with 11 states: CLOSED → LISTEN → SYN_SENT → SYN_RECEIVED → ESTABLISHED → FIN_WAIT_1 → ... In RFC 793 (1981) this automaton was drawn as a diagram - and 44 years later nothing has changed.** Finite state machines are not academic abstractions. They're the basic language the digital world's infrastructure is written in.
**A finite state machine (FSM)** is a behavior model with 4 components: 1. a finite number of **states** 2. **input alphabet** (what can arrive) 3. **transition function** (which input → which state) 4. **initial state**. No memory beyond the current state.
Where FSMs work right now
| Where | States | Scale |
|---|---|---|
| TCP/IP stack | 11 states per connection | ~10^10 active connections worldwide |
| Regex engines (NFA/DFA) | Pattern-dependent | Form validation, traffic parsing - billions per second |
| Compiler lexers | States for numbers/strings/identifiers | Every compilation of every program |
| NPCs in games (Doom 1993, GTA V) | Patrol → Chase → Attack → Flee | Millions of game sessions |
| Vending machines, ATMs, elevators | Idle → Selecting → Dispensing | Physical world |
| UI components (React state) | Loading → Success → Error | Every web app |
Simple example - subway turnstile
**2 states, 2 inputs (`card`, `push`), 4 transitions.** The turnstile doesn't know how many people passed through it, who passed, at what time. Only the current state and the reaction to the next input. That's the essence of FSM - **complete behavior is determined by the current state**, history is not needed.
FSM is just a bunch of `if/else` statements in code
FSM is an explicit model where states and transitions are visible and formally verifiable
A pile of `if/else` typically drags implicit state through dozens of flags and variables - hard to test, legions of bugs. FSM makes all possible states explicit. A diagram can be drawn, correctness proved, unreachable states found. TLA+ and SPIN automatically check such models for properties like "the turnstile never lets anyone pass without payment".
What distinguishes an FSM from a regular program with variables?
Formal Definition and DFA vs NFA
**Formally a DFA (Deterministic Finite Automaton) is a 5-tuple** `M = (Q, Σ, δ, q₀, F)`:
- **Q** - finite set of states (e.g. `{LOCKED, UNLOCKED}`)
- **Σ** - input alphabet (`{card, push}`)
- **δ: Q × Σ → Q** - transition function (gives new state from current + input)
- **q₀ ∈ Q** - initial state (`LOCKED`)
- **F ⊆ Q** - set of accepting states (for string recognizers)
**DFA (deterministic)**: for each input in each state - exactly one transition. **NFA (nondeterministic)**: one input can have 0, 1, or several simultaneous transitions. NFAs are easier to design, DFAs run faster. The famous Rabin-Scott theorem (1959) proves any NFA can be converted to an equivalent DFA - though with worst-case exponential blow-up in number of states.
Example: recognize strings ending in 'ab'
**Cost of determinism:** when converting NFA → DFA the state count can grow as 2^n. That's why real regex engines (Go regexp, Google's RE2) use NFA directly with simulation of all possible states in parallel - guaranteed O(n×m) instead of PCRE's exponential blow-up.
NFA is more powerful than DFA - can recognize more languages
NFA and DFA are equivalent in expressive power - they recognize exactly the same class of languages (regular)
This is the famous 1959 result of Rabin-Scott, for which they got the Turing Award in 1976. NFA is easier to write and often more compact, but any NFA converts to a DFA (subset construction). Real power comes only from adding memory - a stack (PDA = pushdown automaton) or a tape (Turing machine).
A DFA with 10 states is converted to an equivalent NFA. How many states can the NFA have at best?
Regular Expressions Are FSMs
**Writing `/^[a-z]+@[a-z]+\.[a-z]+$/` - under the hood an FSM gets built.** Each character in the regex turns into states and transitions per Kleene's theorem (1956). This is the most massive practical use of FSMs in industry: every form validation, every grep, every log parser.
Operator-to-FSM correspondence
| Regex | FSM construction | Example |
|---|---|---|
| `a` | One transition on character 'a' | q0 --a--> q1 |
| `a|b` | Alternative - two paths from one state | From q0 either a or b |
| `a*` | Loop with skip option | q0 --a--> q0 (self-loop) |
| `ab` | Concatenation - sequential transitions | q0 --a--> q1 --b--> q2 |
| `a+` | One or more = a · a* | Min one transition + loop |
| `[a-z]` | One transition with large alphabet | 26 parallel transitions |
The famous ReDoS case
**Cloudflare, July 2, 2019. 13:42 UTC. One new regex in a WAF rule pegs CPU on every server to 100% in 10 minutes. The 1.1.1.1 (DNS) site is unavailable for 27 minutes.** The cause - pattern `(?:(?:"|'|\]|\}|\\|\d|...|\?)+)+` created an NFA where one input spawns an exponential number of simultaneous paths. This is ReDoS - Regular Expression Denial of Service.
**ReDoS defense:** 1. use NFA-simulation engines - RE2 (Google), Rust regex, Go regexp. 2. forbid nested quantifiers in user-input regex. 3. put a timeout on pattern execution. Stack Overflow went down for 34 minutes in 2016 because of such a regex - the public postmortem is worth reading.
Key takeaways
- FSM = states + inputs + transitions. Complete behavior is determined by the current state
- FSMs are everywhere: TCP, regex, lexers, NPCs, UI - the same machinery underneath
- DFA (one transition per input) and NFA (several) are equivalent in expressive power (Rabin-Scott 1959)
- Regular expressions = NFAs by Kleene's theorem (1956). All engines build FSMs under the hood
- Without memory, an FSM cannot count. Recognizing `aⁿbⁿ` requires a PDA (stack), and arbitrary languages require a Turing machine
What's next in the course
From simple FSMs to models with memory and probability.
- Markov chains — FSM with transition probabilities - text models, recommendations, PageRank
Key Concepts
- FSM is a 5-tuple (Q, Σ, δ, q₀, F): states, alphabet, transition function, initial state, accepting states
- Complete FSM behavior is determined by the current state alone - history is not needed
- DFA: exactly one transition per input per state. NFA: one input can produce multiple simultaneous transitions
- DFA and NFA are equivalent in expressive power (Rabin-Scott theorem, 1959) - they recognize exactly the same class of regular languages
- Regular expressions = FSMs via Kleene's theorem (1956): every pattern compiles to a state graph
- ReDoS is an attack on backtracking engines: nested quantifiers create an exponential number of paths. Defense: RE2, Go regexp (NFA-simulation, O(n))
Вопросы для размышления
- Consider an app used daily. Which state machines run explicitly in it (loading states, forms, navigation), and which implicitly (auth flow, online/offline transitions)?
Связанные уроки
- fl-01-intro — Regular languages are exactly what finite automata recognize
- aut-02-markov-chains — Markov chains are probabilistic FSMs with transition probabilities
- nlp-01 — Tokenizers (tiktoken) are DFAs over Unicode; FSMs underlie text processing
- comp-01 — Turing machines extend finite automata with an infinite memory tape
- fl-06-dfa