Measure Theory

Ergodic Theory

MCMC (Markov Chain Monte Carlo) works because of the ergodic theorem: a stationary Markov chain can be viewed as a measure-preserving system (the shift on its space of trajectories), and when that system is ergodic, the time average along a trajectory equals the integral against the target distribution. Much of computational Bayesian statistics relies on this fact.

  • MCMC (Metropolis-Hastings, NUTS): the ergodic theorem guarantees that averages along the chain converge to expectations under the target distribution
  • Statistical physics: the ergodic hypothesis is the foundation of thermodynamics
  • Equidistribution of {nα}: Weyl's theorem as a consequence of the ergodic theorem
  • Shannon's source theorem: code length ≈ entropy (consequence of the ergodic theorem)
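The MCMC claim can be illustrated with a minimal sketch: a random-walk Metropolis sampler whose time average of f(x) = x² approaches the integral of f against a standard normal target (which equals 1). The function name `metropolis_time_average`, the uniform proposal, the step size, and the seed are all illustrative assumptions, not part of the lesson.

```python
import math
import random


def metropolis_time_average(f, log_target, n_steps, step=2.0, x0=0.0, seed=0):
    """Time average of f along a random-walk Metropolis chain.

    log_target is the log of an (unnormalized) target density.
    The proposal is symmetric: x + Uniform(-step, step).
    """
    rng = random.Random(seed)
    x = x0
    total = 0.0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
        total += f(x)  # the current state counts, whether we moved or not
    return total / n_steps


def log_standard_normal(x):
    # Log-density of N(0, 1) up to an additive constant.
    return -0.5 * x * x


if __name__ == "__main__":
    estimate = metropolis_time_average(lambda x: x * x, log_standard_normal, 200_000)
    print("time average of x^2:", estimate)  # should be close to 1
```

The chain never draws independent samples from the target; the estimate is a pure time average along one trajectory, which is exactly the quantity the ergodic theorem controls.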

Measure-Preserving Maps

A familiar illustration is shuffling a playlist: a fair shuffle must preserve the uniform measure on the tracks, which makes it a measure-preserving map. Formally, a measurable map T: (X, 𝒜, μ) → (X, 𝒜, μ) is called **measure-preserving** if μ(T⁻¹(E)) = μ(E) for all E ∈ 𝒜 (equivalently: μ ∘ T⁻¹ = μ). Intuitively, T 'shuffles' the points of the space without changing the measure of sets. Examples: rotation of the circle, torus shift, expanding maps.

**Examples of measure-preserving maps:**

  1. **Circle rotation:** T(x) = x + α (mod 1) on [0,1) with Lebesgue measure. Every rotation is measure-preserving.
  2. **Bernoulli shift:** the left shift T on {0,1}^ℤ (two-sided infinite binary sequences) with the product measure. This is a model of 'coin flips in time'.
  3. **Doubling map:** T(x) = 2x (mod 1) on [0,1) with Lebesgue measure. Measure-preserving, but non-invertible!
  4. **Hamiltonian flow:** in mechanics, the flow along a Hamiltonian vector field. Liouville's theorem: phase space volume is preserved.
  5. **Gauss map:** T(x) = {1/x} (fractional part of 1/x) for x ∈ (0,1), with T(0) = 0. It preserves the Gauss measure with density 1/((1+x) ln 2) on [0,1), but not Lebesgue measure. Connection with continued fractions!
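The definition μ(T⁻¹(E)) = μ(E) can be checked empirically: if x is sampled from μ, then the probability that T(x) lands in E is μ(T⁻¹(E)), so it should match μ(E). The sketch below does this for the doubling map (Lebesgue measure) and the Gauss map (Gauss measure, sampled through its inverse CDF x = 2^u − 1). All function names are hypothetical helpers chosen for this illustration.

```python
import math
import random


def doubling(x):
    return (2.0 * x) % 1.0


def gauss_map(x):
    # Fractional part of 1/x, with the convention T(0) = 0.
    return (1.0 / x) % 1.0 if x > 0.0 else 0.0


def uniform_sampler(rng):
    # Draws from Lebesgue measure on [0, 1).
    return rng.random()


def gauss_sampler(rng):
    # Gauss measure has CDF F(x) = log2(1 + x), so F^{-1}(u) = 2**u - 1.
    return 2.0 ** rng.random() - 1.0


def gauss_measure(a, b):
    # Gauss measure of the interval [a, b).
    return math.log2((1.0 + b) / (1.0 + a))


def pushforward_mass(T, sampler, a, b, n=200_000, seed=0):
    """Monte Carlo estimate of mu(T^{-1}[a, b)) for x drawn from mu."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if a <= T(sampler(rng)) < b)
    return hits / n


if __name__ == "__main__":
    print(pushforward_mass(doubling, uniform_sampler, 0.2, 0.5), "vs", 0.3)
    print(pushforward_mass(gauss_map, gauss_sampler, 0.0, 0.5), "vs", gauss_measure(0.0, 0.5))
    # Lebesgue measure is NOT preserved by the Gauss map:
    print(pushforward_mass(gauss_map, uniform_sampler, 0.0, 0.5), "vs", 0.5)
```

The last line is the instructive one: the same map can preserve one measure and fail to preserve another, so 'measure-preserving' is always a statement about the pair (T, μ).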

A map T preserves the measure μ. What does this mean formally?

Ergodicity: Only Trivial Invariant Sets

A measure-preserving map T is called **ergodic** if, for every E ∈ 𝒜, T⁻¹(E) = E ⟹ μ(E) ∈ {0, μ(X)} (invariant sets are trivial: measure 0 or full measure). Intuition: the system 'mixes' so thoroughly that no nontrivial part of the space is invariant under T.

**Ergodicity: examples and counterexamples**

**Ergodic:**

  • Circle rotation by an irrational angle (α ∉ ℚ): every orbit {x + nα mod 1} is dense in [0,1)
  • Doubling map T(x) = 2x mod 1 on [0,1) with Lebesgue measure
  • Bernoulli shift on {0,1}^ℤ with the product measure

**NOT ergodic:**

  • Circle rotation by a rational angle α = p/q ∈ ℚ (in lowest terms): each orbit is finite (q points), and there are many invariant sets of intermediate measure
  • T = identity map: every measurable set is invariant

**Practical characterization:** T is ergodic ⟺ every T-invariant function f is constant a.e.: f∘T = f ⟹ f = const μ-a.e.
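The contrast between rational and irrational rotations can be made concrete with a short sketch. It uses exact `Fraction` arithmetic for the rational case, builds an explicit invariant set of measure 1/2 for α = 1/3, and counts how many of 100 equal bins an irrational orbit visits. The helper names and the choice α = √2 mod 1 are assumptions made for illustration.

```python
import math
from fractions import Fraction


def rational_orbit(x0, alpha, max_steps=1000):
    """Exact orbit of x0 under x -> x + alpha (mod 1), using Fractions."""
    orbit = []
    x = x0 % 1
    while x not in orbit and len(orbit) < max_steps:
        orbit.append(x)
        x = (x + alpha) % 1
    return orbit


def in_invariant_set(x):
    """Membership in E = [0,1/6) U [1/3,1/2) U [2/3,5/6), which has measure 1/2.

    E is invariant under rotation by 1/3 because x mod 1/3 is unchanged.
    """
    return (x % Fraction(1, 3)) < Fraction(1, 6)


def bins_visited(alpha, n_steps, n_bins=100, x0=0.0):
    """Number of equal-width bins of [0,1) hit by the first n_steps orbit points."""
    visited = set()
    for n in range(n_steps):
        x = (x0 + n * alpha) % 1.0
        visited.add(min(int(x * n_bins), n_bins - 1))
    return len(visited)


if __name__ == "__main__":
    print(rational_orbit(Fraction(1, 10), Fraction(1, 3)))  # three points only
    print(bins_visited(math.sqrt(2) % 1.0, 10_000))          # orbit spreads over all bins
```

For α = 1/3 the set E is mapped to itself but has measure 1/2, which is neither 0 nor 1, so the rotation fails the definition of ergodicity directly.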

Why is circle rotation by α = 1/3 not ergodic?

Birkhoff's Theorem: Time = Space

**Birkhoff's Ergodic Theorem (1931):** If (X, 𝒜, μ) is a probability space, T is a measure-preserving map, and f ∈ L¹(μ), then the time average converges a.e. and in L¹: lim_{N→∞} (1/N) Σ_{n=0}^{N-1} f(Tⁿx) = E[f|𝒥], where 𝒥 is the σ-algebra of T-invariant sets. **If T is ergodic,** then E[f|𝒥] = ∫f dμ is constant a.e., i.e., **time average = space average**.
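As a worked instance, take the circle rotation T(x) = x + α (mod 1) with irrational α and the test function f(x) = e^{2πikx} for a nonzero integer k. The time average is a geometric sum, so it can be evaluated in closed form (the denominator is nonzero because kα is not an integer):

```latex
\frac{1}{N}\sum_{n=0}^{N-1} f(T^n x)
  = \frac{1}{N}\sum_{n=0}^{N-1} e^{2\pi i k (x + n\alpha)}
  = \frac{e^{2\pi i k x}}{N}\cdot
    \frac{1 - e^{2\pi i k N \alpha}}{1 - e^{2\pi i k \alpha}},
\qquad
\left|\frac{1}{N}\sum_{n=0}^{N-1} f(T^n x)\right|
  \le \frac{2}{N\,\lvert 1 - e^{2\pi i k \alpha}\rvert}
  \xrightarrow[N\to\infty]{} 0
  = \int_0^1 e^{2\pi i k x}\,dx .
```

For these particular functions the convergence holds at every starting point x, not only almost everywhere; the general theorem for f ∈ L¹ promises only a.e. convergence.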

**The meaning of Birkhoff's theorem:**

Time average: ⟨f⟩_time = lim_{N→∞} (f(x) + f(Tx) + ... + f(T^{N-1}x)) / N

Space average: ⟨f⟩_space = ∫_X f dμ

For ergodic T: ⟨f⟩_time = ⟨f⟩_space (for a.e. starting point x)

**Applications:**

  • **Statistical physics:** The ergodic hypothesis is the foundation of thermodynamics. Time average = ensemble average.
  • **Number theory:** Weyl's theorem: (1/N) Σ_{n=0}^{N-1} e^{2πi·k·nα} → 0 for irrational α and every nonzero integer k (equidistribution of {nα})
  • **Data compression:** Shannon-McMillan-Breiman theorem: optimal code length ≈ source entropy
  • **ML:** Markov chains: for an ergodic chain, MCMC sample averages converge to expectations under the stationary distribution by the ergodic theorem
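A minimal numerical sketch of 'time average = space average' for circle rotations follows. It averages the indicator of [0, 0.25) along an orbit (space average 0.25) and evaluates the normalized Weyl sum. The function names, the test function, and the choice α = √2 are assumptions for illustration only.

```python
import cmath
import math


def rotation_time_average(f, alpha, x0, n_steps):
    """(1/N) * sum of f(T^n x0) for the rotation T(x) = x + alpha (mod 1)."""
    total = 0.0
    for n in range(n_steps):
        # Computing x0 + n*alpha directly avoids accumulating rounding error.
        total += f((x0 + n * alpha) % 1.0)
    return total / n_steps


def indicator_quarter(x):
    # Indicator of [0, 0.25); its integral over [0, 1) is 0.25.
    return 1.0 if x < 0.25 else 0.0


def weyl_sum_modulus(alpha, k, n_terms):
    """|(1/N) * sum_{n<N} exp(2*pi*i*k*n*alpha)|."""
    s = sum(cmath.exp(2j * math.pi * k * n * alpha) for n in range(n_terms))
    return abs(s) / n_terms


if __name__ == "__main__":
    irrational = math.sqrt(2)
    print(rotation_time_average(indicator_quarter, irrational, 0.1, 100_000))  # near 0.25
    print(rotation_time_average(indicator_quarter, 1 / 3, 0.1, 30_000))        # near 1/3
    print(weyl_sum_modulus(irrational, 1, 10_000))                             # near 0
```

With α = 1/3 and starting point 0.1 the orbit visits only 0.1, 0.4333..., 0.7666..., so the time average of the indicator is 1/3 rather than 0.25: the rational rotation is not ergodic, and the time average depends on where the orbit starts.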

What does Birkhoff's ergodic theorem state for ergodic T?

Key Ideas

  • Measure-preserving T: μ(T⁻¹(E)) = μ(E) - the measure is invariant under T
  • Ergodicity: T⁻¹(E) = E ⟹ μ(E) = 0 or μ(X) (only trivial invariants)
  • Irrational rotation - ergodic; rational rotation - not (finite orbits)
  • Birkhoff's theorem: (1/N)Σf(Tⁿx) → ∫f dμ (a.e.) when T is ergodic and μ is a probability measure
  • Time average = space average (ergodic hypothesis in physics)
  • MCMC = ergodic sampling: time averages → integrals

Related Topics

Ergodic theory lies at the intersection of measure theory, dynamical systems, and probability:

  • Regular Measures: on a metric space, the invariant Borel probability measures of a dynamical system are regular
  • Convergence Theorems: Birkhoff's theorem gives a.e. convergence and extends the SLLN from i.i.d. to stationary (dependent) sequences

Questions for Reflection

  • What is the difference between the ergodic hypothesis in physics and Birkhoff's mathematical theorem?
  • Why do MCMC algorithms require ergodicity of the Markov chain? What happens when it fails?
  • How is Birkhoff's theorem related to the Strong Law of Large Numbers for i.i.d. random variables?

Related Lessons

  • prob-05-independence