Probability Theory

Infinite-Dimensional Probability

Lesson Objectives

  • Understand Gaussian measures on Banach spaces through one-dimensional projections
  • Master Minlos's theorem and the role of nuclearity of the covariance operator
  • Analyze the construction of Wiener space via Kolmogorov's theorem
  • Connect Fernique concentration and Cameron-Martin to modern ML

Prerequisites

  • Hilbert and Banach spaces
  • Weak convergence of probability measures
  • Gaussian distributions and covariance matrices
  • Compact and nuclear operators
  • Exchangeability and de Finetti's Theorem

How does one build a probability measure on the space of all continuous functions, and why do DALL-E and Stable Diffusion need it?

  • **Diffusion models:** DALL-E and Stable Diffusion generate images by reversing a Gaussian diffusion (noising) process in image or latent space
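  • **Quantum field theory:** Feynman path integrals are formal integrals over spaces of particle trajectories; their Euclidean (imaginary-time) versions become genuine measures on path space via the Feynman-Kac formula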
  • **Neural Network Gaussian Process:** at initialization, the limit of an infinitely wide network is a Gaussian process with the NNGP kernel; the NTK describes how that limit evolves under gradient descent
  • **Gaussian Process Regression:** a measure on function space for Bayesian ML - the standard in Bayesian optimization

Gaussian Measures on Banach Spaces

In finite dimensions the standard Gaussian has a density proportional to exp(-|x|^2/2) with respect to Lebesgue measure. In infinite dimensions there is no Lebesgue measure and hence no density; instead, a measure gamma on a Banach space E is called Gaussian if every continuous linear functional l in E* has a (possibly degenerate) one-dimensional Gaussian distribution under gamma. This is the mathematical foundation of the Neural Network Gaussian Process, the limiting model of infinitely wide neural networks.
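
The projection definition can be written equivalently through the characteristic functional. The formula below is the standard statement; the symbols m (mean functional) and q (covariance form) are notation introduced here for illustration and are not used elsewhere in the lesson.

```latex
% gamma is Gaussian on a separable Banach space E iff, for every \ell in E^*,
\hat{\gamma}(\ell)
  = \int_E e^{\,i\,\ell(x)}\,\gamma(dx)
  = \exp\!\Big( i\,m(\ell) - \tfrac{1}{2}\,q(\ell,\ell) \Big),
\qquad
m(\ell) = \int_E \ell(x)\,\gamma(dx),
\quad
q(\ell,\ell) = \int_E \big(\ell(x) - m(\ell)\big)^2\,\gamma(dx).
```

Each one-dimensional projection l(X) is then N(m(l), q(l,l)), which is exactly the finite-dimensional formula applied one functional at a time.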

A sequence X = (X_1, X_2, ...) of iid N(0,1) variables has ||X||^2 = sum X_i^2 = infinity almost surely. The product measure exists on R^infty, but it assigns the Hilbert space l^2 measure zero, so a 'standard' Gaussian measure on l^2 does not exist; one needs weighted norms or a covariance operator with finite trace.
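
The divergence of the unweighted norm, and the effect of a summable weight sequence, can be checked numerically. This is a minimal sketch; the weights 1/k^2, the sample size, and the seed are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
xi = rng.standard_normal(n)                 # iid N(0,1) coordinates
k = np.arange(1, n + 1, dtype=float)

# Unweighted partial sums grow like n (law of large numbers), so they diverge.
unweighted = np.cumsum(xi**2)

# With weights 1/k^2 the expected squared norm is sum 1/k^2 = pi^2/6 < infinity,
# so the partial sums settle at a finite random value.
weighted = np.cumsum(xi**2 / k**2)

print(unweighted[-1] / n)                   # ratio close to 1
print(weighted[999], weighted[-1])          # nearly identical: the tail is negligible
```

The weights play the role of covariance eigenvalues: a summable sequence corresponds to a trace-class covariance operator.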

Why does Minlos's theorem require nuclearity of the covariance operator?

Wiener Space

In 1923 Norbert Wiener constructed a probability measure on the space C([0,1]) of continuous functions, making Brownian motion a rigorous mathematical object. Today the same idea underlies diffusion generative models: the noising process in models such as DALL-E 3 and Stable Diffusion can be interpreted as a measure on C([0,1]; R^d), the space of continuous paths in image (or latent) space.

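Karhunen-Loeve construction of Wiener measure: W_t = sqrt(2) * sum_{n>=1} xi_n * sin((n - 1/2)*pi*t) / ((n - 1/2)*pi), where the xi_n are iid N(0,1). This is the expansion in eigenfunctions of the covariance operator with kernel min(s,t), a standard tool in infinite-dimensional probability. (Replacing n - 1/2 by n gives the Brownian bridge instead.)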
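
A truncated Karhunen-Loeve sum can be simulated directly. The sketch below uses the eigenpairs of the Brownian covariance kernel min(s,t) on [0,1], namely eigenfunctions sqrt(2) sin((n - 1/2) pi t) with eigenvalues 1/((n - 1/2) pi)^2; the function name `kl_brownian_paths`, the truncation level, and the sample sizes are illustrative assumptions.

```python
import numpy as np

def kl_brownian_paths(n_paths, n_terms, t, rng):
    """Truncated Karhunen-Loeve sum for Brownian motion on [0, 1]."""
    n = np.arange(1, n_terms + 1)
    freq = (n - 0.5) * np.pi                        # 1 / sqrt(eigenvalue)
    # Row n holds sqrt(eigenvalue_n) * eigenfunction_n evaluated on the grid t.
    basis = np.sqrt(2.0) * np.sin(np.outer(freq, t)) / freq[:, None]
    xi = rng.standard_normal((n_paths, n_terms))    # iid N(0,1) coefficients
    return xi @ basis                               # shape (n_paths, len(t))

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 101)
paths = kl_brownian_paths(n_paths=5000, n_terms=200, t=t, rng=rng)

# For Brownian motion Var(W_t) = t; the empirical variance should track this.
emp_var = paths.var(axis=0)
print(emp_var[50], emp_var[-1])
```

Truncating at 200 terms removes only about 0.1% of the variance at t = 1, since the eigenvalues decay like 1/n^2.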

What is the Holder exponent of a Brownian path almost surely?

Gaussian Concentration and Cameron-Martin

A remarkable property of Gaussian measures on Banach spaces is strong concentration of the norm. Fernique's theorem states that E exp(alpha * ||X||^2) is finite for some alpha > 0, so the tails of ||X|| decay at least as fast as exp(-alpha * r^2), the analog of classical Gaussian tails in R^n. Concentration arguments of this kind appear in the analysis of infinitely wide neural networks in NTK theory.
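
For Brownian motion with the sup norm on C([0,1]), Gaussian-type tails can be observed in simulation. The sketch below compares the empirical tail of the sup norm with the explicit bound 2 exp(-r^2/2). That bound comes from the reflection principle, which is specific to Brownian motion and is used here only as a concrete stand-in for the general Fernique statement; the grid size, sample count, and radii are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps = 10_000, 400
dt = 1.0 / n_steps

# Brownian paths sampled on a grid via cumulative sums of N(0, dt) increments.
increments = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
paths = np.cumsum(increments, axis=1)
sup_norm = np.abs(paths).max(axis=1)        # discretized sup norm of each path

radii = np.array([2.0, 2.5, 3.0])
empirical_tail = np.array([(sup_norm > r).mean() for r in radii])

# Reflection principle: P(sup |W| > r) <= 4 P(W_1 > r) <= 2 exp(-r^2 / 2).
gaussian_bound = 2.0 * np.exp(-radii**2 / 2.0)

for r, p, b in zip(radii, empirical_tail, gaussian_bound):
    print(r, p, b)
```

The grid maximum never exceeds the true supremum, so the bound remains valid for the discretized paths.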

Infinite-dimensional probability underlies modern ML

Probability measures on function spaces unite random processes, functional analysis, and Bayesian methods.

  • **Wiener space:** Wiener measure on C([0,1]) is the canonical space for Brownian motion, built via Kolmogorov's extension and continuity theorems
  • **Malliavin calculus:** the Malliavin derivative differentiates functionals on Wiener space; the Cameron-Martin space supplies the directions along which one differentiates
  • **Neural Tangent Kernel:** at initialization an infinitely wide neural network is a Gaussian process (NNGP kernel); in the same limit, gradient-descent training is governed by the NTK
  • **Diffusion models:** Stable Diffusion and DALL-E work with measures on function spaces of denoising trajectories
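
The Gaussian-process limit at initialization can be probed with a small Monte Carlo experiment. This sketch assumes a one-hidden-layer ReLU network f(x) = (1/sqrt(m)) * sum_j v_j * relu(w_j . x) with iid N(0,1) weights; for that architecture the limiting variance at a single input is |x|^2 / 2. The width, input vector, and number of sampled networks are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n_nets, width, d = 4000, 256, 3
x = np.array([1.0, -2.0, 0.5])

W = rng.standard_normal((n_nets, width, d))   # hidden-layer weights, one network per slice
v = rng.standard_normal((n_nets, width))      # output-layer weights
hidden = np.maximum(W @ x, 0.0)               # ReLU features, shape (n_nets, width)
outputs = (v * hidden).sum(axis=1) / np.sqrt(width)

emp_var = outputs.var()
nngp_var = 0.5 * (x @ x)                      # E[relu(w . x)^2] for w ~ N(0, I)
print(emp_var, nngp_var)
```

Each entry of `outputs` is one random network evaluated at x; across networks the values are centered with variance close to the kernel value, as the Gaussian-process limit predicts.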

Summary

  • **Cylindrical measures:** defined through consistent finite-dimensional projections and extended to a measure via Kolmogorov's extension theorem
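  • **Minlos's theorem:** in its Hilbert-space form (Minlos-Sazonov), exp(-<Cx,x>/2) is the characteristic functional of a Gaussian measure exactly when the covariance operator C is nuclear (trace class)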
  • **No Lebesgue measure:** no nonzero sigma-finite translation-invariant Borel measure exists on an infinite-dimensional separable Banach space
  • **Cameron-Martin:** the shift gamma_h(A) = gamma(A - h) is absolutely continuous with respect to gamma iff h lies in the Cameron-Martin space H
  • **Dichotomy:** for h outside H, gamma_h and gamma are mutually singular
  • **Fernique:** Gaussian-type exponential tails for the norm; Borell-Sudakov-Tsirelson Gaussian isoperimetry
  • **Applications:** Wiener space, NTK, diffusion models, Gaussian processes
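
The Cameron-Martin condition is easy to see in coordinates. For a Gaussian measure on l^2 with diagonal covariance and eigenvalues lam_k, the Cameron-Martin norm is |h|_H^2 = sum h_k^2 / lam_k. The sketch below assumes lam_k = 1/k^2 and compares two shift vectors, both of which lie in l^2; the helper name `cm_norm_sq` and the specific sequences are illustrative.

```python
import numpy as np

def cm_norm_sq(h, lam):
    """Squared Cameron-Martin norm sum h_k^2 / lam_k for the measure N(0, diag(lam))."""
    return float(np.sum(h**2 / lam))

for n in (10, 100, 1000):
    k = np.arange(1, n + 1, dtype=float)
    lam = k**-2.0                          # trace-class covariance eigenvalues
    inside = cm_norm_sq(k**-2.0, lam)      # h_k = 1/k^2: partial sums of 1/k^2, bounded
    outside = cm_norm_sq(k**-1.0, lam)     # h_k = 1/k: partial sums equal n, unbounded
    print(n, inside, outside)
```

The first shift has finite Cameron-Martin norm, so the shifted measure stays equivalent to gamma; the second is a perfectly good element of l^2 but lies outside H, so the shifted measure is singular to gamma.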

When is a shift of a Gaussian measure gamma by vector h absolutely continuous with respect to gamma?
