Optimal Transport
Geodesics and Displacement Interpolation
Displacement interpolation is the mathematical foundation unifying diffusion models, flow matching, and optimal transport. Stable Diffusion literally travels a geodesic in measure space.
- **Rectified Flow (Liu 2022):** optimal flow = straight paths = displacement interpolation. Stable Diffusion 3 and Flux use this for 10x speedup.
- **Color transfer (Reinhard 2001):** displacement interpolation for transferring color palette between images. Used in film post-production.
- **Shape morphing:** geodesic between two 3D shapes = smooth deformation. Applied in medical imaging (MRI brain atlases).
McCann Interpolation: Geodesics in Wasserstein Space
**Diffusion models (Stable Diffusion, 2022) implement displacement interpolation in probability measure space , a 14-step path from noise to a 512×512 image.** A geodesic in (𝒫₂, W₂) is not a mixture of distributions but a simultaneous straight-line motion of every particle of mass.
Displacement interpolation differs from a linear mixture (1−t)μ₀ + tμ₁: the mixture creates a bimodal distribution, while the geodesic smoothly translates mass. Diffusion models travel the geodesic from N(0,I) to p_data.
How does displacement interpolation differ from the linear mixture (1−t)μ₀ + tμ₁?
A linear mixture creates an intermediate measure with two modes. Displacement interpolation moves each particle in a straight line , the mode remains single and shifts smoothly from μ₀ to μ₁.
Brenier's Theorem: Existence of the Optimal Transport Map
Monge sought a deterministic map T: X→Y transporting μ to ν at minimum cost. **Brenier's theorem (1991)** establishes existence and uniqueness for the quadratic cost: T = ∇φ for some convex function φ.
Brenier's theorem requires absolute continuity of μ (no atoms). In the discrete case the optimal map may be non-unique , use the Kantorovich plan instead.
What is the optimal OT map T: μ→ν for one-dimensional distributions?
In 1D the optimal map is the monotone rearrangement T = F_ν⁻¹(F_μ(x)): each quantile of μ is sent to the corresponding quantile of ν.
Benamou-Brenier Formula: Dynamic Formulation of W₂
The static W₂ formulation searches over plans γ. The **Benamou-Brenier formula (2000)** reformulates it as a fluid dynamics problem: find a velocity field v(x,t) transporting ρ₀ to ρ₁ with minimum kinetic energy.
The Benamou-Brenier formula is the basis for numerical W₂ solvers via convex saddle-point optimization. Flow matching in diffusion models directly parametrizes the optimal velocity field v*.
What quantity does the Benamou-Brenier formula minimize to compute W₂²?
Benamou-Brenier: W₂² is the minimum kinetic energy of a velocity field transporting μ to ν subject to the continuity equation.
Key Ideas
- **Displacement interpolation:** μ_t = ((1−t)id + tT*)#μ₀. Each particle moves in a straight line to its image under the optimal map.
- **Brenier's theorem:** optimal map T* = ∇φ (gradient of a convex function). In 1D: T = F_ν⁻¹ ∘ F_μ.
- **W₂² via quantiles (1D):** W₂²(μ,ν) = ∫₀¹ |F_μ⁻¹(q) − F_ν⁻¹(q)|² dq.
- **Benamou-Brenier:** W₂² = min kinetic energy of flow. Optimal velocity v* = T(x)−x (constant along the geodesic).
Связанные уроки
- ot-18 — Uses the W₂ metric from the previous lesson
- ot-14-gradient-flows — Geodesics underlie Wasserstein gradient flows