Differential Equations
The Heat Equation
Why do image blurring and coffee cooling obey the same equation? Both are diffusion. A Gaussian blur, as applied in Photoshop, is the solution of the heat equation at a fixed time. The diffusion models behind Stable Diffusion are built on a closely related noising process.
- **Gaussian blur = heat equation:** convolving an image with a Gaussian kernel of standard deviation σ is equivalent to running the heat equation to time t = σ²/(2α), which is t = σ²/2 for α = 1.
- **GPU thermal maps:** simulating heat distribution in processors and GPUs for cooling system design is a direct numerical solution of ∂T/∂t = α∇²T.
- **Diffusion models (AI):** in DDPM-style models (Stable Diffusion, DALL-E), the forward process adds Gaussian noise step by step, so the distribution of the noised data spreads out much as a heat-equation solution does. The network learns to reverse it.
Prerequisites
Derivation of the Heat Equation
Fourier's law of heat conduction: heat flux **q = -κ∇T** (heat flows from hot to cold). Energy conservation for a volume element, with constant κ: ρc·∂T/∂t = -div(q) = κ∇²T. Defining **α = κ/(ρc)** (thermal diffusivity) and writing u for the temperature T gives: **∂u/∂t = α∇²u**.
**Physical meaning:** u(x,t) is the temperature at position x and time t. The Laplacian ∇²u measures how much the average of u over a small neighborhood exceeds the value of u at the point itself. If u is below the neighborhood average, then ∇²u > 0, heat flows in, and u rises.
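A minimal NumPy sketch of this statement in one dimension: the centered second difference equals twice the gap between the neighbor average and the value itself, divided by Δx². The grid spacing and the test profile u = x² are illustrative assumptions, not part of the lesson.

```python
import numpy as np

dx = 0.1
x = np.arange(0.0, 1.0 + dx / 2, dx)   # grid on [0, 1]
u = x**2                               # test profile with u'' = 2 everywhere

# Average of the two neighbors of every interior point.
neighbor_avg = 0.5 * (u[:-2] + u[2:])

# Centered second difference (discrete 1D Laplacian) at interior points.
laplacian = (u[:-2] - 2.0 * u[1:-1] + u[2:]) / dx**2

# The same quantity written as "neighbor average minus the value itself".
from_average = 2.0 * (neighbor_avg - u[1:-1]) / dx**2

print(laplacian)
print(np.allclose(laplacian, from_average))
```

Where the neighbor average exceeds u, the discrete Laplacian is positive and the heat equation raises u at that point.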
| Quantity | Symbol | Units |
|---|---|---|
| Thermal conductivity | κ | W/(m·K) |
| Density | ρ | kg/m³ |
| Specific heat capacity | c | J/(kg·K) |
| Thermal diffusivity | α = κ/(ρc) | m²/s |

| Material | α (m²/s) | Heat spreads |
|---|---|---|
| Copper | 1.17×10⁻⁴ | fast |
| Wood | ~10⁻⁷ | slow |
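To put "fast" and "slow" into numbers, one rough estimate is that heat needs a time of about d²/α to diffuse across a distance d. This is an order-of-magnitude rule from dimensional analysis, not a formula stated in the lesson. The sketch below applies it to a hypothetical thickness of 1 cm using the two diffusivities from the table.

```python
# Rough diffusion time t ~ d**2 / alpha (order-of-magnitude estimate only).
diffusivity = {"copper": 1.17e-4, "wood": 1e-7}   # m^2/s, values from the table
d = 0.01                                          # assumed thickness: 1 cm

times = {name: d**2 / a for name, a in diffusivity.items()}
for name, t in times.items():
    print(f"{name}: about {t:.3g} s")
```

Under this estimate the copper layer equilibrates in under a second, while the wooden one takes on the order of a thousand seconds.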
The heat equation is a **parabolic** PDE. Boundary conditions: Dirichlet (u specified on the boundary), Neumann (∂u/∂n specified, i.e. a prescribed flux), Robin (a·u + b·∂u/∂n = g, which models convective exchange; the constants a and b are unrelated to the diffusivity α). Cauchy problem: u(x,0) = f(x) is given as the initial temperature distribution.
A rod has insulated ends (Neumann condition ∂u/∂x = 0). What happens to the temperature as t → ∞?
Separation of Variables
Assume **u(x,t) = X(x)·T(t)**. Substituting into ∂u/∂t = α∂²u/∂x² and dividing by αXT: **T'/(αT) = X''/X = -λ**. The left side depends only on t and the right side only on x, so both must equal a constant -λ.
Eigenvalue problem for X: X'' + λX = 0 with Dirichlet conditions X(0) = X(L) = 0. Solution: **λₙ = (nπ/L)², Xₙ(x) = sin(nπx/L)**, n = 1, 2, 3, ... For T: T' = -αλₙT → **Tₙ(t) = e^{-αλₙt}**.
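**General solution:** u(x,t) = Σ bₙ sin(nπx/L) e^{-α(nπ/L)²t}, where bₙ = (2/L)∫₀ᴸ f(x) sin(nπx/L) dx are the Fourier sine coefficients of the initial condition f. Each harmonic decays at its own rate: harmonic n decays n² times faster than the fundamental (n = 1), so high-frequency content disappears first. The heat equation is a smoother.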
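The series can be evaluated numerically as a sketch, assuming the coefficients are the Fourier sine coefficients of the initial condition, bₙ = (2/L)∫₀ᴸ f(x) sin(nπx/L) dx, computed here with the trapezoidal rule. The function names, the rod parameters, and the two-harmonic initial profile are illustrative choices.

```python
import numpy as np

def sine_coefficients(f, L, n_terms, n_quad=2001):
    """b_n = (2/L) * integral_0^L f(x) sin(n*pi*x/L) dx via the trapezoidal rule."""
    x = np.linspace(0.0, L, n_quad)
    dx = x[1] - x[0]
    n = np.arange(1, n_terms + 1)[:, None]        # column of mode numbers
    y = f(x) * np.sin(n * np.pi * x / L)          # shape (n_terms, n_quad)
    integral = dx * (y.sum(axis=1) - 0.5 * (y[:, 0] + y[:, -1]))
    return (2.0 / L) * integral

def series_solution(b, x, t, L, alpha):
    """Sum of b_n * sin(n*pi*x/L) * exp(-alpha*(n*pi/L)**2 * t) over all modes."""
    n = np.arange(1, len(b) + 1)[:, None]
    modes = np.sin(n * np.pi * x / L) * np.exp(-alpha * (n * np.pi / L) ** 2 * t)
    return b @ modes

# Assumed example: rod of length 1 with two harmonics in the initial profile.
L, alpha, t = 1.0, 0.01, 5.0

def f(x):
    return np.sin(np.pi * x / L) + 0.5 * np.sin(3 * np.pi * x / L)

b = sine_coefficients(f, L, n_terms=5)
x_eval = np.linspace(0.0, L, 11)
u = series_solution(b, x_eval, t, L, alpha)

# Decay factor of each harmonic at time t; the exponent grows like n**2.
decay = np.exp(-alpha * (np.arange(1, 6) * np.pi / L) ** 2 * t)
print(np.round(b, 6))
print(decay)
```

For this initial profile only b₁ and b₃ are nonzero, and the third harmonic has an exponent nine times larger than the first.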
In u(x,t) = Σ bₙ sin(nπx/L)e^{-α(nπ/L)²t}, which harmonic decays fastest?
The Heat Kernel
For the Cauchy problem on the whole line (no boundaries), the solution is a convolution with the **heat kernel**: **G(x,t) = (4παt)^{-1/2} exp(-x²/(4αt))**. Given u(x,0) = f(x): **u(x,t) = ∫_{-∞}^{∞} G(x-y,t) f(y) dy**, a convolution with a Gaussian.
The heat kernel G(x,t) is a Gaussian with variance σ²(t) = 2αt. As t → 0: G → δ(x). As t → ∞: G → 0 at every point while its integral stays equal to 1 (heat spreads over the whole line). This is the mathematical basis of **Gaussian blur** in computer vision.
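These properties can be checked numerically. The sketch below samples the heat kernel on a grid, confirms unit mass and variance 2αt, and then blurs a Gaussian initial profile by discrete convolution. It relies on the standard fact that convolving two Gaussians adds their variances; the grid, the parameters, and the initial width s0 are assumptions made for illustration.

```python
import numpy as np

def heat_kernel(x, t, alpha=1.0):
    """G(x,t) = (4*pi*alpha*t)**(-1/2) * exp(-x**2 / (4*alpha*t))."""
    return np.exp(-x**2 / (4.0 * alpha * t)) / np.sqrt(4.0 * np.pi * alpha * t)

alpha, t = 1.0, 0.5
x = np.linspace(-20.0, 20.0, 4001)   # symmetric grid with x = 0 at the middle node
dx = x[1] - x[0]

G = heat_kernel(x, t, alpha)
mass = np.sum(G) * dx                # should be close to 1
variance = np.sum(x**2 * G) * dx     # should be close to 2*alpha*t

# Initial profile: a Gaussian with standard deviation s0 (illustrative choice).
s0 = 1.0
f = np.exp(-x**2 / (2.0 * s0**2)) / np.sqrt(2.0 * np.pi * s0**2)

# u(x,t) = integral of G(x - y, t) f(y) dy, approximated by a discrete convolution.
u = np.convolve(f, G, mode="same") * dx

# Exact result: a Gaussian whose variance has grown to s0**2 + 2*alpha*t.
s2 = s0**2 + 2.0 * alpha * t
u_exact = np.exp(-x**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
max_err = np.max(np.abs(u - u_exact))

print(mass, variance, max_err)
```

Read as an image operation, the same convolution is a Gaussian blur with σ² = 2αt applied to one row of pixels.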
**Diffusion models in generative AI:** Stable Diffusion and DALL-E use a forward process that gradually adds Gaussian noise. It is described by a stochastic differential equation (SDE) whose probability density obeys a heat-type diffusion equation. The neural network learns to reverse this process, in effect undoing the diffusion.
Einstein and Brownian motion (1905)
In one of his annus mirabilis papers, Einstein derived that the concentration of colloidal particles obeys the heat equation with diffusion coefficient D = k_B·T/(6πηr), where k_B is Boltzmann's constant, T the absolute temperature, η the viscosity of the fluid, and r the particle radius. This linked Fourier's macroscopic equation to the microscopic randomness of molecules and gave a measurable prediction that became strong evidence for the existence of atoms. Perrin confirmed the formula experimentally in 1908 (Nobel Prize 1926). The heat kernel (4παt)^(-1/2)·exp(-x²/(4αt)) is exactly the Gaussian density with σ² = 2αt: every 'unit of heat' performs a Brownian walk.
The same Gaussian kernel now underlies SDE models in finance (the Black-Scholes equation reduces to the heat equation after a change of variables), physics, and diffusion models in generative AI.
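The Brownian-walk picture can be illustrated with a small simulation. In this sketch many independent walkers take steps of ±h, with h chosen so that h² = 2αΔt; the variance of their positions after time t should then be close to 2αt, the variance of the heat kernel. The number of walkers, the step count, and the seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the run is repeatable
alpha, dt = 1.0, 0.01
n_steps, n_walkers = 100, 20_000

h = np.sqrt(2.0 * alpha * dt)            # step length with h**2 = 2*alpha*dt
# Each entry is -1 or +1 with equal probability.
signs = 2 * rng.integers(0, 2, size=(n_walkers, n_steps)) - 1
positions = h * signs.sum(axis=1)        # final position of every walker

t = n_steps * dt
sample_variance = positions.var()
print(f"t = {t}: sample variance {sample_variance:.3f}, "
      f"2*alpha*t = {2 * alpha * t:.3f}")
```

A histogram of `positions` would approach the Gaussian profile of the heat kernel as the number of steps grows, which is the central limit theorem at work.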
Gaussian blur with σ=2 is equivalent to solving the heat equation at what time t (α=1)?
Numerical Solution: FTCS Scheme
The **FTCS** (Forward Time, Centered Space) scheme approximates derivatives with finite differences. Time derivative: (uⁿ⁺¹ᵢ - uⁿᵢ)/Δt ≈ ∂u/∂t. Space: (uⁿᵢ₊₁ - 2uⁿᵢ + uⁿᵢ₋₁)/Δx² ≈ ∂²u/∂x². Explicit update: **uⁿ⁺¹ᵢ = uⁿᵢ + r·(uⁿᵢ₊₁ - 2uⁿᵢ + uⁿᵢ₋₁)**, where r = αΔt/Δx².
**FTCS stability condition:** r = αΔt/Δx² ≤ 1/2. If r > 1/2, errors are amplified at every step and the scheme blows up. For time-stepping without this restriction, use the implicit **Crank-Nicolson scheme**, which is unconditionally stable (covered in the finite difference lesson).
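A minimal NumPy sketch of the FTCS update on a rod with fixed (Dirichlet) end values, run once just below and once just above the stability limit. The function name `ftcs`, the grid, and the box-shaped initial profile are illustrative assumptions.

```python
import numpy as np

def ftcs(u0, alpha, dx, dt, n_steps):
    """Advance u_t = alpha * u_xx by n_steps FTCS updates.

    The two end values are never modified, which imposes fixed (Dirichlet) ends.
    """
    r = alpha * dt / dx**2
    u = np.array(u0, dtype=float)        # work on a copy
    for _ in range(n_steps):
        # The right-hand side is built entirely from old values before assignment.
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return u

alpha, dx = 0.01, 0.05
x = np.linspace(0.0, 1.0, 21)
box = np.where(np.abs(x - 0.5) < 0.12, 1.0, 0.0)   # hot segment in the middle

dt_max = dx**2 / (2.0 * alpha)                     # time step at which r = 1/2
u_stable = ftcs(box, alpha, dx, 0.8 * dt_max, 200)     # r = 0.4
u_unstable = ftcs(box, alpha, dx, 1.2 * dt_max, 200)   # r = 0.6

# Accuracy check against the exact decay of a single sine mode.
t_end = 50 * 0.8 * dt_max
u_sine = ftcs(np.sin(np.pi * x), alpha, dx, 0.8 * dt_max, 50)
u_exact = np.exp(-alpha * np.pi**2 * t_end) * np.sin(np.pi * x)

print("max |u|, r = 0.4:", np.max(np.abs(u_stable)))
print("max |u|, r = 0.6:", np.max(np.abs(u_unstable)))
print("sine-mode error:", np.max(np.abs(u_sine - u_exact)))
```

With r = 0.4 every new value is a weighted average of old values with non-negative weights, so the profile stays between 0 and 1. With r = 0.6 the weight on the center point turns negative and the sharp edges of the box grow into an oscillation of increasing amplitude.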
Grid: Δx = 0.1, α = 0.01. What is the maximum Δt for FTCS stability?
Key Ideas
- **∂u/∂t = α∇²u**: derived from Fourier's law and energy conservation.
- **Separation of variables:** u = X(x)T(t) leads to an eigenvalue problem. Solution: sine series with exponential decay.
- **Heat kernel G(x,t) = (4παt)^{-1/2}e^{-x²/(4αt)}**: a Gaussian with σ = √(2αt). Gaussian blur = convolution with the heat kernel.
- **FTCS scheme:** explicit, simple, conditionally stable for r = αΔt/Δx² ≤ 1/2.
Related Topics
The heat equation is the prototype of parabolic PDEs:
- Fourier Methods — Fourier series solution = expanding the initial condition in eigenfunctions
- The Wave Equation — Replacing the first time derivative with the second switches from diffusion to wave propagation
- Finite Difference Method — FTCS is the simplest finite difference scheme for parabolic PDEs
Questions for Reflection
- The heat kernel G(x,t) is a Gaussian with σ = √(2αt). How does this connect to the central limit theorem and random walks?
- Diffusion models add noise (forward process) and learn to remove it. Why is the forward process precisely the heat equation?
- FTCS blows up when r > 1/2. Explain physically why a time step that is too large causes instability.