Numerical Methods
Multigrid Methods
Цели урока
- Understand the multigrid acceleration principle: why iterative methods slowly damp smooth errors and how coarse-grid transfer solves this
- Master the V-cycle algorithm: restriction, coarse-grid smoothing, prolongation of correction
- Understand AMG for matrices without explicit geometry
Предварительные знания
- Direct Methods for Linear Systems
- Iterative Methods (Jacobi, Gauss-Seidel)
How to solve a PDE on a 10-million-node grid in seconds when direct methods are O(N^3) and classical iterative methods are O(N^2)?
- Ansys Fluent: CFD simulations on 10^6 nodes in seconds instead of hours
- GROMACS: molecular dynamics of proteins with PME electrostatics via FFT-multigrid
- NOAA/ECMWF: weather forecasting on 10^7-node grids - impossible in real time without AMG
- U-Net in Stable Diffusion: encoder-decoder architecture is V-cycle implemented as a neural network
Birth of the Multigrid Method
Achi Brandt (Weizmann Institute) published the seminal paper in 1977 that changed computational mathematics. Earlier ideas existed - Fedorenko (1961, USSR) and Bakhvalov (1966) - but Brandt provided the complete theory: smoothing analysis, compatibility criterion, and the FMG algorithm. Wolfgang Hackbusch (1978, Germany) independently developed the W-cycle theory.
The Multigrid Idea: Why Smooth Errors Are the Hard Part
Ansys Fluent solves the Poisson equation on a 10^6-node grid in 2 seconds. Without multigrid - 200 seconds. A factor of 100 from mathematics alone, not hardware.
The fundamental paradox of iterative methods: Gauss-Seidel and Jacobi brilliantly damp high-frequency (oscillatory) errors but catastrophically slowly eliminate low-frequency (smooth) errors. Smooth errors are the survivors. The multigrid insight is elegant: transfer the smooth error to a coarser grid, where it becomes oscillatory again - and dies quickly.
Key V-cycle operators: restriction I_h^{2h} (fine to coarse, weighted averaging), prolongation I_{2h}^h (coarse to fine, linear interpolation), smoother S_h (Jacobi or Gauss-Seidel, kills high-frequency error).
Total V-cycle work: O(N) + O(N/4) + O(N/16) + ... = O(N) * 4/3 = O(N). The geometric series delivers optimal complexity. This is exactly what NVIDIA cuSPARSE and PETSc use in production HPC.
Why does the multigrid method achieve O(N) complexity for an N-node problem?
Total work across levels: O(N)*(1 + 1/4 + 1/16 + ...) = O(N)*4/3 = O(N). The geometric series gives optimal complexity.
W-Cycle, FMG, and Convergence Analysis
The V-cycle is not the only option. The W-cycle makes two recursive calls per level instead of one, achieving better convergence for stiff problems at the cost of O(N log N) instead of O(N). Full Multigrid (FMG) uses the V-cycle on successively refined grids as a starting approximation - reaching discretization accuracy in a single V-cycle.
For 3D problems, the optimal smoother is red-black Gauss-Seidel: it admits parallelization and gives smoothing factor ~0.5, twice better than standard Jacobi.
What distinguishes the W-cycle from the V-cycle in terms of complexity and use case?
W-cycle: each level calls itself twice, total work O(N log N). Better for poorly conditioned problems. FMG combines V-cycle with hierarchical initialization, achieving discretization accuracy in O(N).
Algebraic Multigrid (AMG)
Classical multigrid requires geometry. AMG constructs the hierarchy automatically from the matrix entries. FEM on unstructured meshes, graph Laplacians, elasticity problems - AMG handles all of them.
AMG powers modern HPC solvers: Hypre (Lawrence Livermore), Trilinos ML (Sandia), ANSYS Mechanical. PyAMG provides a Python implementation for research and prototyping.
Why does AMG build the coarse grid from the matrix rather than geometry?
Geometric multigrid needs explicit structure (grids, interpolation). AMG builds the hierarchy from matrix entries: C/F splitting by connection strength, interpolation weights from a_ij. Works for arbitrary sparse systems.
Multigrid in ML: U-Net as a V-Cycle
The connection between multigrid methods and deep learning is not metaphorical. U-Net (Ronneberger et al., 2015), used in Stable Diffusion, medical segmentation, and satellite imagery, is structurally a V-cycle implemented as a neural network.
U-Net encoder: progressively coarsens the representation (restriction). U-Net decoder: progressively refines (prolongation). Skip connections: pass fine-scale details directly from encoder to decoder (analogous to high-frequency corrections in the V-cycle). The parallel was intentional - Brandt and Hackbusch's work inspired the U-Net authors.
Neural Multigrid: Learned Operators
Greenfeld et al. (NeurIPS 2019) replaced fixed restriction and prolongation operators with neural networks. The network learns optimal operators for a class of problems. For anisotropic elliptic equations, this gives 30-50% faster convergence than classical AMG.
Which neural network architecture is the direct analog of the multigrid V-cycle?
U-Net is structurally identical to the V-cycle: contracting path (restriction to coarser levels), expanding path (prolongation), skip connections (passing high-frequency details directly, like corrections in the V-cycle).
Connections to Other Topics
Multigrid methods sit at the intersection of linear algebra, functional analysis, and algorithms. The restriction operator connects to wavelet transforms (nm-22). AMG applies to graph problems (spectral clustering). U-Net in deep learning is a direct neural implementation of the V-cycle.
- Related Topics — Related topic
Итоги
- Iterative smoothers (Jacobi, Gauss-Seidel) efficiently damp high-frequency errors but not smooth ones; transferring to a coarser grid turns smooth errors into oscillatory ones
- V-cycle: smooth -> restrict -> coarse-grid recursion -> prolong correction -> smooth; complexity O(N)
- AMG builds the hierarchy automatically from matrix entries via C/F splitting by connection strength - no explicit geometry needed
- U-Net (Stable Diffusion, medical segmentation) is the neural implementation of the V-cycle with learned operators
Вопросы для размышления
- Why does a smooth error on the fine grid become oscillatory on the coarse grid? How does this relate to the Nyquist sampling theorem?
- AMG builds the prolongation operator from the matrix. What happens if AMG is applied to a dense matrix with global connections?
- How does the V-cycle manifest in U-Net - encoder, decoder, skip connections? What plays the role of the smoother?
Связанные уроки
- nm-22 — FFT is the basis of spectral multigrid smoothers
- nm-23 — spectral methods offer the next accuracy level
- nm-06 — iterative methods are the heart of the smoother
- nm-05 — direct methods solve the coarse-grid problem exactly
- la-01-vectors-intro