Functional Analysis
Introduction to Spectral Theory
Why does the conjugate gradient method converge in a finite number of steps? Why do kernel methods (kernel SVM) work? The answer is spectral theory: the structure of an operator's spectrum determines both the convergence speed of iterative solvers and the expressive power of kernel methods.
- **Iterative solvers**: convergence rate of Jacobi iterations = spectral radius of the iteration matrix; preconditioned CG minimizes iteration count by optimizing the spectrum of the preconditioned matrix
- **Kernel methods (SVM, GP)**: the Gram matrix generates a compact integral operator; Mercer's theorem guarantees k(x,y) = sum lambda_i phi_i(x) phi_i(y) via the operator's eigenfunctions
- **PageRank**: the stationary distribution is the eigenvector for lambda = 1; computed by power iterations in O(n) due to the spectral gap |lambda_1| - |lambda_2|
Предварительные знания
Spectrum and Resolvent
For a bounded linear operator T: X -> X on a Banach space X, define: **resolvent set** rho(T) = {lambda in C : (T - lambda I)^{-1} exists and is bounded}, **spectrum** sigma(T) = C minus rho(T). Unlike matrices, the spectrum in infinite dimensions splits into three disjoint parts.
**Three parts of the spectrum**: 1. **Point spectrum** sigma_p(T): lambda such that T - lambda I is not injective (eigenvalues, Tx = lambda x). 2. **Continuous spectrum** sigma_c(T): T - lambda I is injective, its range is dense, but (T - lambda I)^{-1} is unbounded. 3. **Residual spectrum** sigma_r(T): T - lambda I is injective but its range is not dense.