Algebra

Eigenvalues and Eigenvectors

PCA, PageRank, spectral clustering, stability analysis of control systems - all are eigenvalue problems. In neural networks, the spectral radius of the weight matrix determines whether gradients explode or vanish. Eigendecomposition is the X-ray of a linear operator.

**PCA:** eigenvectors of the covariance matrix are principal components; eigenvalues are the variance explained along each axis
**Google PageRank:** dominant eigenvector of the transition matrix = page ranks; power iteration is the actual algorithm
**Spectral clustering:** graph Laplacian L = D − W; the k smallest nonzero eigenvalues reveal k clusters in the graph

Предварительные знания

Linear Maps

The Eigenvalue Equation

A nonzero vector v is an eigenvector of matrix A if Av = λv for some scalar λ (the eigenvalue). Geometrically: an eigenvector is a direction that A only stretches or compresses (or flips), never rotates. λ is the stretch factor - positive means same direction, negative means flip.

**Geometric meaning:** eigenvectors are the 'axes' of the transformation. Any vector can be expanded in eigenvectors, and each component simply scales by its eigenvalue. This is why eigendecomposition is so powerful for understanding matrix behavior.

If Av = 2v, what is A²v?

Characteristic Polynomial

Av = λv is equivalent to (A−λI)v = 0. A nonzero solution exists if and only if det(A−λI) = 0. This is the characteristic equation. For an n×n matrix, det(A−λI) is a degree-n polynomial in λ - its roots are the eigenvalues.

**Cayley-Hamilton theorem:** every matrix satisfies its own characteristic polynomial p(A) = 0. Consequence: Aⁿ can be expressed as a linear combination of A^{n-1}, …, A, I. This enables efficient computation of matrix powers without direct exponentiation.

The characteristic polynomial of a 3×3 matrix has degree:

Eigendecomposition and Symmetric Matrices

If A has n linearly independent eigenvectors v₁,…,vₙ with eigenvalues λ₁,…,λₙ, then A = VΛV⁻¹, where V has eigenvectors as columns and Λ = diag(λ₁,…,λₙ). For symmetric matrices (A = Aᵀ): all eigenvalues are real and eigenvectors are orthogonal → A = VΛVᵀ (spectral theorem).

Use np.linalg.eigh for symmetric matrices - it's twice as fast and returns real eigenvalues in ascending order. Use np.linalg.eig for general (possibly non-symmetric) matrices.

For a symmetric matrix A = Aᵀ, eigenvalues are guaranteed to be:

Power Iteration and Applications

Power iteration: start with a random v₀, then v_{k+1} = Av_k / ||Av_k||. When |λ₁| > |λ₂|, this converges to the eigenvector of the largest-magnitude eigenvalue λ₁. Convergence rate: |λ₂/λ₁|^k - faster when eigenvalues are well separated.

Application	What we seek	Matrix
PCA	Directions of maximum variance	Covariance matrix
PageRank	Stationary distribution	Transition matrix
Markov chains	Equilibrium distribution	Stochastic matrix
Spectral clustering	Cluster structure in graph	Graph Laplacian D − W

Power iteration converges to the eigenvector corresponding to:

Key Ideas

**Av = λv:** eigenvector is an invariant direction; λ is the stretch factor; eigenvalues are roots of det(A−λI)=0
**Characteristic polynomial** has degree n for an n×n matrix; gives n eigenvalues in ℂ (counting multiplicity)
**A = VΛV⁻¹:** eigendecomposition; for symmetric A = VΛVᵀ with orthogonal V (spectral theorem)
**Power iteration** converges to the dominant eigenvector in O(n) per step; foundation of PageRank and RNN gradient analysis

Вопросы для размышления

In RNNs, if the spectral radius ρ(W) > 1, gradients explode; if < 1, they vanish. How does this follow directly from eigendecomposition and matrix powers?
PCA via eigendecomposition and via SVD give the same result. Why is SVD numerically preferred for large datasets?
The graph Laplacian has n eigenvalues with the smallest always 0. What does the multiplicity of λ=0 tell you about the graph's connectivity?

Связанные уроки

la-01-vectors-intro

The Eigenvalue Equation

If Av = 2v, what is A²v?

Characteristic Polynomial

The characteristic polynomial of a 3×3 matrix has degree:

Eigendecomposition and Symmetric Matrices

Use np.linalg.eigh for symmetric matrices - it's twice as fast and returns real eigenvalues in ascending order. Use np.linalg.eig for general (possibly non-symmetric) matrices.

For a symmetric matrix A = Aᵀ, eigenvalues are guaranteed to be:

Power Iteration and Applications

Application

What we seek

Matrix

PCA

Directions of maximum variance

Covariance matrix

PageRank

Stationary distribution

Transition matrix

Markov chains

Equilibrium distribution

Stochastic matrix

Spectral clustering

Cluster structure in graph

Graph Laplacian D − W

Power iteration converges to the eigenvector corresponding to:

Key Ideas

**Av = λv:** eigenvector is an invariant direction; λ is the stretch factor; eigenvalues are roots of det(A−λI)=0

**Characteristic polynomial** has degree n for an n×n matrix; gives n eigenvalues in ℂ (counting multiplicity)

**A = VΛV⁻¹:** eigendecomposition; for symmetric A = VΛVᵀ with orthogonal V (spectral theorem)

**Power iteration** converges to the dominant eigenvector in O(n) per step; foundation of PageRank and RNN gradient analysis

Eigenvalues and Eigenvectors

Предварительные знания

The Eigenvalue Equation

Characteristic Polynomial

Eigendecomposition and Symmetric Matrices

Power Iteration and Applications

Key Ideas

Related Topics

Вопросы для размышления

Связанные уроки

Eigenvalues and Eigenvectors

Предварительные знания

The Eigenvalue Equation

Characteristic Polynomial

Eigendecomposition and Symmetric Matrices

Power Iteration and Applications

Key Ideas

Related Topics

Вопросы для размышления

Связанные уроки