Functional Analysis
Free probability
Lesson goals
- Understand the operator probability space (A, phi) and the notion of free independence
- Master the R-transform and its linearity under free sums
- Know the Marchenko-Pastur law as the limiting spectral law of Wishart matrices and its role as the free Poisson law
- Understand free entropy chi and the semicircle law as the analog of the normal distribution
Prerequisites
- Von Neumann algebras and type II_1 factors
- K-theory of C*-algebras
- Random matrix theory basics
Why is the spectrum of a sum of two random matrices not just the sum of their spectra but the 'free convolution', and how does that explain 5G MIMO antenna capacity?
- 5G MIMO channels: with many antennas, the channel capacity is computed as an integral of log(1 + SNR * x) against the Marchenko-Pastur law, a direct application of free probability
- Deep network analysis (Martin-Mahoney 2019): the spectra of weight matrices in trained networks deviate from Marchenko-Pastur, a signature of 'effective learning'
- Quantum information: free probability describes the asymptotics of random quantum channels in high dimension
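- Finance: when the number of assets p is comparable to the number of observations N, the bulk of the eigenvalues of the sample covariance matrix of returns matches Marchenko-Pastur. Eigenvalues outside that bulk separate market signal from noise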
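The Marchenko-Pastur law used in the bullets above can be checked numerically. The sketch below assumes the common convention of a ratio gamma in (0, 1] and unit entry variance, with density sqrt((b - x)(x - a)) / (2 pi gamma x) on [a, b], where a = (1 - sqrt(gamma))^2 and b = (1 + sqrt(gamma))^2. The function names and the choice gamma = 0.5 are illustrative, not part of the lesson.

```python
import math

def mp_density(x, gamma):
    """Marchenko-Pastur density for ratio 0 < gamma <= 1 and unit entry variance."""
    lo = (1.0 - math.sqrt(gamma)) ** 2
    hi = (1.0 + math.sqrt(gamma)) ** 2
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * gamma * x)

def mp_moments(gamma, steps=100_000):
    """Midpoint-rule estimates of total mass, mean, and second moment."""
    lo = (1.0 - math.sqrt(gamma)) ** 2
    hi = (1.0 + math.sqrt(gamma)) ** 2
    h = (hi - lo) / steps
    mass = mean = second = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        f = mp_density(x, gamma) * h
        mass += f
        mean += x * f
        second += x * x * f
    return mass, mean, second

gamma = 0.5  # illustrative ratio p / N
mass, mean, second = mp_moments(gamma)
# Upper edge of the bulk: sample eigenvalues above it are candidates for signal.
noise_edge = (1.0 + math.sqrt(gamma)) ** 2
```

Under this convention the total mass and the mean should both come out close to 1 and the second moment close to 1 + gamma. In the finance reading, eigenvalues of a normalized sample covariance matrix that exceed noise_edge are the ones usually treated as carrying signal.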
Voiculescu: from factors to random matrices
Dan Voiculescu introduced free probability in 1985 while working on the isomorphism problem for the free group factors L(F_n). His insight: 'freeness' is the right analog of independence for non-commuting operators. The R-transform (1986) linearizes free addition. In 1991 he discovered that independent random matrices become asymptotically free as N -> infinity. Free entropy (1993 to 1998) is the noncommutative analog of Shannon's differential entropy. The Marchenko-Pastur law (1967, predating Voiculescu) turns out to be the free Poisson law, the limit in the free analog of the Poisson limit theorem, while the free CLT yields the semicircle law. In 2004 Voiculescu received the NAS Award in Mathematics for his work on free probability.
Free independence and operator probability spaces
1985. Dan Voiculescu is studying the isomorphism of von Neumann factors L(F_n) and L(F_m) of free groups. He discovers a notion of 'free' independence under which summing random N x N matrices, as N -> infinity, obeys 'free' analogs of the classical probability theorems.
Freeness in matrix approximations
Random matrices as free variables
Take two independent GUE matrices A_N and B_N of size N x N. As N -> infinity their empirical spectral measures converge to deterministic limits mu_A and mu_B. Voiculescu's key fact: A_N and B_N become 'asymptotically free'. The distribution of A_N + B_N tends to mu_A boxplus mu_B (the free convolution), computed through the R-transform.
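A small numerical experiment can make this concrete. For GUE matrices normalized so that E|a_ij|^2 = 1/N, both limits are semicircle laws of variance 1, and the free convolution of two such laws is a semicircle of variance 2, whose second and fourth moments are 2 and 8. The sketch below is an illustration under those assumptions: the helper names, the size n = 60, and the seed are arbitrary choices. It uses only the standard library to stay self-contained; a numerical linear algebra package would be the natural tool for larger sizes.

```python
import random

def gue(n, rng):
    """Sample an n x n GUE matrix normalized so that E|a_ij|^2 = 1/n."""
    a = [[0j] * n for _ in range(n)]
    s_diag = (1.0 / n) ** 0.5
    s_off = (0.5 / n) ** 0.5
    for i in range(n):
        a[i][i] = complex(rng.gauss(0.0, s_diag), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.gauss(0.0, s_off), rng.gauss(0.0, s_off))
            a[i][j] = z
            a[j][i] = z.conjugate()
    return a

def moments_2_and_4(m):
    """Return (tr(m^2)/n, tr(m^4)/n) for a Hermitian matrix m."""
    n = len(m)
    cols = list(zip(*m))
    # sq = m * m; it is Hermitian again, so tr(sq * sq) is the sum of |sq_ij|^2.
    sq = [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in m]
    m2 = sum(sq[i][i].real for i in range(n)) / n
    m4 = sum(abs(v) ** 2 for row in sq for v in row) / n
    return m2, m4

rng = random.Random(0)
n = 60
a, b = gue(n, rng), gue(n, rng)
s = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
m2, m4 = moments_2_and_4(s)
```

At this size m2 and m4 should land near 2 and 8 up to finite-N fluctuations. For two independent GUE matrices the sum is again Gaussian, so this particular check is consistent with asymptotic freeness rather than a proof of it.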
Free independence is the noncommutative analog of classical independence. Classical: phi(f(a) * g(b)) = phi(f(a)) * phi(g(b)). Free: alternating products of centered elements have zero expectation. Freeness is not a generalization of classical independence: if two commuting variables are free, one of them must have zero variance, that is, be essentially constant.
What is the fundamental difference between free independence and classical independence?
Free independence: phi(p_1(a) * q_1(b) * ...) = 0 when phi(p_i(a)) = phi(q_j(b)) = 0. It is a condition on alternating products, not factorization. The difference matters under non-commutativity.
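As a worked illustration of the definition (a sketch, not taken from the lesson itself), write a_0 = a - phi(a) and b_0 = b - phi(b) for the centered parts. Expanding the freeness conditions phi(a_0 b_0) = 0 and phi(a_0 b_0 a_0 b_0) = 0 gives the lowest mixed moments. The last line shows the classical value for comparison.

```latex
\begin{align*}
\varphi(ab) &= \varphi(a)\,\varphi(b), \\
\varphi(abab) &= \varphi(a^2)\,\varphi(b)^2 + \varphi(a)^2\,\varphi(b^2) - \varphi(a)^2\,\varphi(b)^2
  && \text{(free } a, b\text{)}, \\
\varphi(abab) &= \varphi(a^2 b^2) = \varphi(a^2)\,\varphi(b^2)
  && \text{(classically independent, commuting } a, b\text{)}.
\end{align*}
```

For centered a and b the free value of phi(abab) is 0 while the classical value is phi(a^2) phi(b^2), so the two notions already disagree at the fourth mixed moment.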
R-transform and free convolution
Voiculescu's R-transform is the noncommutative analog of the logarithm of the characteristic function. In classical probability: log phi(e^{i xi (X + Y)}) = log phi(e^{i xi X}) + log phi(e^{i xi Y}) for independent X, Y. For free variables: R_{a + b}(z) = R_a(z) + R_b(z). That makes the R-transform the main tool of free analysis.
What plays the role of the logarithm of the characteristic function in free probability?
R-transform: defined by G_a(R_a(z) + 1/z) = z, where G_a(z) = phi((z - a)^{-1}) is the Cauchy transform of a. Linearity: R_{a + b}(z) = R_a(z) + R_b(z) for free a, b. It is the analog of the log characteristic function and linearizes free addition.
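The defining relation can be checked on the one case where everything is explicit. For the semicircle law of variance v, the Cauchy transform is G(z) = (z - sqrt(z^2 - 4v)) / (2v) for real z > 2 sqrt(v), and the R-transform is R(w) = v w. The sketch below, with illustrative function names and test values, verifies G(R(w) + 1/w) = w and the additivity of R for two semicircle laws.

```python
import math

def cauchy_semicircle(z, var):
    """Cauchy transform of the semicircle law with variance var, for real z > 2*sqrt(var)."""
    return (z - math.sqrt(z * z - 4.0 * var)) / (2.0 * var)

def r_semicircle(w, var):
    """R-transform of the semicircle law with variance var: R(w) = var * w."""
    return var * w

def check_inverse(w, var):
    """Return G(R(w) + 1/w), which should reproduce w for 0 < w < 1/sqrt(var)."""
    return cauchy_semicircle(r_semicircle(w, var) + 1.0 / w, var)

w = 0.3
var_a, var_b = 1.0, 2.0
# Additivity: R_a + R_b equals the R-transform of a semicircle with variance var_a + var_b,
# so feeding it through that law's Cauchy transform should give back w.
r_sum = r_semicircle(w, var_a) + r_semicircle(w, var_b)
recovered = cauchy_semicircle(r_sum + 1.0 / w, var_a + var_b)
```

The additivity check says that the free convolution of two semicircle laws is again a semicircle law, with the variances added.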
Free entropy and random matrix applications
Voiculescu introduced free entropy chi(a_1, ..., a_n) between 1993 and 1998 as an analog of Shannon's differential entropy for non-commuting variables. Among distributions of fixed variance the semicircle law maximizes free entropy, just as the normal distribution maximizes differential entropy. Applications include MIMO channel capacity in wireless communication and spectral analysis of deep neural networks.
Spectral analysis of deep neural networks
Free probability in ML
Martin and Mahoney (2019): the spectral density of weight matrices in trained deep networks deviates from the Marchenko-Pastur law. Well-trained networks have 'heavy tails', a power-law density. They read this as a signature of effective compression: the trained weight matrix carries correlations that a random matrix does not. The Martin-Mahoney alpha_hat metric is built from the fitted exponent of that power-law tail, read against a random-matrix (free-probability) baseline.
Which distribution maximizes free entropy among distributions of fixed variance?
Free analogy: normal ~ semicircle. The semicircle law maximizes chi(a) under the constraint phi(a^2) = sigma^2. That is why GUE matrices are the 'most random' among self-adjoint matrices.
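For a single self-adjoint variable with distribution mu, free entropy has a closed form as a logarithmic energy. The formula below is quoted from the standard literature rather than derived in this lesson, and the comparison with the uniform law is an added illustration of the maximization claim.

```latex
\begin{align*}
\chi(\mu) &= \iint \log|s - t| \, d\mu(s)\, d\mu(t) + \frac{3}{4} + \frac{1}{2}\log 2\pi, \\
\chi(\mu_{\mathrm{sc}}) &= -\frac{1}{4} + \frac{3}{4} + \frac{1}{2}\log 2\pi = \frac{1}{2}\log(2\pi e) \approx 1.419
  && \text{(semicircle, variance 1)}, \\
\chi(\mu_{\mathrm{unif}}) &= \Bigl(\log(2\sqrt{3}) - \frac{3}{2}\Bigr) + \frac{3}{4} + \frac{1}{2}\log 2\pi \approx 1.411
  && \text{(uniform on } [-\sqrt{3}, \sqrt{3}]\text{, variance 1)}.
\end{align*}
```

The semicircle value coincides with the differential entropy (1/2) log(2 pi e) of a standard Gaussian, and the uniform law with the same variance falls slightly below it.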
Free CLT and the isomorphism of factors
Free central limit theorem: the normalized sum of free, identically distributed, centered variables with finite variance converges in distribution to the semicircle law. The parallel with the classical CLT is precise. A related structural question has been open for more than 50 years: are the von Neumann factors L(F_2) and L(F_3) of free groups isomorphic?
The L(F_2) ~ L(F_3) isomorphism problem has been open since 1967. Voiculescu's free probability offers a new approach through free entropy chi and free entropy dimension delta_0. Radulescu and, independently, Dykema (1994) showed that either L(F_n) ~ L(F_m) for all n, m >= 2, or the factors L(F_n) are pairwise non-isomorphic. Which side of this 'all or nothing' dichotomy holds remains unknown.
Which law is the limit of the normalized sum S_N = (a_1 + ... + a_N) / sqrt(N) of free, identically distributed, centered variables a_i?
Free CLT: for free, identically distributed, centered a_i with variance sigma^2, S_N = (a_1 + ... + a_N) / sqrt(N) -> semicircle law mu_sc with variance sigma^2 as N -> infinity. Classical analogy: normalized i.i.d. sums -> Gaussian. The semicircle is the 'Gaussian' of free probability.
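The free CLT can be watched at the level of R-transforms. As an illustrative case (chosen here, not taken from the lesson), let each a_i be a symmetric Bernoulli variable taking the values +1 and -1 with probability 1/2, whose R-transform is R_a(w) = (sqrt(1 + 4 w^2) - 1) / (2 w). Additivity over free summands together with the scaling rule R_{c a}(z) = c R_a(c z) gives R_{S_N}(z) = sqrt(N) R_a(z / sqrt(N)), which should approach R(z) = z, the R-transform of the semicircle law of variance 1.

```python
import math

def r_bernoulli(w):
    """R-transform of the symmetric Bernoulli law (+1 or -1 with probability 1/2).

    Equal to (sqrt(1 + 4 w^2) - 1) / (2 w), written in a cancellation-free form.
    """
    return 2.0 * w / (1.0 + math.sqrt(1.0 + 4.0 * w * w))

def r_normalized_sum(z, n):
    """R-transform of (a_1 + ... + a_n) / sqrt(n) for free symmetric Bernoulli a_i.

    Uses additivity over free summands and the scaling rule R_{c a}(z) = c * R_a(c z).
    """
    c = 1.0 / math.sqrt(n)
    return n * c * r_bernoulli(c * z)

z = 0.5  # illustrative evaluation point
# Distance to the semicircle R-transform R(z) = z for growing n.
gaps = [abs(r_normalized_sum(z, n) - z) for n in (1, 10, 100, 10_000)]
```

The entries of gaps shrink roughly like z^3 / n, which reflects the higher free cumulants of the summands being scaled away.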
Links to other areas
Free probability unites operator theory, combinatorics, and applied mathematics.
- C*-algebras and von Neumann algebras: free probability lives inside tracial von Neumann algebras
- Connes' noncommutative geometry: free probability is a noncommutative probability theory built in the same operator-algebraic spirit
- Functional analysis in ML: spectra of large random matrices in neural networks are modeled by free probability laws
Summary
- Operator probability space (A, phi): unital *-algebra plus a state phi (a trace in the von Neumann algebra setting). Freeness via alternating centered products
- R-transform: G_a(R_a(z) + 1/z) = z. Linearity R_{a + b} = R_a + R_b for free a, b
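- Marchenko-Pastur law: spectral limit of X X^T / N for a p x N matrix X with i.i.d. centered, unit-variance entries, as N, p -> infinity with p / N -> gamma. For gamma <= 1 the support is [(1 - sqrt(gamma))^2, (1 + sqrt(gamma))^2]; for gamma > 1 there is an additional atom at 0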
- Free CLT: S_N = (a_1 + ... + a_N) / sqrt(N) -> semicircle law mu_sc for free, identically distributed, centered a_i
- Free entropy chi: analog of Shannon's differential entropy. The semicircle law maximizes chi at fixed variance
Questions for reflection
- For non-commuting random variables factorization phi(a * b) = phi(a) * phi(b) is not the right analog of independence. What replaces it?
- How are asymptotic freeness of random matrices as N -> infinity and Voiculescu's theorem for von Neumann algebras connected?
- What is the meaning of the open problem L(F_2) ~ L(F_3), and why does free entropy fail to settle it once and for all?