Measure Theory

Product Measures and Fubini's Theorem

How is an expectation over a joint distribution computed? Why does marginalization in Bayesian models work the way it does? Fubini's theorem is the mathematical justification for integrating one variable at a time. Monte Carlo integration is its numerical counterpart.

**Joint distributions:** independence of random variables means the joint distribution is a product measure; P_{(X,Y)} = P_X × P_Y
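**Bayesian marginalization:** p(y|x) = ∫ p(y|θ,x) p(θ) dθ integrates the joint density of (y, θ) over θ; it is one step of an iterated integral over a product measure, justified by Fubini's theorem (Tonelli's, in fact, since densities are non-negative)
**Monte Carlo integration:** the numerical counterpart of Fubini; for square-integrable f the error decays at rate O(1/√N) regardless of dimension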

Prerequisites

Duality and the Riesz Representation Theorem

Product Sigma-Algebras and Measures

How do we rigorously define integration over multiple variables? The answer is to build a measure on the product of spaces. This is precisely how joint distributions in probability theory are formalized.

**Product sigma-algebra:** for measurable spaces (X, F) and (Y, G), the **product sigma-algebra** F ⊗ G is the smallest sigma-algebra on X×Y containing all 'rectangles' A×B, where A ∈ F and B ∈ G.

**Product measure:** for sigma-finite measures μ on (X,F) and ν on (Y,G), there is a unique measure μ×ν on (X×Y, F⊗G) such that: (μ×ν)(A×B) = μ(A) · ν(B)
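On finite spaces the rectangle identity can be checked directly. The following is a minimal sketch, assuming two small hypothetical spaces X = {0, 1, 2} and Y = {0, 1} with made-up point masses; the helper names `measure` and `product_measure_of_rectangle` are illustrative only.

```python
import numpy as np

# Hypothetical finite spaces X = {0, 1, 2} and Y = {0, 1}; every subset is measurable.
mu = np.array([0.5, 1.5, 2.0])   # mu({x}) for each x in X (need not sum to 1)
nu = np.array([0.25, 0.75])      # nu({y}) for each y in Y

# Point masses of the product measure: (mu x nu)({(x, y)}) = mu({x}) * nu({y}).
product = np.outer(mu, nu)

def measure(weights, subset):
    """Measure of a subset, given point masses and the indices of its elements."""
    return float(np.sum(weights[list(subset)]))

def product_measure_of_rectangle(A, B):
    """(mu x nu)(A x B), summed point by point over the rectangle."""
    return float(sum(product[x, y] for x in A for y in B))

A, B = {0, 2}, {1}
lhs = product_measure_of_rectangle(A, B)    # sum over the points of A x B
rhs = measure(mu, A) * measure(nu, B)       # mu(A) * nu(B)
print(lhs, rhs)
```

Both numbers agree because the double sum over A×B factors into a product of two single sums, which is the finite version of the defining property.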

**Independence = product measure:** random variables X and Y are independent if and only if their joint distribution P_{(X,Y)} equals the product measure P_X × P_Y. This is the fundamental definition, requiring no assumption about the existence of densities.
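For discrete random variables, this criterion reduces to comparing the joint probability table with the outer product of its marginals. A small sketch under that assumption; the function `is_independent` and the two example tables are hypothetical.

```python
import numpy as np

def is_independent(joint, tol=1e-12):
    """Check whether a joint pmf (2-D array) equals the product of its marginals."""
    p_x = joint.sum(axis=1)   # marginal of X: sum out y
    p_y = joint.sum(axis=0)   # marginal of Y: sum out x
    return bool(np.allclose(joint, np.outer(p_x, p_y), rtol=0.0, atol=tol))

# Two fair coins tossed independently: joint = product of marginals.
independent_joint = np.array([[0.25, 0.25],
                              [0.25, 0.25]])

# Perfectly correlated coins (Y = X): same marginals, but a different joint.
dependent_joint = np.array([[0.5, 0.0],
                            [0.0, 0.5]])

print(is_independent(independent_joint), is_independent(dependent_joint))
```

The two tables have identical marginals, so the marginals alone cannot detect dependence; only the comparison with the product measure can.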

The Borel sigma-algebra on ℝ² equals B(ℝ) ⊗ B(ℝ), the product of the one-dimensional Borel sigma-algebras. This fundamental fact guarantees that two-dimensional Lebesgue measure, restricted to Borel sets, is the product measure λ × λ.

Random variables X and Y are independent. What does this mean in terms of product measures?

Fubini's Theorem and Tonelli's Theorem

Can the order of integration be exchanged in a double integral? For the Riemann integral this was a delicate question. Fubini's and Tonelli's theorems give precise conditions for the Lebesgue integral.

**Tonelli's theorem (non-negative case):** if μ and ν are sigma-finite and f ≥ 0 is measurable on (X×Y, F⊗G), then:

∫_{X×Y} f d(μ×ν) = ∫_X (∫_Y f(x,y) dν(y)) dμ(x) = ∫_Y (∫_X f(x,y) dμ(x)) dν(y)

The order of integration can be exchanged freely, with no integrability condition: all three quantities may equal +∞, but they always agree.

**Fubini's theorem:** if f ∈ L¹(μ×ν), that is ∫ |f| d(μ×ν) < ∞, the same equality holds for sign-changing f.
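A classic use of the exchange is to evaluate an integral that is hard in one order and easy in the other. The sketch below assumes the non-negative function x^y on [0,1]×[a,b] with a = 1 and b = 2 (an illustrative choice), so Tonelli applies; `midpoint_rule` is a hypothetical helper, not a library function.

```python
import numpy as np

def midpoint_rule(g, a, b, n=100_000):
    """Composite midpoint rule; never evaluates g at the endpoints a or b."""
    h = (b - a) / n
    nodes = a + h * (np.arange(n) + 0.5)
    return float(h * np.sum(g(nodes)))

a, b = 1.0, 2.0

# Order 1: integrate over y first. The inner integral of x**y over [a, b] is
# (x**b - x**a) / ln(x), which has no elementary antiderivative in x,
# so the outer integral over x in (0, 1) is done numerically.
hard_side = midpoint_rule(lambda x: (x**b - x**a) / np.log(x), 0.0, 1.0)

# Order 2: integrate over x first. The inner integral of x**y over [0, 1] is
# 1 / (y + 1), so the outer integral over y is ln((b + 1) / (a + 1)) in closed form.
easy_side = float(np.log((b + 1.0) / (a + 1.0)))

print(hard_side, easy_side)
```

The numerical value of the hard order matches the closed form from the easy order, as the theorem predicts.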

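**Marginalization as iterated integration:** in Bayesian statistics, the marginal likelihood p(y) = ∫ p(y|θ) p(θ) dθ integrates the joint density p(y|θ) p(θ) over θ for fixed y. The joint density lives on the product of the y-space and the θ-space, so any further integral over y (a normalization check, or an expectation of a function of y) is an iterated integral over a product measure. Because the joint density is non-negative, Tonelli's theorem guarantees that integrating over θ first gives the same result as integrating over y first; for a sign-changing integrand, Fubini's theorem gives the same conclusion under the L¹ condition.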
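The marginalization integral can be checked numerically on a grid. This sketch assumes a conjugate Gaussian model chosen for illustration (θ ~ N(0, 1), y | θ ~ N(θ, 1)), for which the exact marginal is N(0, 2); the grid limits and the helper `normal_pdf` are assumptions of the example.

```python
import numpy as np

def normal_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z (broadcasts over arrays)."""
    return np.exp(-(z - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

theta = np.linspace(-12.0, 12.0, 1201)   # grid for theta, step 0.02
y = np.linspace(-12.0, 12.0, 1201)       # grid for y, step 0.02
d_theta = theta[1] - theta[0]
d_y = y[1] - y[0]

# Joint density p(y | theta) * p(theta): rows index y, columns index theta.
joint = normal_pdf(y[:, None], theta[None, :], 1.0) * normal_pdf(theta[None, :], 0.0, 1.0)

# Inner integral over theta for each fixed y (Riemann sum along the theta axis).
marginal_y = joint.sum(axis=1) * d_theta
exact = normal_pdf(y, 0.0, 2.0)

# Outer integral over y: the marginal should integrate to 1.
total = float(marginal_y.sum() * d_y)
print(np.max(np.abs(marginal_y - exact)), total)
```

The grid marginal agrees with the closed form, and integrating it over y returns total mass 1, which is the iterated integral of the joint density.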

**Monte Carlo integration** is the numerical version of Fubini's theorem: ∫∫ f d(μ×ν) ≈ (1/N) Σ f(xᵢ, yᵢ) with (xᵢ,yᵢ) drawn independently from μ×ν. The law of large numbers makes this estimate consistent when f ∈ L¹(μ×ν), and Fubini guarantees that the limit equals either iterated integral, so it does not matter in which order the variables are sampled.

Under what condition does Fubini's theorem allow the order of integration to be exchanged?

When Fubini Fails: A Counterexample

What happens when the L¹ condition is violated? The classic counterexample shows two iterated integrals that give different values. This is not a contradiction; it simply means the function is not L¹-integrable over the product space.

**Fubini counterexample:** define on [0,1]×[0,1]:

f(x,y) = (x² − y²) / (x² + y²)²

Then:
- ∫₀¹ (∫₀¹ f(x,y) dy) dx = π/4
- ∫₀¹ (∫₀¹ f(x,y) dx) dy = −π/4

The two iterated integrals give **different values**! The reason: f ∉ L¹([0,1]²), that is ∫∫ |f| d(λ×λ) = ∞.
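Both values can be reproduced with a short computation. Since ∂/∂y [y / (x² + y²)] = f(x,y), the inner integral over y equals 1/(1 + x²), and the antisymmetry f(x,y) = −f(y,x) gives −1/(1 + y²) for the other order. In polar coordinates f = cos(2θ)/r², so ∫∫ |f| contains the divergent factor ∫ dr/r near the origin. The sketch below uses these closed forms; `midpoint_rule` is a hypothetical helper.

```python
import numpy as np

def f(x, y):
    return (x**2 - y**2) / (x**2 + y**2) ** 2

def midpoint_rule(g, a, b, n):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    nodes = a + h * (np.arange(n) + 0.5)
    return float(h * np.sum(g(nodes)))

# Closed-form inner integrals, from d/dy [ y / (x^2 + y^2) ] = f(x, y)
# and the antisymmetry f(x, y) = -f(y, x).
def inner_dy(x):
    return 1.0 / (1.0 + x**2)

def inner_dx(y):
    return -1.0 / (1.0 + y**2)

# Sanity check of the closed form at a point away from the singularity at the origin.
x0 = 0.5
numeric_inner = midpoint_rule(lambda y: f(x0, y), 0.0, 1.0, 20_000)

dy_first = midpoint_rule(inner_dy, 0.0, 1.0, 2_000)   # integrate over y, then x
dx_first = midpoint_rule(inner_dx, 0.0, 1.0, 2_000)   # integrate over x, then y

print(numeric_inner, inner_dy(x0))
print(dy_first, dx_first, np.pi / 4)
```

The two orders differ by π/2, which is only possible because the L¹ hypothesis of Fubini's theorem fails.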

**Practical lesson for ML:** when computing E_{(x,y)~P}[f(x,y)] via iterated integrals, first over x, then over y, verify that E[|f(X,Y)|] < ∞. If violated (e.g., heavy-tailed distributions), different integration orders can give different answers.
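The same integrability check matters for sample averages. A minimal sketch, assuming standard Cauchy draws as the heavy-tailed case (chosen here for illustration, since E[|X|] = ∞ for the Cauchy distribution) and standard normal draws as the well-behaved case; the replication counts and the helper name are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_replications, n_samples = 1000, 1000

def iqr_of_sample_means(draws):
    """Interquartile range of the per-replication sample means."""
    means = draws.mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return float(q75 - q25)

# Finite mean and variance: sample means concentrate as n_samples grows.
normal_iqr = iqr_of_sample_means(rng.standard_normal((n_replications, n_samples)))

# E|X| is infinite: the mean of n Cauchy draws is again standard Cauchy,
# so averaging does not reduce the spread at all.
cauchy_iqr = iqr_of_sample_means(rng.standard_cauchy((n_replications, n_samples)))

print(normal_iqr, cauchy_iqr)
```

The spread of the normal sample means shrinks with the sample size, while the spread of the Cauchy sample means stays near the interquartile range of a single draw.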

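In deep learning, the same failure can appear when computing gradients of expected losses with non-integrable tails. Exchanging expectation and differentiation, ∇_θ E[L(θ,X)] = E[∇_θ L(θ,X)], is justified by dominated convergence: it needs E[|L(θ,X)|] < ∞ and, in addition, an integrable bound on |∇_θ L(θ,X)| that holds uniformly near θ. Verify both before making the exchange.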

If ∫₀¹(∫₀¹ f dy)dx ≠ ∫₀¹(∫₀¹ f dx)dy, what does this imply?

Monte Carlo as Numerical Fubini

Monte Carlo integration is the numerical realization of Fubini's theorem. An integral over a product measure is estimated as a sample average. Measure theory explains why this works and what its accuracy is.

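**Monte Carlo for a double integral:** by the strong law of large numbers and Fubini's theorem:

E_{(x,y)~μ×ν}[f(x,y)] = ∫∫ f d(μ×ν) ≈ (1/N) Σᵢ f(xᵢ, yᵢ)

where (xᵢ, yᵢ) ~ μ×ν are independent samples from the product measure (μ and ν are probability measures here). If f also has finite variance σ², the root-mean-square error is σ/√N: the O(1/√N) rate is independent of dimension, although σ itself can grow with dimension.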
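The estimator and its error can be sketched in a few lines. This example assumes both coordinates are uniform on [0, 1] and uses f(x, y) = x·y, whose exact integral is 1/4; the function `mc_estimate` and its sampler arguments are hypothetical names.

```python
import numpy as np

def mc_estimate(f, sampler_x, sampler_y, n, rng):
    """Estimate E[f(X, Y)] for independent X, Y; also return the estimated standard error."""
    x = sampler_x(rng, n)
    y = sampler_y(rng, n)       # drawn independently of x, so (x, y) ~ mu x nu
    values = f(x, y)
    return float(values.mean()), float(values.std(ddof=1) / np.sqrt(n))

rng = np.random.default_rng(42)
uniform = lambda rng, n: rng.random(n)

# Target: the integral of x * y over the unit square, which equals 1/4 exactly.
for n in (100, 10_000, 1_000_000):
    estimate, std_err = mc_estimate(lambda x, y: x * y, uniform, uniform, n, rng)
    print(f"N={n:>9,d}  estimate={estimate:.5f}  std_err={std_err:.5f}")
```

Each hundredfold increase in N should shrink the reported standard error by roughly a factor of ten, which is the 1/√N rate.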

Quasi-Monte Carlo (QMC) replaces random points with low-discrepancy sequences (Sobol, Halton). For integrands of bounded variation (in the sense of Hardy and Krause) on the d-dimensional unit cube, the error converges at O((log N)^d / N) rather than O(1/√N), a significant gain in moderate dimensions; for large d the (log N)^d factor erodes the advantage.
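A low-discrepancy sequence is simple to build by hand in two dimensions. The sketch below implements the Halton sequence with bases 2 and 3 in plain Python rather than calling a library generator, and compares it with pseudo-random points on the same test integrand x·y; the function names and the choice N = 4096 are assumptions of the example.

```python
import numpy as np

def van_der_corput(index, base):
    """Radical inverse of a positive integer in the given base."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value

def halton_2d(n):
    """First n points of the 2-D Halton sequence (bases 2 and 3), starting at index 1."""
    return np.array([[van_der_corput(i, 2), van_der_corput(i, 3)]
                     for i in range(1, n + 1)])

n = 4096
f = lambda p: p[:, 0] * p[:, 1]   # integral over the unit square is 1/4

qmc_error = abs(float(f(halton_2d(n)).mean()) - 0.25)
mc_error = abs(float(f(np.random.default_rng(0).random((n, 2))).mean()) - 0.25)

print(qmc_error, mc_error)
```

For a smooth integrand like this one, the Halton error is typically well below the pseudo-random error at the same N, though a single random draw can occasionally land close by chance.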

What is the main advantage of Monte Carlo for high-dimensional integration?

Key Ideas

**Product measure μ×ν** (for sigma-finite μ, ν) is the unique measure on (X×Y, F⊗G) with (μ×ν)(A×B) = μ(A)·ν(B); independence means joint = product
**Tonelli:** for f ≥ 0, the order of integration is freely interchangeable; **Fubini:** for f ∈ L¹(μ×ν), the same holds for sign-changing f
**Counterexample:** f(x,y) = (x²−y²)/(x²+y²)² gives different iterated integrals because f ∉ L¹
**Monte Carlo:** E_{μ×ν}[f] ≈ (1/N) Σ f(xᵢ,yᵢ) with O(1/√N) error for finite-variance f, at a rate that does not depend on dimension

Questions for Reflection

Why doesn't Tonelli's theorem require an L¹ condition? When a non-negative function has an infinite integral, can the iterated integrals still differ?
Monte Carlo requires f ∈ L¹ in theory. How does this constraint show up in practice when computing expectations with heavy-tailed distributions?
In variational autoencoders, the ELBO is E_{q(z)}[log p(x|z)] minus KL(q(z)||p(z)). Where does the product measure structure appear in this formula?

Related Lessons

calc-17-multivariable

