Probability Theory

Expected Value

Lesson objectives

  • Understand the meaning of expected value as "long-run average"
  • Learn to compute E[X] for discrete and continuous random variables
  • Master the key property: linearity E[X + Y] = E[X] + E[Y]
  • Apply expected value to decision-making
  • Understand the limitations - when E[X] "lies"

Prerequisites

  • Random variables - discrete and continuous
  • Basic integrals (for continuous random variables)
  • Random Variables
  • Sequences: How Infinity Converges

A casino offers a game: roll a die and receive $10 for each pip. What is the fair price of participation? Intuition says - find the "average". Expected value is exactly that average, computed with mathematical rigor.

  • 🎰 **Casinos:** Why the house always wins (negative E[X] for the player)
  • 📈 **Investing:** Expected return of a portfolio
  • 🏥 **Insurance:** Calculating premiums and payouts
  • 🎮 **Gamedev:** Loot balance, drop rates, average damage
  • 🤖 **ML:** Loss function = E[error]
  • 📊 **A/B tests:** Average effect of a change

The fair price of a gamble

It all started with a correspondence between two geniuses - Blaise Pascal and Pierre de Fermat in 1654. They were interested in: **what is the "fair" price to enter a gamble?** The answer is the expected value of the winnings. If E[winnings] > entry price - it's worth playing. If less - it's not. But in 1738, Daniel Bernoulli showed with the famous St. Petersburg Paradox that E[X] alone is not enough. This led to the creation of **utility theory** and modern economics.

**1738, St. Petersburg.** Swiss mathematician Daniel Bernoulli publishes his analysis of a problem, first posed by his cousin Nicolaus Bernoulli in 1713, that has kept economists and philosophers awake for nearly 300 years.

**The game:** Flip a coin until tails appears. If tails comes up on the $n$-th flip - the prize is $2^n$ dollars. What is the **fair price** of participation?

First tails? $2. Second? $4. Third? $8. Tenth? $1,024. Twentieth? **Over a million.**

Let's compute the **expected winnings**... and get **infinity**. Mathematics says: the fair price is any amount, even a billion! But is it reasonable to pay even $100 for this game?

Today we'll explore **expected value** - a tool that usually works brilliantly... but sometimes breaks intuition.

The expectation $\mathbb{E}[X]$ of a discrete random variable that takes values $x_i$ with probabilities $p_i$ is:

$$\mathbb{E}[X] = \sum_i x_i \, p_i$$

$\mathbb{E}[X]$ is a weighted sum, with weights being probabilities. It is the 'long-run average' over many repetitions. For continuous variables with density $f(x)$: $\mathbb{E}[X]=\int x f(x)\,dx$.

📐 What is expected value?

**Expected value** (mean, expectation) is the **weighted average** of all values of a random variable, where the weights are the probabilities.

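**For a discrete random variable:**

$$E[X] = \sum_i x_i \cdot p_i, \quad p_i = P(X = x_i)$$

**For a continuous random variable with density $f(x)$:**

$$E[X] = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$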

Consider the probability distribution as mass spread along the number line. $E[X]$ is the point where this mass **balances**. The center of gravity. Running many, many experiments and averaging the results yields a number close to $E[X]$. This is the **law of large numbers**.

🎲 Average number of pips on a die

Classic example

$X$ - number of pips on a fair die. $$E[X] = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6}$$ $$= \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5$$ **Note:** 3.5 is **not** a possible die value! But rolling 1,000 times and averaging yields a number near 3.5.
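The weighted sum above can be checked with a few lines of Python. This is a minimal sketch: the helper name `expected_value` and the dictionary representation of a distribution are illustrative choices, not part of the lesson. Exact fractions are used so that no rounding enters the result. The same helper also answers the opening casino question, because paying $10 per pip simply scales the mean by 10.

```python
from fractions import Fraction

def expected_value(dist):
    """Weighted sum of values, for a dict mapping value -> probability."""
    return sum(value * prob for value, prob in dist.items())

# Fair six-sided die: every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mean_pips = expected_value(die)   # exact mean number of pips
fair_price = 10 * mean_pips       # $10 per pip, using E[aX] = a * E[X]

print(mean_pips, fair_price)
```

Representing the distribution as a dictionary keeps the code close to the formula: each term of the sum is one `value * prob` pair.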

A lottery: with probability 0.001 the prize is $1,000, otherwise $0. What is the "fair" price of a ticket?

$E[\text{winnings}] = 1000 \cdot 0.001 + 0 \cdot 0.999 = 1$ **Fair price = $1.** If a ticket costs less, buying it has a positive expected value. If it costs more, the lottery profits on average. Real lotteries sell tickets above E[X] - otherwise how would they make money?

✨ Properties of expected value

$$E[aX + b] = a \cdot E[X] + b$$ $$E[X + Y] = E[X] + E[Y]$$ **The second one holds ALWAYS** (whenever both expectations exist) - even for dependent random variables! This is incredibly powerful: E[X + Y] can be computed without knowing the joint distribution.

🎲🎲 Sum of two dice

Linearity in action

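$X_1, X_2$ - pips on the first and second die. $E[X_1] = E[X_2] = 3.5$ $$E[X_1 + X_2] = E[X_1] + E[X_2] = 3.5 + 3.5 = 7$$ **Average sum = 7.** For two dice, 7 also happens to be the most likely sum (6 of the 36 combinations), which is why it is the most "popular" number in board games and craps. That is a separate fact about this symmetric distribution: in general the mean need not be the most likely value. Note: we didn't enumerate all 36 combinations to find the mean - linearity did it for us.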
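To see what linearity saved us from, here is a brute-force sketch that does enumerate all 36 equally likely outcomes. The variable names are illustrative. It computes the mean of the sum directly and, separately, counts which sum occurs most often.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of two fair dice, each with probability 1/36.
outcomes = list(product(range(1, 7), repeat=2))
prob = Fraction(1, len(outcomes))

# Mean of the sum, computed the long way.
mean_sum = sum((a + b) * prob for a, b in outcomes)

# How many outcomes produce each sum; the largest count marks the most likely sum.
ways = Counter(a + b for a, b in outcomes)
most_likely_sum, n_ways = ways.most_common(1)[0]

print(mean_sum, most_likely_sum, n_ways)
```

The enumeration agrees with $E[X_1] + E[X_2]$, but it needs the full joint distribution, while linearity only needs the two individual means.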

🚌 Uniform distribution

$X \sim U(a, b)$

A bus arrives uniformly at random on the interval $[a, b]$. $$E[X] = \int_a^b x \cdot \frac{1}{b-a}\,dx = \frac{1}{b-a} \cdot \frac{x^2}{2}\Big|_a^b$$ $$= \frac{b^2 - a^2}{2(b-a)} = \frac{(b-a)(b+a)}{2(b-a)} = \frac{a+b}{2}$$ **The midpoint of the interval!** Intuitively clear: all points are equally likely, so the average is in the middle.
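The midpoint formula can also be checked by simulation. The sketch below is only an illustration: the 20-minute window, the sample size, and the function name `simulated_uniform_mean` are assumptions chosen for the example. It averages many draws from $U(a, b)$ and compares the result with $(a+b)/2$.

```python
import random

def simulated_uniform_mean(a, b, n_samples=200_000, seed=0):
    """Average of n_samples draws from U(a, b); an estimate of E[X]."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(rng.uniform(a, b) for _ in range(n_samples))
    return total / n_samples

a, b = 0.0, 20.0            # hypothetical arrival window, in minutes
estimate = simulated_uniform_mean(a, b)
exact = (a + b) / 2         # the formula derived above

print(estimate, exact)
```

The estimate is not exactly the midpoint, but with this many samples it should land very close to it.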

X and Y are two random variables (possibly dependent). E[X] = 5, E[Y] = 3. What is E[2X - Y + 10]?

By linearity: $$E[2X - Y + 10] = 2 \cdot E[X] - E[Y] + 10 = 2 \cdot 5 - 3 + 10 = 17$$ **Linearity holds ALWAYS** - regardless of whether X and Y are dependent. The product rule $E[X \cdot Y] = E[X] \cdot E[Y]$ has no such guarantee: it holds for independent variables but can fail for dependent ones!
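A concrete dependent pair makes the point tangible. In the sketch below, the distribution is a made-up example chosen only to match $E[X] = 5$ and $E[Y] = 3$: $X$ is uniform on $\{4, 5, 6\}$ and $Y = X - 2$, so $Y$ is completely determined by $X$. The expectation is computed both directly and through linearity.

```python
from fractions import Fraction

# Hypothetical example: X uniform on {4, 5, 6}, and Y = X - 2 (fully dependent on X).
x_dist = {4: Fraction(1, 3), 5: Fraction(1, 3), 6: Fraction(1, 3)}

def y_of(x):
    return x - 2

e_x = sum(x * p for x, p in x_dist.items())          # E[X]
e_y = sum(y_of(x) * p for x, p in x_dist.items())    # E[Y]

# Direct computation: average 2X - Y + 10 over the distribution of X.
direct = sum((2 * x - y_of(x) + 10) * p for x, p in x_dist.items())

# Linearity: only the two means are needed.
via_linearity = 2 * e_x - e_y + 10

print(e_x, e_y, direct, via_linearity)
```

Both routes give the same number even though $X$ and $Y$ are as dependent as two variables can be.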

🎰 Application: why the house always wins

🎡 American roulette

Betting on red

Bet $1 on red.

  • Red (18/38): win $1
  • Not red (20/38): lose $1

$$E[\text{profit}] = 1 \cdot \frac{18}{38} + (-1) \cdot \frac{20}{38} = \frac{18-20}{38} = -\frac{2}{38}$$ $$\approx -5.26\text{ cents per dollar}$$

**The casino takes ~5.3%** of every bet on average. With long enough play, a net loss becomes **almost certain**. Over a million $1 bets, the casino earns about $52,600 on average.
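The roulette arithmetic can be reproduced exactly. This is a small sketch with illustrative variable names, using fractions so that the house edge stays exact until the final conversion to dollars.

```python
from fractions import Fraction

p_red = Fraction(18, 38)       # 18 red pockets out of 38
p_not_red = 1 - p_red          # 18 black + 2 green

# Expected profit of a $1 bet on red: +1 with p_red, -1 otherwise.
expected_profit = 1 * p_red + (-1) * p_not_red

# Expected casino earnings over one million $1 bets.
n_bets = 1_000_000
expected_house_take = -expected_profit * n_bets

print(expected_profit, float(expected_profit), float(expected_house_take))
```

The expected take scales linearly with the number of bets, which is another use of $E[X + Y] = E[X] + E[Y]$.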

$E[X] = 3.5$ for a die doesn't mean 3.5 will come up. $E[X]$ is the **average over many** repetitions. For a single game the result can be anything! **Law of large numbers:** the more repetitions, the closer the average gets to $E[X]$.
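The law of large numbers can be watched in action with a short simulation. The sketch below is illustrative: the function name `average_of_rolls`, the seed, and the sample sizes are arbitrary choices. It rolls a simulated fair die and reports the average for increasingly long runs.

```python
import random

def average_of_rolls(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls

for n in (10, 1_000, 100_000):
    print(n, average_of_rolls(n))
```

Short runs can land noticeably away from 3.5, while long runs should settle close to it.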

🛡️ Insurance: when "unfavorable" = "rational"

E[X] is not the only criterion

Phone insurance: $100/year. Probability of damage: 5%. Repair: $500.

$E[\text{cost without insurance}] = 500 \cdot 0.05 = 25$

$E[\text{cost with insurance}] = 100$ (the premium)

**By expected cost alone**, insurance is unfavorable ($100 > $25). **But!** Insurance eliminates **risk**. For many people, a certain -$100 is better than a 5% chance of -$500. More on this in the lesson on variance and utility theory.

A game: flip a coin. Heads pays $10, tails costs $6. What is E[profit]?

$E[\text{profit}] = 10 \cdot 0.5 + (-6) \cdot 0.5 = 5 - 3 = 2$ **Expected profit = $2 > 0.** Playing is **profitable** (in the long run). Over 1,000 plays, the expected total win is about $2,000, an average of $2 per play.

🃏 Resolving the St. Petersburg Paradox

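Back to the game from the beginning. Tails first appears on flip $n$ with probability $(1/2)^n$, and the prize is then $2^n$.

$$E[X] = \sum_{n=1}^{\infty} 2^n \cdot \frac{1}{2^n} = \sum_{n=1}^{\infty} 1 = \infty$$

**Expected winnings = ∞!** Yet hardly anyone would pay even $100 for this game. Why?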
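One way to see where the infinity comes from is to add up only the first few outcomes. In the sketch below (the function name `truncated_expectation` is an illustrative choice), each outcome "first tails on flip $n$" pays $2^n$ with probability $(1/2)^n$, so every term of the sum equals exactly 1 and the partial sums grow without bound.

```python
from fractions import Fraction

def truncated_expectation(max_flips):
    """Sum of payout * probability over outcomes n = 1..max_flips."""
    return sum(
        Fraction(2**n) * Fraction(1, 2**n)   # payout 2^n, probability (1/2)^n
        for n in range(1, max_flips + 1)
    )

for max_flips in (10, 20, 100):
    print(max_flips, truncated_expectation(max_flips))
```

The partial sum over the first $N$ outcomes is exactly $N$, so no finite number can serve as the limit.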

**1. Finite casino bankroll.** To pay $2^{30}$ (over a billion), the casino would need a bankroll of billions. In practice, no casino has unlimited funds.

**2. Diminishing utility.** $1 million to a poor person is life-changing. Another $1 million to a billionaire is pocket change. The utility of money grows more slowly than the amount.

**3. Finite lifespan.** To "profit" from this game, millions of plays would be required. Human lives are finite.

**Conclusion:** $E[X]$ is a powerful tool, but not the only criterion. **Variance**, **utility**, and **real-world constraints** also matter.

If the game is capped (the casino pays at most $1,000,000), what is E[X]?

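With a $1M cap, the prize $2^n$ stays below the cap for $n \le 19$ (since $2^{19} = 524{,}288$), and every outcome with $n \ge 20$ pays the capped $1,000,000. The uncapped outcomes contribute $\sum_{n=1}^{19} 2^n \cdot \frac{1}{2^n} = 19$. The capped tail has probability $P(n \ge 20) = 2^{-19}$ and contributes $10^6 / 2^{19} \approx 1.91$. In total $E[X] \approx 20.91$. **A reasonable price to play: ~$21.** The paradox disappears!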
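The capped game is easy to evaluate exactly. The sketch below assumes the payout is $\min(2^n, \text{cap})$ when tails first appears on flip $n$; the function name `capped_expectation` is illustrative. Uncapped outcomes each contribute 1, and all remaining outcomes are lumped into one tail term that pays the cap.

```python
from fractions import Fraction

def capped_expectation(cap):
    """Exact E[min(2^n, cap)] where first tails on flip n has probability (1/2)^n."""
    total = Fraction(0)
    n = 1
    while 2**n < cap:
        total += 1            # payout 2^n times probability (1/2)^n
        n += 1
    # Every outcome from flip n onward pays the cap; P(first tails on flip >= n) = (1/2)^(n-1).
    total += Fraction(cap, 2**(n - 1))
    return total

value = capped_expectation(1_000_000)
print(value, float(value))
```

Raising the cap by a factor of two adds only about one dollar to the expected value, which is why even a very rich casino makes the game worth only a few tens of dollars.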

🏋️ Practice

RV X: P(X = -2) = 0.3, P(X = 1) = 0.5, P(X = 4) = 0.2. Find E[X] and E[3X - 1].

$E[X] = (-2) \cdot 0.3 + 1 \cdot 0.5 + 4 \cdot 0.2$ $= -0.6 + 0.5 + 0.8 = 0.7$ $E[3X - 1] = 3 \cdot 0.7 - 1 = 2.1 - 1 = 1.1$

Density of an RV: f(x) = 2x for 0 ≤ x ≤ 1, otherwise 0. Find E[X].

$E[X] = \int_0^1 x \cdot 2x\,dx = \int_0^1 2x^2\,dx$ $= \frac{2x^3}{3}\Big|_0^1 = \frac{2}{3}$ The mean is 2/3, closer to 1 than to 0 - because the density increases with x.
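When an integral is harder to do by hand, the same expectation can be approximated numerically. The sketch below uses a plain midpoint rule; the function name `expectation_midpoint` and the step count are illustrative assumptions, and the rule is a swapped-in approximation, not the exact integration used above.

```python
def expectation_midpoint(density, lo, hi, steps=10_000):
    """Approximate the integral of x * density(x) over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width        # midpoint of the k-th subinterval
        total += x * density(x) * width
    return total

def density(x):
    """f(x) = 2x on [0, 1], as in the exercise."""
    return 2 * x

mean = expectation_midpoint(density, 0.0, 1.0)
print(mean)
```

The result should agree with the exact value 2/3 to many decimal places.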

A box contains 5 white and 3 black balls. Two balls are drawn (without replacement). X = number of white balls drawn. Find E[X].

**Method 1 (combinatorics):**

$P(X=0) = C_3^2/C_8^2 = 3/28$

$P(X=1) = C_5^1 C_3^1/C_8^2 = 15/28$

$P(X=2) = C_5^2/C_8^2 = 10/28$

$E[X] = 0 \cdot 3/28 + 1 \cdot 15/28 + 2 \cdot 10/28 = 35/28 = 5/4$

**Method 2 (linearity):**

$X = X_1 + X_2$, where $X_i$ = indicator "i-th ball is white"

$E[X_1] = P(\text{1st is white}) = 5/8$

$E[X_2] = P(\text{2nd is white}) = 5/8$ (by symmetry!)

$E[X] = 5/8 + 5/8 = 10/8 = 5/4 = 1.25$
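Both methods can be cross-checked by listing every possible pair of balls. This is a brute-force sketch with illustrative names: the balls are labeled by position, all unordered pairs are equally likely, and the number of white balls is averaged over them.

```python
from fractions import Fraction
from itertools import combinations

balls = ["W"] * 5 + ["B"] * 3                         # 5 white, 3 black
draws = list(combinations(range(len(balls)), 2))      # all unordered pairs of positions

def whites_in(draw):
    """Number of white balls among the drawn positions."""
    return sum(1 for i in draw if balls[i] == "W")

# Every pair is equally likely, so E[X] is the plain average over all pairs.
expected_whites = Fraction(sum(whites_in(d) for d in draws), len(draws))

print(len(draws), expected_whites)
```

Enumeration works here because there are only 28 pairs; the indicator method scales to cases where listing every outcome is not practical.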

Common mistake: E[X·Y] = E[X]·E[Y] always

Correct: E[X·Y] = E[X]·E[Y] is guaranteed for independent X and Y, not in general

For dependent random variables the product rule can **fail**! Example: $X$ - number on a die, $Y = X$ (the same).

$E[X] = E[Y] = 3.5$

$E[X \cdot Y] = E[X^2] = (1+4+9+16+25+36)/6 = 91/6 \approx 15.17$

But $E[X] \cdot E[Y] = 3.5 \cdot 3.5 = 12.25 \neq 15.17$

Dependence changes everything!
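The contrast between the dependent and independent cases can be computed side by side. In this sketch (variable names are illustrative), the dependent case takes $Y = X$ on a single die, and the independent case takes $X$ and $Y$ as two separate fair dice.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)

e_x = sum(x * p for x in faces)                       # E[X] for one fair die

# Dependent case: Y is the same roll as X, so X * Y = X^2.
e_xy_dependent = sum(x * x * p for x in faces)

# Independent case: two separate dice, each pair (x, y) has probability 1/36.
e_xy_independent = sum(x * y * p * p for x, y in product(faces, repeat=2))

print(e_x * e_x, e_xy_dependent, e_xy_independent)
```

Only the independent case reproduces $E[X] \cdot E[Y]$; the dependent case gives a strictly larger value here.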

X and Y are independent. E[X] = 2, E[Y] = 3, E[X²] = 5, E[Y²] = 10. What is E[XY]?

For **independent** random variables: $E[XY] = E[X] \cdot E[Y] = 2 \cdot 3 = 6$ The information about $E[X^2]$ and $E[Y^2]$ isn't needed here - it will come in handy for variance!

E[X] - the center of gravity of the distribution

Expected value is the starting point for all of statistics.

  • Variance: a measure of "spread" around E[X] (next lesson!)
  • Law of Large Numbers: sample mean → E[X] as n → ∞
  • Loss function in ML: minimizing E[loss] is the essence of training
  • Game theory: expected payoff determines optimal strategy

Key ideas

  • **Discrete RV:** $E[X] = \sum x_i \cdot p_i$
  • **Continuous RV:** $E[X] = \int x \cdot f(x)\,dx$
  • **Linearity:** $E[aX + b] = aE[X] + b$, $E[X+Y] = E[X] + E[Y]$ (always!)
  • **E[XY] = E[X]·E[Y]** is guaranteed for independent random variables, not in general
  • **Interpretation:** "long-run average" when the experiment is repeated
  • **Limitations:** St. Petersburg Paradox - E[X] isn't everything!

Questions for reflection

  • 🃏 Back to the paradox: what is the fair price for the St. Petersburg game? Why exactly that amount?
  • 🎰 If E[X] < 0 for all casino games - why do people still play?
  • 📊 When can decisions based on E[X] be catastrophically wrong?
  • 🤖 Why do ML models minimize E[loss] rather than max(loss)?

Related lessons

  • calc-01-sequences
  • ml-09-gradient-descent
  • stat-02-estimation