Probability Theory
Expected Value
Lesson objectives
- Understand the meaning of expected value as "long-run average"
- Learn to compute E[X] for discrete and continuous random variables
- Master the key property: linearity E[X + Y] = E[X] + E[Y]
- Apply expected value to decision-making
- Understand the limitations - when E[X] "lies"
Prerequisites
- Random variables - discrete and continuous
- Basic integrals (for continuous random variables)
A casino offers a game: roll a die and receive $10 for each pip. What is the fair price of participation? Intuition says - find the "average". Expected value is exactly that average, computed with mathematical rigor.
- 🎰 **Casinos:** Why the house always wins (negative E[X] for the player)
- 📈 **Investing:** Expected return of a portfolio
- 🏥 **Insurance:** Calculating premiums and payouts
- 🎮 **Gamedev:** Loot balance, drop rates, average damage
- 🤖 **ML:** Loss function = E[error]
- 📊 **A/B tests:** Average effect of a change
The fair price of a gamble
It all started with a correspondence between two geniuses, Blaise Pascal and Pierre de Fermat, in 1654. Their question: **what is the "fair" price to enter a gamble?** The answer is the expected value of the winnings. If E[winnings] exceeds the entry price, the game is worth playing on average. If it is lower, it is not. But in 1738, Daniel Bernoulli used the famous St. Petersburg Paradox to show that E[X] alone is not enough. This led to the creation of **utility theory**, a foundation of modern economics.
**1738, St. Petersburg.** Swiss mathematician Daniel Bernoulli publishes his analysis of a problem, first posed by his cousin Nicolaus Bernoulli in 1713, that has kept economists and philosophers awake for some 300 years.
**The game:** Flip a coin until tails appears. If tails comes up on the $n$-th flip - the prize is $2^n$ dollars. What is the **fair price** of participation?
First tails? $2. Second? $4. Third? $8. Tenth? $1,024. Twentieth? **Over a million.**
Let's compute the **expected winnings**... and get **infinity**. Mathematics says: the fair price is any amount, even a billion! But is it reasonable to pay even $100 for this game?
Today we'll explore **expected value** - a tool that usually works brilliantly... but sometimes breaks intuition.
The expectation $\mathbb{E}[X]$ of a discrete random variable is: $$\mathbb{E}[X] = \sum_i x_i \, p_i, \quad p_i = P(X = x_i)$$
$\mathbb{E}[X]$ is a weighted sum, with weights being probabilities. It is the 'long-run average' over many repetitions. For continuous variables: $\mathbb{E}[X]=\int x f(x)dx$.
📐 What is expected value?
**Expected value** (mean, expectation) is the **weighted average** of all values of a random variable, where the weights are the probabilities.
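**For a discrete random variable:** $$E[X] = \sum_i x_i \cdot p_i, \quad p_i = P(X = x_i)$$
**For a continuous random variable with density $f(x)$:** $$E[X] = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$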
Consider the probability distribution as mass spread along the number line. $E[X]$ is the point where this mass **balances**. The center of gravity. Running many, many experiments and averaging the results yields a number close to $E[X]$. This is the **law of large numbers**.
🎲 Average number of pips on a die
Classic example
$X$ - number of pips on a fair die. $$E[X] = 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + 3 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 5 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6}$$ $$= \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5$$ **Note:** 3.5 is **not** a possible die value! But rolling 1,000 times and averaging yields a number near 3.5.
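The weighted sum and the "long-run average" reading can both be checked in a few lines. This is a minimal sketch: the helper name `expectation`, the seed, and the number of rolls are illustrative choices, not part of the lesson.

```python
from fractions import Fraction
import random

def expectation(dist):
    """Weighted sum of values, with probabilities as weights."""
    return sum(x * p for x, p in dist.items())

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
exact = expectation(die)                 # exact rational value, 7/2

# Law of large numbers: the sample mean of many rolls lands near E[X].
rng = random.Random(0)                   # fixed seed, chosen arbitrarily
rolls = [rng.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(exact, float(exact), round(sample_mean, 3))
```

Using `Fraction` keeps the exact answer free of rounding, while the simulated mean only approximates it and changes with the seed.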
A lottery: with probability 0.001 the prize is $1,000, otherwise $0. What is the "fair" price of a ticket?
$E[\text{winnings}] = 1000 \cdot 0.001 + 0 \cdot 0.999 = 1$ **Fair price = $1.** If a ticket costs less - it's worth buying. If more - the lottery profits. Real lotteries sell tickets above E[X] - otherwise how would they make money?
✨ Properties of expected value
$$E[aX + b] = a \cdot E[X] + b$$ $$E[X + Y] = E[X] + E[Y]$$ **The second one holds ALWAYS** - even for dependent random variables! This is incredibly powerful: E[X + Y] can be computed without knowing the joint distribution.
🎲🎲 Sum of two dice
Linearity in action
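$X_1, X_2$ - pips on the first and second die. $E[X_1] = E[X_2] = 3.5$ $$E[X_1 + X_2] = E[X_1] + E[X_2] = 3.5 + 3.5 = 7$$ **Average sum = 7.** For two dice, 7 is also the single most likely sum (6 of the 36 combinations), which is why it is so prominent in board games and craps. The mean and the most likely value coincide here because the distribution of the sum is symmetric with its peak in the middle. This is not a general rule. Note: we didn't enumerate all 36 combinations - linearity did it for us.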
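For comparison, here is the brute-force route that linearity lets us skip: a small sketch that enumerates all 36 outcomes, averages the sums, and also counts how often each sum occurs. Variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
outcomes = list(product(faces, repeat=2))        # all 36 equally likely pairs

# Brute force: average the sum over every pair.
mean_sum = Fraction(sum(a + b for a, b in outcomes), len(outcomes))

# Linearity: add the two single-die expectations instead.
mean_one_die = Fraction(sum(faces), 6)
mean_by_linearity = mean_one_die + mean_one_die

# How many of the 36 pairs produce each sum.
counts = Counter(a + b for a, b in outcomes)
most_likely_sum, ways = counts.most_common(1)[0]

print(mean_sum, mean_by_linearity, most_likely_sum, ways)
```

Both routes give the same mean, and the count shows separately that 7 is produced by more pairs than any other sum.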
🚌 Uniform distribution
$X \sim U(a, b)$
A bus arrives uniformly at random on the interval $[a, b]$. $$E[X] = \int_a^b x \cdot \frac{1}{b-a}\,dx = \frac{1}{b-a} \cdot \frac{x^2}{2}\Big|_a^b$$ $$= \frac{b^2 - a^2}{2(b-a)} = \frac{(b-a)(b+a)}{2(b-a)} = \frac{a+b}{2}$$ **The midpoint of the interval!** Intuitively clear: all points are equally likely, so the average is in the middle.
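The integral can also be checked numerically. The sketch below approximates $\int_a^b x \cdot \frac{1}{b-a}\,dx$ with the midpoint rule. The function name `uniform_mean` and the step count are assumptions made for illustration.

```python
def uniform_mean(a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of x * 1/(b - a) over [a, b]."""
    width = (b - a) / steps
    density = 1.0 / (b - a)              # constant density of U(a, b)
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * width        # midpoint of the k-th slice
        total += x * density * width
    return total

print(uniform_mean(0.0, 10.0))           # close to (0 + 10) / 2
print(uniform_mean(3.0, 7.5))            # close to (3 + 7.5) / 2
```

Because the integrand is linear in $x$, the midpoint rule is exact here up to floating-point rounding, so the result matches $(a+b)/2$ very closely.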
X and Y are two random variables (possibly dependent). E[X] = 5, E[Y] = 3. What is E[2X - Y + 10]?
By linearity: $$E[2X - Y + 10] = 2 \cdot E[X] - E[Y] + 10 = 2 \cdot 5 - 3 + 10 = 17$$ **Linearity holds ALWAYS** - regardless of whether X and Y are dependent. The analogous product rule $E[X \cdot Y] = E[X] \cdot E[Y]$ does NOT hold in general - it is guaranteed when X and Y are independent!
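To see linearity survive dependence, here is a small sketch with a deliberately dependent pair. The choice $Y = 7 - X$ for a fair die $X$ is an illustrative assumption, not the pair from the exercise, so the numbers differ from 17.

```python
from fractions import Fraction

p = Fraction(1, 6)                       # probability of each die face
faces = range(1, 7)

def y_of(x):
    """Y is fully determined by X, so the two are strongly dependent."""
    return 7 - x

e_x = sum(x * p for x in faces)          # 7/2
e_y = sum(y_of(x) * p for x in faces)    # 7/2

# Direct computation of E[2X - Y + 10] over the six outcomes.
direct = sum((2 * x - y_of(x) + 10) * p for x in faces)

# The same quantity through linearity, using only E[X] and E[Y].
by_linearity = 2 * e_x - e_y + 10

print(direct, by_linearity)
```

The two values agree even though knowing X pins down Y exactly.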
🎰 Application: why the house always wins
🎡 American roulette
Betting on red
Bet $1 on red. • Red (18/38): win $1 • Not red (20/38): lose $1 $$E[\text{profit}] = 1 \cdot \frac{18}{38} + (-1) \cdot \frac{20}{38} = \frac{18-20}{38} = -\frac{2}{38}$$ $$\approx -5.26\text{ cents per dollar bet}$$ **The casino keeps ~5.3%** of every bet on average. Over a short session the player may come out ahead, but with long enough play a net loss becomes **almost certain**. Over a million $1 bets the casino expects to earn ~$52,600.
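The house edge is a one-line weighted sum. A minimal sketch, where the helper name `expected_profit` and the (payoff, probability) list layout are illustrative choices:

```python
from fractions import Fraction

def expected_profit(outcomes):
    """Sum of payoff * probability over (payoff, probability) pairs."""
    return sum(payoff * prob for payoff, prob in outcomes)

# $1 on red in American roulette: 18 red pockets out of 38.
red_bet = [(1, Fraction(18, 38)), (-1, Fraction(20, 38))]

edge = expected_profit(red_bet)          # exact value, -1/19 dollars per $1 bet
house_take = -edge * 1_000_000           # expected casino profit over a million bets

print(edge, round(float(edge) * 100, 2), round(float(house_take)))
```

The exact edge is $-1/19$ per dollar, which is where the rounded figures of about 5.26 cents per bet and roughly $52,600 per million bets come from.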
$E[X] = 3.5$ for a die doesn't mean 3.5 will come up. $E[X]$ is the **average over many** repetitions. For a single game the result can be anything! **Law of large numbers:** the more repetitions, the closer the average gets to $E[X]$.
🛡️ Insurance: when "unfavorable" = "rational"
E[X] is not the only criterion
Phone insurance: $100/year. Probability of damage: 5%. Repair: $500. $E[\text{cost without insurance}] = 500 \cdot 0.05 = 25$ $E[\text{cost with insurance}] = 100$ (the premium) **In expectation** insurance is the worse deal ($100 > $25). **But!** Insurance eliminates **risk**. For many people, a certain -$100 is better than a 5% chance of -$500. More on this in the lesson on variance and utility theory.
A game: flip a coin. Heads pays $10, tails costs $6. What is E[profit]?
$E[\text{profit}] = 10 \cdot 0.5 + (-6) \cdot 0.5 = 5 - 3 = 2$ **Expected profit = $2 > 0** per play. Playing is **profitable** (in the long run). Over 1,000 plays the expected total profit is about $2,000.
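A quick simulation shows what "about" means for the total over 1,000 plays. This is a sketch under assumed settings: the seed, the number of sessions, and the helper name `play_session` are arbitrary illustrative choices.

```python
import random

WIN, LOSS = 10, -6                       # heads pays $10, tails costs $6
expected_per_play = 0.5 * WIN + 0.5 * LOSS

def play_session(n_plays, rng):
    """Total profit over n_plays fair coin flips."""
    return sum(WIN if rng.random() < 0.5 else LOSS for _ in range(n_plays))

rng = random.Random(42)                  # arbitrary fixed seed
sessions = [play_session(1_000, rng) for _ in range(5)]

print(expected_per_play * 1_000)         # expected total over 1,000 plays
print(sessions)                          # individual sessions scatter around it
```

Individual sessions typically land a few hundred dollars above or below the expected total, which is the spread that the variance lesson quantifies.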
🃏 Resolving the St. Petersburg Paradox
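Back to the game from the beginning. Tails first appears on flip $n$ with probability $(1/2)^n$, and the prize is then $2^n$: $$E[X] = \sum_{n=1}^{\infty} 2^n \cdot \frac{1}{2^n} = 1 + 1 + 1 + \dots = \infty$$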
**Expected winnings = ∞!** Yet nobody would pay even $100 for this game. Why?
**1. No casino has an infinite bankroll.** To pay $2^{30}$ (over a billion), the casino needs billions. In practice it doesn't have them.
**2. Diminishing utility.** $1 million to a poor person is life-changing. Another $1 million to a billionaire is pocket change. The utility of money grows more slowly than the amount.
**3. Finite lifespan.** To "profit" from this game, millions of plays would be required. Human lives are finite.
**Conclusion:** $E[X]$ is a powerful tool, but not the only criterion. **Variance**, **utility**, and **real-world constraints** also matter.
If the game is capped (the casino pays at most $1,000,000), what is E[X]?
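With a $1M cap the payout stops growing at flip 20, because $2^{19} = 524{,}288 < 10^6 < 2^{20}$. Flips 1 through 19 each contribute exactly 1: $\sum_{n=1}^{19} 2^n \cdot 2^{-n} = 19$. The capped tail (first tails on flip 20 or later, probability $2^{-19}$) adds $10^6 \cdot 2^{-19} \approx 1.91$. In total $E[X] \approx 20.91$. **A reasonable price to play: ~$21.** The paradox disappears!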
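The capped expectation can be computed exactly. The sketch below assumes the payout is $\min(2^n, \text{cap})$ when the first tails appears on flip $n$. The function name `capped_expectation` is illustrative.

```python
from fractions import Fraction

def capped_expectation(cap):
    """E[min(2**n, cap)] where P(first tails on flip n) = 2**-n."""
    total = Fraction(0)
    n = 1
    while 2**n <= cap:
        total += Fraction(2**n, 2**n)    # each uncapped term is exactly 1
        n += 1
    # From flip n onward the payout is cap; P(first tails on flip >= n) = 2**-(n-1).
    total += Fraction(cap, 2 ** (n - 1))
    return total

for cap in (100, 10**6, 10**9):
    print(cap, round(float(capped_expectation(cap)), 2))
```

Raising the cap a thousandfold, from a million to a billion, adds only about ten dollars to the fair price, which shows how slowly the sum grows.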
🏋️ Practice
RV X: P(X = -2) = 0.3, P(X = 1) = 0.5, P(X = 4) = 0.2. Find E[X] and E[3X - 1].
$E[X] = (-2) \cdot 0.3 + 1 \cdot 0.5 + 4 \cdot 0.2$ $= -0.6 + 0.5 + 0.8 = 0.7$ $E[3X - 1] = 3 \cdot 0.7 - 1 = 2.1 - 1 = 1.1$
Density of an RV: f(x) = 2x for 0 ≤ x ≤ 1, otherwise 0. Find E[X].
$E[X] = \int_0^1 x \cdot 2x\,dx = \int_0^1 2x^2\,dx$ $= \frac{2x^3}{3}\Big|_0^1 = \frac{2}{3}$ The mean is 2/3, closer to 1 than to 0 - because the density increases with x.
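Two independent numerical checks of the answer, as a sketch: a midpoint-rule integral of $x \cdot f(x)$, and a simulation that draws from the density by inverse-transform sampling (since $F(x) = x^2$ on $[0, 1]$, $\sqrt{U}$ with $U$ uniform has density $2x$). Function names, step count, sample size, and seed are illustrative assumptions.

```python
import math
import random

def density(x):
    """f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def mean_by_midpoint(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of x * f(x) over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        total += x * f(x) * width
    return total

def mean_by_sampling(n, rng):
    """Average of n draws of sqrt(U), which follow the density 2x on [0, 1]."""
    return sum(math.sqrt(rng.random()) for _ in range(n)) / n

numeric = mean_by_midpoint(density, 0.0, 1.0)
sampled = mean_by_sampling(200_000, random.Random(1))   # arbitrary seed

print(round(numeric, 6), round(sampled, 3))
```

Both estimates should sit close to $2/3$. The integral is accurate to many digits, while the sampled mean carries simulation noise.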
A box contains 5 white and 3 black balls. Two balls are drawn (without replacement). X = number of white balls drawn. Find E[X].
**Method 1 (combinatorics):** $P(X=0) = C_3^2/C_8^2 = 3/28$ $P(X=1) = C_5^1 C_3^1/C_8^2 = 15/28$ $P(X=2) = C_5^2/C_8^2 = 10/28$ $E[X] = 0 \cdot 3/28 + 1 \cdot 15/28 + 2 \cdot 10/28 = 35/28 = 5/4$ **Method 2 (linearity):** $X = X_1 + X_2$, where $X_i$ = indicator "i-th ball is white" $E[X_1] = P(\text{1st is white}) = 5/8$ $E[X_2] = P(\text{2nd is white}) = 5/8$ (by symmetry!) $E[X] = 5/8 + 5/8 = 10/8 = 5/4 = 1.25$
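Both methods can be cross-checked by listing every possible draw. A small sketch using `itertools.combinations`. Labelling the balls by index is an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

balls = ["W"] * 5 + ["B"] * 3            # 5 white, 3 black

# All unordered pairs of distinct balls: C(8, 2) = 28 equally likely draws.
draws = list(combinations(range(len(balls)), 2))

# Method 1: average the number of white balls over every draw.
white_total = sum(sum(balls[i] == "W" for i in pair) for pair in draws)
e_white = Fraction(white_total, len(draws))

# Method 2: linearity with two indicators, each with expectation 5/8.
e_by_linearity = Fraction(5, 8) + Fraction(5, 8)

print(len(draws), e_white, e_by_linearity)
```

The enumeration and the two-indicator shortcut agree, even though the two indicators are dependent (drawing without replacement).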
**Wrong:** E[X·Y] = E[X]·E[Y] always
**Right:** E[X·Y] = E[X]·E[Y] is guaranteed for independent X and Y, not in general
For dependent random variables the product rule can **fail**! Example: $X$ - number on a die, $Y = X$ (the same roll). $E[X] = E[Y] = 3.5$ $E[X \cdot Y] = E[X^2] = (1+4+9+16+25+36)/6 = 91/6 \approx 15.17$ But $E[X] \cdot E[Y] = 3.5 \cdot 3.5 = 12.25 \neq 15.17$ Dependence changes everything!
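The contrast is easy to reproduce. This sketch computes $E[XY]$ exactly for the dependent case $Y = X$ and, for comparison, for a second independent die. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
e_x = Fraction(sum(faces), 6)                      # 7/2

# Dependent case: Y = X, so X * Y = X**2.
e_xy_dependent = Fraction(sum(x * x for x in faces), 6)

# Independent case: a second die, all 36 pairs equally likely.
pairs = list(product(faces, repeat=2))
e_xy_independent = Fraction(sum(x * y for x, y in pairs), len(pairs))

print(e_x * e_x, e_xy_dependent, e_xy_independent)
```

Only the independent pair reproduces $E[X] \cdot E[Y] = 49/4$. The dependent pair gives $91/6$ instead.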
X and Y are independent. E[X] = 2, E[Y] = 3, E[X²] = 5, E[Y²] = 10. What is E[XY]?
For **independent** random variables: $E[XY] = E[X] \cdot E[Y] = 2 \cdot 3 = 6$ The information about $E[X^2]$ and $E[Y^2]$ isn't needed here - it will come in handy for variance!
E[X] - the center of gravity of the distribution
Expected value is the starting point for all of statistics.
- Variance: a measure of "spread" around E[X] - next lesson!
- Law of Large Numbers: sample mean → E[X] as n → ∞
- Loss function in ML: minimizing E[loss] is the essence of training
- Game theory: expected payoff determines optimal strategy
Key ideas
- **Discrete RV:** $E[X] = \sum x_i \cdot p_i$
- **Continuous RV:** $E[X] = \int x \cdot f(x)\,dx$
- **Linearity:** $E[aX + b] = aE[X] + b$, $E[X+Y] = E[X] + E[Y]$ (always!)
- **E[XY] = E[X]·E[Y]** is guaranteed for independent random variables, not in general
- **Interpretation:** "long-run average" when the experiment is repeated
- **Limitations:** St. Petersburg Paradox - E[X] isn't everything!
Questions for reflection
- 🃏 Back to the paradox: what is the fair price for the St. Petersburg game? Why exactly that amount?
- 🎰 If E[X] < 0 for all casino games - why do people still play?
- 📊 When can decisions based on E[X] be catastrophically wrong?
- 🤖 Why do ML models minimize E[loss] rather than max(loss)?