Calculus
Introduction to Series
Цели урока
- Understand what a numerical series and partial sums are
- Master the concepts of convergence and divergence of a series
- Derive the formula for the sum of a geometric series
- Prove the divergence of the harmonic series
- Apply the Leibniz test for alternating series
Предварительные знания
- Sequences and their limits
- Arithmetic and geometric progressions
- The concept of a limit: $\lim_{n \to \infty} a_n$
- Properties of limits: sum, product
Zeno of Elea had a proof. Achilles and the tortoise: before reaching the tortoise, Achilles must reach where it was. Then where it moved. Then again. Infinitely many steps - so he never arrives. Mathematicians laughed at the paradox for 2000 years. Then someone actually computed it: $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots = 1$. Infinitely many terms. Finite sum. Zeno was wrong - and this discovery is exactly what powers `torch.sin(x)`, JPEG compression, and the momentum accumulation inside Adam during GPT training.
- **`torch.sin(x)` in PyTorch**: internally computed via Taylor series $x - x^3/6 + x^5/120 - \ldots$ - the first 10 terms give precision $10^{-15}$. An infinite series, truncated to a finite number of terms
- **Adam optimizer**: the momentum $m_t = \beta_1 m_{t-1} + (1-\beta_1)g_t$ expands to a geometric series of gradients. With $\beta_1 = 0.9$, terms decay as $0.9^k$, the series converges - this is why training stabilizes
- **JPEG compression**: each 8x8 image block is decomposed into a Fourier series - a sum of cosines at different frequencies. Terms with small coefficients are discarded. 64 terms describe 64 pixels
- **Euler's Basel problem**: $\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6} \approx 1.6449$ - a 1735 formula appearing in statistics and quantum physics
- **Continuous compounding**: $e^r = \lim_{n \to \infty}(1 + r/n)^n = 1 + r + r^2/2! + \ldots$ - every bank and every discounting model computes through this series
Zeno, Archimedes, and infinity
Zeno of Elea (~450 BC) formulated paradoxes about infinite division: how can Achilles catch the tortoise if he must traverse infinitely many intervals? Archimedes (~250 BC) essentially computed sums of series to find areas. But a rigorous theory was created only in the 19th century. And in 1735 Euler solved the famous Basel problem: $\sum \frac{1}{n^2} = \frac{\pi^2}{6}$, astonishing the mathematical world.
An infinite sum with a finite answer
An infinite sum with a finite answer
Zeno is technically right: adding numbers one by one is a process that never ends. But this is not addition - it is a **limit**. That is the difference between a series and a plain sum.
Half the distance, then a quarter, then an eighth... After each step, exactly half the remaining gap is left. The terms shrink. Fast. Asymptotically:
The sum cannot exceed 1 - each next step adds less than what is missing. And it equals exactly 1. Not "approximately", not "tends to" - **equals**. This is a theorem, verifiable through the limit of partial sums.
**Zeno's paradox and Adam optimizer**. The same structure lives inside Adam - the most widely used optimizer in ML. The momentum $m_t = 0.9 \cdot m_{t-1} + 0.1 \cdot g_t$ expands to: $m_t = 0.1 g_t + 0.1 \cdot 0.9 \cdot g_{t-1} + 0.1 \cdot 0.9^2 \cdot g_{t-2} + \ldots$ A geometric series with ratio $q = 0.9 < 1$. It converges - so the optimizer is stable.
Definition of a numerical series
A **numerical series** is an expression of the form:
Here $a_n$ are the **terms of the series**. Literally adding infinitely many numbers is impossible. So the sum of a series is defined through a limit. A series is not addition - it is a process with a limiting state.
Partial sums - the key to understanding
The **partial sum** $S_n$ is the sum of the first $n$ terms:
- $S_1 = a_1$ - only the first term
- $S_2 = a_1 + a_2$ - first two
- $S_3 = a_1 + a_2 + a_3$ - first three
- $S_n = a_1 + a_2 + \ldots + a_n$ - first $n$ terms
The partial sums form a **sequence** $S_1, S_2, S_3, \ldots$ - and here the theory from the previous lesson kicks in. If this sequence converges - the series converges.
If this limit exists and is finite - the series **converges**. If the limit does not exist or is infinite - the series **diverges**.
Walking to the wall
Series 1/2 + 1/4 + 1/8 + ...
Back to our example: $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots$ Partial sums: - $S_1 = 0.5$ - $S_2 = 0.5 + 0.25 = 0.75$ - $S_3 = 0.875$, $S_4 = 0.9375$, $S_5 = 0.96875$ - $S_{10} \approx 0.999$ Pattern: $S_n = 1 - \frac{1}{2^n}$ As $n \to \infty$: $\frac{1}{2^n} \to 0$, so $S_n \to 1$ **The sum of the infinite series equals 1.** ML parallel: a training loss curve has the same shape - each epoch halves the distance to the minimum (under ideal conditions). Loss never reaches zero, but the limit exists.
| n | $S_n$ | Distance to limit |
|---|---|---|
| 1 | 0.5 | 0.5 |
| 2 | 0.75 |
The geometric series - the main character
The geometric series - the main character
The series above is a special case of the **geometric series**: each term is obtained by multiplying the previous one by the common ratio $q$. This series appears literally everywhere in ML - exponential moving averages in Adam, cosine LR decay, discounting in reinforcement learning.
There is an **exact closed-form formula** for it. When $|q| < 1$:
When does the geometric series diverge?
The formula works **only when $|q| < 1$**. What happens for $|q| \geq 1$ - these are the same three divergence modes as for sequences:
- $q = 1$: $1 + 1 + 1 + 1 + \ldots = \infty$ - runaway. In ML: learning rate too large, loss explodes
- $q = -1$: $1 - 1 + 1 - 1 + \ldots$ - oscillates between 0 and 1, no limit. In ML: optimizer jumps over the minimum
- $q = 2$: $1 + 2 + 4 + 8 + \ldots = \infty$ - exponential explosion
- $q = -2$: $1 - 2 + 4 - 8 + \ldots$ - oscillations with growing amplitude. Unstable training.
The condition $|q| < 1$ is exactly the condition mathematicians impose on the learning rate in SGD convergence theory. The ratio $q$ in the geometric series and the contraction coefficient in the gradient descent iteration are the same object.
What is the sum of the series $1 + \frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \ldots$?
This is a geometric series with $q = \frac{1}{3}$. By the formula: $\frac{1}{1 - 1/3} = \frac{1}{2/3} = \frac{3}{2}$.
The harmonic series - a treacherous surprise
The harmonic series - a treacherous surprise
The terms decrease, decrease, decrease - one would think the sum must be finite. Here is the counterexample that breaks that intuition:
The terms approach zero: $\frac{1}{n} \to 0$. It seems the series should converge. **No.** The harmonic series diverges - its sum is infinite. This is the most important counterexample in mathematical analysis. Commit it to memory.
Proof of divergence
Oresme's grouping method (14th century!)
Group the terms: $1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \ldots$ Estimate each group from below: - $\frac{1}{3} + \frac{1}{4} > \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$ - $\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > 4 \cdot \frac{1}{8} = \frac{1}{2}$ Every group is greater than $\frac{1}{2}$! Sum: $> 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \ldots = \infty$ Nicole Oresme proved this in the 14th century - 300 years before calculus was invented.
If the terms of a series approach zero, then the series converges
The condition $a_n \to 0$ is necessary but NOT sufficient for convergence
The harmonic series is the canonical counterexample: $\frac{1}{n} \to 0$, but the sum is infinite. Decreasing terms only mean convergence is not ruled out - nothing more. In ML terms: if the step size decays, convergence is not guaranteed. SGD needs additional conditions (Robbins-Monro: $\sum \eta_t = \infty$, $\sum \eta_t^2 < \infty$).
The harmonic series grows like $\ln n$. To exceed a sum of 10 takes about 12,000 terms. To exceed 20 takes about 270 million. But it will eventually exceed any number.
Generalized harmonic series (p-series)
| Condition | Result | Example |
|---|---|---|
| $p > 1$ | **Converges** | $\sum \frac{1}{n^2} = \frac{\pi^2}{6}$ |
| $p = 1$ | **Diverges** | Harmonic series |
| $p < 1$ | **Diverges** | $\sum \frac{1}{\sqrt{n}}$ - even worse |
The boundary is exactly at $p = 1$. At $p = 2$ the sum equals $\frac{\pi^2}{6}$ - the famous **Basel problem**, solved by Euler in 1735. This result appears in the variance of the Cauchy distribution, in atomic energy levels, and in extreme value statistics.
Which of the following series converges?
This is a p-series with p = 2 > 1, so it converges. All others diverge: harmonic (p = 1), with square root (p = 1/2 < 1), and the series with logarithm (terms decrease more slowly than harmonic).
Alternating series
Alternating series
The harmonic series diverges. But with alternating signs - it converges. Not obvious at all. Alternating signs "compensate" for the slow decay of terms - partial sums oscillate around the limit, each step closer:
| n | Operation | $S_n$ | Relative to ln2 |
|---|---|---|---|
| 1 | +1 | 1.000 | above ↑ |
| 2 | $-\frac{1}{2}$ | 0.500 | below ↓ |
| 3 | $+\frac{1}{3}$ | 0.833 | above ↑ |
| 4 | $-\frac{1}{4}$ | 0.583 | below ↓ |
| $\infty$ | 0.693 | $= \ln 2$ |
**Leibniz alternating series test**. If the terms of a series: 1. Alternate in sign: $a_1 - a_2 + a_3 - a_4 + \ldots$ 2. Decrease in absolute value: $|a_{n+1}| < |a_n|$ 3. Approach zero: $a_n \to 0$ Then the series **converges**.
**Danger of rearrangement!**. Rearranging the terms of a conditionally convergent series is not allowed! The series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots = \ln 2$, but rearranging the terms can produce **any number** or even infinity. This is the Riemann rearrangement theorem. In ML: shuffling data in batches does not change the loss sum - but rearranging series terms changes the sum.
Which series is alternating?
An alternating series has terms whose signs strictly alternate. The Leibniz criterion: it converges if $|a_n|$ monotonically decreases to 0.
Practice
Practice
Find the sum of the geometric series: $2 + 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots$
First term: $a = 2$ Common ratio: $q = \frac{1}{2}$ (each next term is half the previous) $|q| = \frac{1}{2} < 1$ ✓ - series converges Sum: $S = \frac{a}{1-q} = \frac{2}{1 - 1/2} = \frac{2}{1/2} = 4$ **Answer: 4**
Prove that the series $\sum_{n=1}^{\infty} \frac{n}{n+1}$ diverges using the necessary condition for convergence.
Necessary condition for convergence: $\lim_{n \to \infty} a_n = 0$ Find the limit of the series term: $\lim_{n \to \infty} \frac{n}{n+1} = \lim_{n \to \infty} \frac{1}{1 + 1/n} = \frac{1}{1+0} = 1 \neq 0$ The terms approach 1, not 0! The necessary condition is **not satisfied** → the series **diverges**.
Investigate the convergence of the series $\sum_{n=1}^{\infty} \frac{(-1)^n}{\sqrt{n}}$ using the Leibniz test.
Series: $-\frac{1}{1} + \frac{1}{\sqrt{2}} - \frac{1}{\sqrt{3}} + \frac{1}{\sqrt{4}} - \ldots$ Check the Leibniz test conditions: **1. Alternating sign**: $(-1)^n$ gives alternating signs ✓ **2. Decreasing in absolute value**: $|a_n| = \frac{1}{\sqrt{n}}$ $|a_{n+1}| = \frac{1}{\sqrt{n+1}} < \frac{1}{\sqrt{n}} = |a_n|$ ✓ **3. Limit to zero**: $\lim_{n \to \infty} \frac{1}{\sqrt{n}} = 0$ ✓ All three conditions are satisfied! By the Leibniz test the series **converges**. (Note: the series $\sum \frac{1}{\sqrt{n}}$ without alternation diverges - it is a p-series with p = 1/2 < 1)
The series $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ converges to what value?
Telescoping: $\frac{1}{n(n+1)} = \frac{1}{n} - \frac{1}{n+1}$, so partial sum $S_N = 1 - \frac{1}{N+1} \to 1$.
Connection with other topics
Series - a bridge between the discrete and the continuous
- Sequences — Partial sums form a sequence
- Limits of functions — The sum of a series is a limit
- Integrals — Integral test for convergence
- Taylor series — Expanding functions into series - how PyTorch computes sin(x)
Итоги
- **Series** - an infinite sum defined as the limit of partial sums $S_n \to S$
- **Geometric series**: $\sum q^n = \frac{1}{1-q}$ for $|q| < 1$ - the foundation of Adam, cosine decay, discounting in RL
- **Harmonic series** $\sum \frac{1}{n}$ diverges despite $a_n \to 0$: the most important counterexample in analysis
- **p-series**: converges for $p > 1$, diverges for $p \leq 1$. At $p = 2$: $\sum 1/n^2 = \pi^2/6$
- **Leibniz test**: alternating + decreasing + $a_n \to 0$ ⇒ converges
- **Watch out for rearrangements**: a conditionally convergent series can be reordered to any sum
Вопросы для размышления
- Why is the necessary condition ($a_n \to 0$) not sufficient?
- Adam uses $\beta_1 = 0.9$. What geometric series does this generate and what is its sum?
- How are series related to computing $\pi$, $e$, and other constants?
- Why can rearranging the terms of a conditionally convergent series change the sum?