Calculus
Continuity of a Function
Lesson objectives
- Verify continuity using three conditions: defined, limit exists, limit equals value
- Classify discontinuities: removable, jump, infinite, essential
- Apply the Intermediate Value Theorem to prove the existence of roots
- Understand the Weierstrass theorem: a closed interval guarantees max and min
- Distinguish between continuity and uniform continuity
Prerequisites
- The concept of a function limit
- Computing limits
ReLU has a corner at zero: its derivative jumps from 0 to 1. GELU (Hendrycks and Gimpel, 2016) was proposed as a smooth alternative: continuous, with a continuous derivative and no sharp corners. Continuity isn't an abstraction from a textbook. It's one of the conditions under which gradients flow without blowups.
- **Activation functions**: ReLU is continuous but not differentiable at zero. GELU, SiLU, Mish are smooth replacements, motivated in part by the observation that a continuous derivative tends to help training
- **Numerical methods**: the bisection method (scipy.optimize.bisect) is guaranteed to find a root only when the function is continuous and changes sign on the interval - the Intermediate Value Theorem in code
- **Loss landscapes**: 'well-behaved' loss functions are continuous and differentiable. Discontinuities = unstable training, exploding gradients
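- **Signals**: an analog signal is continuous, a digital one is not. ADC/DAC convert between them. By the Nyquist-Shannon theorem a band-limited signal can be reconstructed exactly if it is sampled at more than twice its highest frequency; the unavoidable error comes from quantization and from sampling below that rate (aliasing)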
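The activation-function claim above can be checked numerically. The sketch below compares left and right difference quotients at zero for ReLU and for the exact (erf-based) form of GELU. The helper name `one_sided_slopes` and the step size `h` are illustrative choices, not part of any library.

```python
import math

def relu(x):
    return max(0.0, x)

def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def one_sided_slopes(f, a=0.0, h=1e-6):
    """Left and right difference quotients of f at a (hypothetical helper)."""
    left = (f(a) - f(a - h)) / h
    right = (f(a + h) - f(a)) / h
    return left, right

relu_left, relu_right = one_sided_slopes(relu)
gelu_left, gelu_right = one_sided_slopes(gelu)

# ReLU: the two slopes disagree, so there is no derivative at 0.
print("ReLU slopes at 0:", relu_left, relu_right)
# GELU: both slopes are close to 0.5, so the derivative passes through 0 smoothly.
print("GELU slopes at 0:", gelu_left, gelu_right)
```

Both functions are continuous at zero. Only the derivative of ReLU has a jump there.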
Cauchy formalizes intuition
Before the 19th century mathematicians used an intuitive understanding of continuity - 'without breaks and jumps'. **Cauchy** gave one of the first precise definitions through limits in his 'Cours d'analyse' (1821); Bolzano had formulated a similar definition in 1817. Later **Weierstrass** constructed a function that is continuous everywhere but differentiable nowhere. This shocked mathematicians - it turned out that the intuition of 'drawing without lifting the pencil' does not guarantee a derivative.
Three conditions for continuity
A function $f$ is **continuous at a point $a$** - this is shorthand for three requirements that must hold simultaneously:
- **$f(a)$ exists** - the function is defined at the point
- **$\lim_{x \to a} f(x)$ exists and is finite**
- **The limit equals the value** - these two numbers coincide: $\lim_{x \to a} f(x) = f(a)$
Violate even one of the three - and there's a discontinuity. Think of it as three requirements for a well-behaved loss function: it must be defined, have a limit for any input, and not jump. Violating any one leads to unstable training.
Checking continuity
f(x) = x² at x = 2
1. $f(2) = 4$ - defined ✓
2. $\lim_{x \to 2} x^2 = 4$ - limit exists ✓
3. $\lim_{x \to 2} x^2 = 4 = f(2)$ - equal ✓
**Conclusion**: $x^2$ is continuous at $x = 2$. Polynomials are continuous everywhere - no exceptions.
Function $g(x) = x + 1$ for $x \neq 0$, and $g(0) = 5$. Is it continuous at $x = 0$?
No. $g(0) = 5$, but $\lim_{x \to 0}(x+1) = 1 \neq 5$. The first two conditions hold and the third fails, so $g$ is discontinuous at $x = 0$. All three conditions must hold simultaneously.
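The three-condition check can be mimicked numerically. The sketch below is only a heuristic under stated assumptions: it samples the function at a small distance `h` on each side of the point instead of computing a true limit, and `is_continuous_at`, `h` and `tol` are hypothetical names and values chosen for illustration.

```python
import math

def g(x):
    # The quiz function: x + 1 everywhere except g(0) = 5.
    return 5.0 if x == 0 else x + 1.0

def is_continuous_at(f, a, h=1e-6, tol=1e-4):
    """Heuristic test of the three conditions at the point a."""
    # Condition 1: f(a) is defined.
    try:
        value = f(a)
    except (ZeroDivisionError, ValueError):
        return False
    # Condition 2: the one-sided samples are finite and agree (stand-in for the limit).
    left, right = f(a - h), f(a + h)
    limit_exists = (
        math.isfinite(left) and math.isfinite(right) and abs(left - right) < tol
    )
    # Condition 3: the estimated limit equals the value.
    return limit_exists and abs(left - value) < tol

print(is_continuous_at(lambda x: x**2, 2.0))  # x^2 at x = 2
print(is_continuous_at(g, 0.0))               # g at x = 0
```

A single sample per side cannot prove that a limit exists. It only flags the obvious failures, which is enough for the two examples above.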
Types of discontinuities
Discontinuities are classified into two kinds based on the behavior of one-sided limits. This distinction is critical for understanding why some functions can be 'fixed' and others cannot.
First-kind discontinuities: both one-sided limits are finite
| Type | Condition | Example | Fixable? |
|---|---|---|---|
| **Removable** | Left and right limits are equal, but $f(a)$ differs or is undefined | $\frac{\sin x}{x}$ at $x=0$ | Yes - redefine one point |
| **Jump** | Left and right limits differ: $L^- \neq L^+$ | $\text{sign}(x)$ at $x=0$ | No |
Removable discontinuity
A hole that can be patched
$f(x) = \frac{x^2 - 1}{x - 1}$ for $x \neq 1$.
At $x = 1$: division by zero, the function is undefined. But the limit exists:
$$\lim_{x \to 1} \frac{(x-1)(x+1)}{x-1} = \lim_{x \to 1} (x+1) = 2$$
Define $f(1) = 2$ and the function becomes continuous. In PyTorch, a removable discontinuity shows up as NaN from a $0/0$ division where the limit is finite. Adding `eps` to the denominator removes the NaN, but the value at the point matches the limit only if you define it that way explicitly.
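The `eps` remark deserves a small experiment. The sketch below uses plain Python floats, where $0/0$ raises `ZeroDivisionError` instead of producing NaN as tensor libraries do. The function names and the value of `eps` are illustrative assumptions.

```python
def f_raw(x):
    # Undefined at x == 1: plain Python raises ZeroDivisionError there.
    return (x**2 - 1) / (x - 1)

def f_patched(x):
    # Removable discontinuity patched with the limit value 2.
    return 2.0 if x == 1 else (x**2 - 1) / (x - 1)

def f_eps(x, eps=1e-8):
    # The eps trick: the division no longer fails, but the value at x == 1
    # is 0 / eps = 0, not the limit 2.
    return (x**2 - 1) / (x - 1 + eps)

try:
    f_raw(1.0)
    raw_failed = False
except ZeroDivisionError:
    raw_failed = True

print(raw_failed, f_patched(1.0), f_eps(1.0))
```

So `eps` makes the computation finite, while only the explicit patch makes the function continuous at the point.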
Second-kind discontinuities: at least one one-sided limit is infinite or does not exist
| Type | Condition | Example |
|---|---|---|
| **Infinite** | At least one limit $= \pm\infty$ | $\frac{1}{x}$ at $x=0$ |
| **Essential** | Limit does not exist due to oscillations | $\sin(1/x)$ at $x=0$ |
Second-kind discontinuities cannot be removed: no choice of value at the point makes the function continuous there. $\sin(1/x)$ oscillates between -1 and 1 infinitely often as $x \to 0$, so it has no limit at all.
Misconception: if a function is undefined at a point, it is necessarily a second-kind discontinuity.
In fact: the type of discontinuity is determined by the limits, not the value at the point.
$\frac{x^2-1}{x-1}$ is undefined at $x=1$, but the limit is finite - so it's a removable first-kind discontinuity. Type = behavior of the limit, not the function value.
What type of discontinuity does $f(x) = \lfloor x \rfloor$ (floor function) have at $x = 2$?
$\lim_{x \to 2^-} \lfloor x \rfloor = 1$, but $\lim_{x \to 2^+} \lfloor x \rfloor = 2$. Both limits are finite but different - a first-kind jump discontinuity.
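The classification rules of this section can be sketched as a rough numerical classifier. This is a heuristic under stated assumptions: it takes one sample on each side of the point, so it cannot detect oscillating (essential) discontinuities such as $\sin(1/x)$, and the name `classify` and the thresholds `h`, `tol`, `big` are hypothetical choices.

```python
import math

def classify(f, a, h=1e-7, tol=1e-4, big=1e6):
    """Rough classification of f at a from one sample on each side."""
    left, right = f(a - h), f(a + h)
    # Very large samples are treated as a sign of an infinite limit.
    if abs(left) > big or abs(right) > big:
        return "infinite (second kind)"
    # Finite but different one-sided values: a jump.
    if abs(left - right) > tol:
        return "jump (first kind)"
    # One-sided values agree: either removable or continuous.
    try:
        value = f(a)
    except ZeroDivisionError:
        return "removable (first kind)"
    if abs(value - left) > tol:
        return "removable (first kind)"
    return "continuous"

print(classify(math.floor, 2.0))                        # floor at x = 2
print(classify(lambda x: (x * x - 1) / (x - 1), 1.0))   # the hole at x = 1
print(classify(lambda x: 1.0 / x, 0.0))                 # 1/x at x = 0
```

The order of the checks mirrors the text: the limits decide the type first, and the value at the point is consulted only at the end.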
Fundamental theorems
Intermediate Value Theorem (Bolzano-Cauchy)
If $f$ is continuous on $[a, b]$ and changes sign ($f(a) \cdot f(b) < 0$), then there exists $\xi \in (a, b)$ such that $f(\xi) = 0$. More generally, $f$ takes every value between $f(a)$ and $f(b)$. The bisection method in `scipy.optimize.bisect` relies on exactly this guarantee: it halves the interval, keeps the half on which the sign changes, and repeats.
Proving the existence of a root
x³ - x - 1 = 0
$f(x) = x^3 - x - 1$ - a polynomial, continuous everywhere.
$f(1) = 1 - 1 - 1 = -1 < 0$
$f(2) = 8 - 2 - 1 = 5 > 0$
The function changes sign on $[1, 2]$ - by the theorem there exists $\xi \in (1, 2)$ with $f(\xi) = 0$. Numerically: $\xi \approx 1.3247$. This sign change is exactly the precondition that `bisect(f, 1, 2)` in scipy checks before it starts.
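The bisection idea fits in a few lines. The sketch below is a minimal hand-written version for illustration, not the scipy implementation; the name `bisect_root` and the parameters `tol` and `max_iter` are assumptions.

```python
def bisect_root(f, a, b, tol=1e-10, max_iter=200):
    """Find a root of a continuous f on [a, b], given a sign change."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        mid = (a + b) / 2
        fm = f(mid)
        # Stop when the midpoint is a root or the interval is small enough.
        if fm == 0 or (b - a) / 2 < tol:
            return mid
        # Keep the half on which the sign change survives.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return (a + b) / 2

root = bisect_root(lambda x: x**3 - x - 1, 1.0, 2.0)
print(root)
```

Each step halves the interval, so the guaranteed error after $n$ steps is at most $(b - a) / 2^n$. Continuity is what ensures that the surviving half still contains a root.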
Weierstrass theorem
A function continuous on a **closed interval** $[a, b]$ is bounded and attains its bounds: there exist $x_1, x_2 \in [a,b]$ such that $f(x_1) = \min_{[a,b]} f$ and $f(x_2) = \max_{[a,b]} f$.
**Closedness is critical.** On the open interval $(0, 1)$ the function $f(x) = 1/x$ is continuous but unbounded - it goes to $+\infty$ as $x \to 0^+$. Remove the closedness - the theorem collapses. In ML: a continuous loss attains its minimum on any closed, bounded region of parameter space, but over an unbounded parameter space a minimum is not guaranteed.
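A grid search gives a rough numerical picture of why closedness matters. The sketch below samples $1/x$ on the closed interval $[0.1, 1]$ and then on $[d, 1]$ for shrinking $d$, which imitates the open interval $(0, 1)$. The helper `grid_max` and the grid size are illustrative assumptions, and a grid only approximates the true maximum in general.

```python
def inv(x):
    return 1.0 / x

def grid_max(f, a, b, n=1001):
    """Point and value of the largest sample of f on an even grid over [a, b]."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    best_x = max(xs, key=f)
    return best_x, f(best_x)

# Closed interval: the maximum is attained, here at the left endpoint.
closed_x, closed_max = grid_max(inv, 0.1, 1.0)
print(closed_x, closed_max)

# Imitating (0, 1): as the left end d shrinks, the largest value keeps growing.
sups = [grid_max(inv, d, 1.0)[1] for d in (1e-1, 1e-3, 1e-6)]
print(sups)
```

On every closed interval $[d, 1]$ the maximum exists and equals $1/d$. No single bound works for all $d > 0$, which is exactly the failure on $(0, 1)$.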
Can we apply the Intermediate Value Theorem to $f(x) = 1/x$ on $[-1, 1]$?
No. $f(x) = 1/x$ has a discontinuity at $x = 0 \in [-1, 1]$. The theorem requires continuity on the entire interval without exception. Indeed, $f(-1) = -1$ and $f(1) = 1$ have opposite signs, yet $1/x$ has no root.
Uniform continuity
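In ordinary continuity $\delta$ can depend on both $\varepsilon$ and the point $a$: different points require different $\delta$ values. In **uniform continuity** a single $\delta$, depending only on $\varepsilon$, works for all points of the domain $D$ simultaneously:
$$\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x, y \in D: \quad |x - y| < \delta \implies |f(x) - f(y)| < \varepsilon$$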
Non-uniform continuity
f(x) = 1/x on (0, 1)
$f(x) = 1/x$ is continuous on $(0, 1)$, but NOT uniformly so. Near $x = 0$ the function becomes arbitrarily steep. For the same precision $\varepsilon$, the point $x = 0.001$ needs a $\delta$ hundreds of thousands of times smaller than the point $x = 0.5$, and the required $\delta$ shrinks to zero as $x \to 0$. No single $\delta$ exists.
**Cantor's theorem**: continuous on a closed $[a, b]$ means uniformly continuous there. Again, closedness solves everything.
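The dependence of $\delta$ on the point can be computed exactly for $1/x$. For $a > 0$ the worst case of $|1/x - 1/a|$ with $|x - a| < \delta$ occurs as $x \to a - \delta$, where the difference tends to $\frac{\delta}{a(a - \delta)}$. Setting this equal to $\varepsilon$ and solving gives $\delta(a) = \frac{\varepsilon a^2}{1 + \varepsilon a}$. The sketch below tabulates this formula; the function name `largest_delta` and the choice $\varepsilon = 0.1$ are illustrative.

```python
def largest_delta(a, eps):
    """Largest delta such that |x - a| < delta implies |1/x - 1/a| < eps, for a > 0."""
    return eps * a * a / (1.0 + eps * a)

eps = 0.1
for a in (0.5, 0.1, 0.01, 0.001):
    print(f"a = {a:<6} delta = {largest_delta(a, eps):.3e}")

# How much smaller delta must be near 0.001 than near 0.5.
ratio = largest_delta(0.5, eps) / largest_delta(0.001, eps)
print(f"ratio = {ratio:.0f}")
```

Since $\delta(a) \to 0$ as $a \to 0^+$, the infimum over $(0, 1)$ is zero and no positive $\delta$ serves every point.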
Misconception: continuity and uniform continuity are the same thing.
Uniform continuity is a stronger property. Continuous ≠ uniformly continuous.
Uniform continuity requires a single $\delta$ for all points. On non-closed sets this may fail even for ordinary continuous functions. For example $1/x$ on $(0,1)$.
Which function is uniformly continuous on its domain?
$\sin x$ on $\mathbb{R}$. Its derivative is bounded ($|\cos x| \leq 1$), hence it is Lipschitz with constant 1, hence uniformly continuous. $1/x$ on $(0,1)$ and $x^2$ on $\mathbb{R}$ are continuous but not uniformly.
Practice
Determine the type of discontinuity of $f(x) = \frac{|x|}{x}$ at $x = 0$
For $x > 0$: $|x|/x = 1$. For $x < 0$: $|x|/x = -x/x = -1$.
$\lim_{x \to 0^+} f(x) = 1$
$\lim_{x \to 0^-} f(x) = -1$
Both limits are finite but different - **first-kind discontinuity (jump)**.
Prove that the equation $\cos x = x$ has a solution on $[0, 1]$
Let $g(x) = \cos x - x$. Continuous as a difference of continuous functions.
$g(0) = 1 > 0$
$g(1) = \cos 1 - 1 \approx -0.46 < 0$
By the Intermediate Value Theorem $\exists \xi \in (0, 1): g(\xi) = 0$, meaning $\cos \xi = \xi$. Numerically: $\xi \approx 0.739$ - the fixed point of cosine.
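The numerical value can be reproduced in two steps. The sketch below first checks the sign change that the theorem needs, then approximates the root by fixed-point iteration $x_{n+1} = \cos x_n$. The iteration is a different technique from the theorem itself: the theorem proves that a root exists, the iteration only locates it. The starting point and the iteration count are arbitrary choices.

```python
import math

def g(x):
    return math.cos(x) - x

# Hypothesis of the Intermediate Value Theorem: opposite signs at the endpoints.
sign_change = g(0.0) > 0 > g(1.0)

# Fixed-point iteration: repeatedly apply cos, starting from x = 1.
x = 1.0
for _ in range(100):
    x = math.cos(x)

print(sign_change, x, g(x))
```

The iteration converges here because $|\sin x| < 1$ near the fixed point, so each step shrinks the error.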
For what value of $a$ is the function $f(x) = x^2$ for $x \leq 1$ and $f(x) = ax + b$ for $x > 1$ continuous at $x = 1$, if $b = 0$?
From the left piece: $f(1) = 1$. Right-hand limit: $\lim_{x \to 1^+} (ax) = a$. For continuity: $a = 1$. Verification: $f(x) = x^2$ for $x \leq 1$ and $f(x) = x$ for $x > 1$. $\lim_{x \to 1^-} x^2 = 1$, $\lim_{x \to 1^+} x = 1$, $f(1) = 1$ - all three conditions hold.
If $f$ is continuous on $[0, 1]$ with $f(0) = -2$ and $f(1) = 3$, which conclusion follows?
Intermediate Value Theorem: a continuous function on $[a,b]$ takes every value between $f(a)$ and $f(b)$. Since 0 is between -2 and 3, there is some $c \in (0, 1)$ with $f(c) = 0$.
Connection with other topics
Continuity is the foundation for the derivative and everything built on it
- Derivative: differentiability implies continuity, but the converse fails. ReLU is continuous but not differentiable at zero.
- Integral: a function continuous on a closed interval is Riemann integrable there - this is the most important sufficient condition
- Numerical methods: the bisection method (scipy.optimize.bisect) requires continuity - otherwise there's no guarantee of finding a root
Summary
- Continuity at a point: three conditions simultaneously - defined, limit exists, limit = value
- First-kind discontinuities (both limits finite): removable (equal) and jump (different)
- Second-kind discontinuities (at least one one-sided limit is infinite or does not exist): infinite and essential - cannot be removed
- Intermediate Value Theorem: if a continuous function changes sign on $[a, b]$, it has a root there. Foundation of the bisection method
- Weierstrass theorem: on a closed interval a continuous function is bounded and attains max/min
Questions for reflection
- Why does ReLU, despite being non-differentiable at zero, work successfully in neural networks? How does this relate to continuity?
- Why is closedness of the interval so important for the Weierstrass and Cantor theorems?
- How does the Intermediate Value Theorem underlie scipy.optimize.bisect?