Numerical Methods
Numerical Methods in Interviews
Google lost millions when a trading algorithm crashed due to division by zero. The Ariane 5 rocket exploded from an int-to-float overflow. Tesla Autopilot had detection failures from NaN in neural networks. Numerical error is not an academic concern-it is a real production risk that ends careers and destroys hardware.
- **Patriot missile miss (1991):** accumulated fp rounding error in the system clock - 0.34 s drift after 100 hours → 687-meter miss; 28 casualties; root cause: fp truncation instead of rounding
- **Vancouver Stock Exchange (1982):** index recalculated with truncation instead of rounding - after 22 months a 574-point error had accumulated out of 1000; corrected overnight when discovered
- **PyTorch gradient debugging:** torch.autograd.set_detect_anomaly(True) is the standard first tool used at Google Brain and OpenAI to localize NaN in experimental architectures
Предварительные знания
Floating-Point Traps
Floating-point numbers are not real numbers. IEEE 754 defines exact rules for addition, multiplication, and comparison. Ignoring those rules produces classic FAANG interview mistakes: wrong comparisons, broken associativity, and silent precision loss.
**Catastrophic cancellation:** Idea: when a and b are close, a − b loses significant bits. Example: a = 1.000001, b = 1.000000, difference = 0.000001 - In fp32 (7 significant digits): a − b = 1e-6, but the mantissa bits were already used for the leading 1s → only 1 significant bit remains! **Associativity failure:** (a + b) + c ≠ a + (b + c) in fp32! Example: (1e20 + (−1e20)) + 1 = 0 + 1 = 1 1e20 + (−1e20 + 1) = 1e20 + (−1e20) = 0 ← 1 is lost! **Machine epsilon:** fp32: ε ≈ 1.19 × 10⁻⁷ fp64: ε ≈ 2.22 × 10⁻¹⁶ ε is the smallest number such that 1.0 + ε ≠ 1.0
**Classic interview question:** why does `0.1 + 0.2 != 0.3`? Answer: 0.1 and 0.2 are not exactly representable in binary floating-point, and their rounded sum differs from the rounded representation of 0.3. Never compare floats with `==`-always use `abs(a-b) < eps` or `math.isclose(a, b)`.
What are the results of (1e20 + (−1e20)) + 1 and 1e20 + (−1e20 + 1) in fp32?
Condition Number and Numerical Rank
The condition number κ(A) = ||A|| · ||A⁻¹|| measures how much errors in the input are amplified when solving Ax = b. Ill-conditioned matrices (κ >> 1) are the leading cause of precision loss in numerical computations.
**Condition number:** κ(A) = σ_max / σ_min (ratio of extreme singular values) **Interpretation:** if κ(A) ≈ 10ᵏ, one lose roughly k decimal digits of accuracy in the solution. fp64 (15 digits): at κ = 10¹² only 3 significant digits remain! fp32 (7 digits): at κ = 10⁵ all accuracy is gone. **Numerical rank:** The theoretical rank may be n, but in practice a matrix can be rank-deficient due to noise or near-singularity. Numerical rank = number of singular values above threshold tol = ε · σ_max **Regularization (Tikhonov / ridge):** (AᵀA + λI)x = Aᵀb-adding λ to the diagonal improves κ
Matrix A has κ(A) = 10¹⁰. one solve Ax = b in fp64 (machine epsilon ≈ 2.2×10⁻¹⁶). How many significant digits can one trust in the solution?
Debugging NaN/Inf in Production
NaN/Inf in an ML pipeline is one of the nastiest bugs: they propagate silently through every subsequent operation. Diagnosis requires understanding the sources of numerical instability and knowing which tools to reach for first.
**Common NaN sources in neural networks:** 1. **log(0):** log-softmax, cross-entropy, KL divergence when probabilities are zero 2. **0/0:** batch normalization on an empty batch; dividing by a zero norm 3. **sqrt(0):** in layer normalization gradients when inputs are zero 4. **Gradient explosion:** without clipping → Inf → NaN in the optimizer 5. **fp16 overflow:** activations > 65504 → Inf → NaN after subsequent ops **NaN propagation rules (IEEE 754):** - NaN + x = NaN - NaN * 0 = NaN (not 0!) - NaN > x = False ← silent trap in conditionals - NaN == NaN = False ← use np.isnan() or torch.isnan()
Training starts normally. After 1000 steps the loss becomes NaN. What is the first diagnostic step?
Choosing a Numerical Method: an Interview Framework
FAANG interviewers often ask: "Which method would one use for problem X?" The correct answer is not a specific algorithm-it is a structured analysis: problem size, matrix structure, accuracy requirements, and memory or time constraints.
**Method selection framework:** **Problem: Ax = b** - Dense A, n < 1000 → LU (scipy.linalg.solve) - Sparse A → sparse LU (scipy.sparse.linalg.spsolve) - SPD A → Cholesky (scipy.linalg.cholesky) or CG - Huge sparse SPD A → CG with preconditioning - Ill-conditioned → regularization + lstsq **Problem: nonlinear equation f(x) = 0** - Derivative available, good initial guess → Newton - No derivative → Brent (scipy.optimize.brentq) - System of nonlinear equations → scipy.optimize.fsolve **Problem: integration** - Smooth, low dimension → Gauss-Legendre - High dimension (d > 10) → Monte Carlo - Adaptive → scipy.integrate.quad **Problem: ODE** - Non-stiff → RK45 - Stiff → Radau / BDF - Unknown stiffness → LSODA
| Problem | Condition | Method | Cost |
|---|---|---|---|
| Ax=b dense | General A | LU (scipy.linalg.solve) | O(n³) |
| Ax=b dense | A = Aᵀ > 0 | Cholesky | O(n³/6) |
| Ax=b sparse | Any | SparseLU / CG | O(nnz) |
| f(x)=0 scalar | Derivative known | Newton | O(k log k) |
| f(x)=0 scalar | No derivative | Brent | O(log(1/ε)) |
| ODE | Non-stiff | RK45 | O(n · N_steps) |
| ODE | Stiff | Radau/BDF | O(n³ · N_steps) |
| Integral | d > 10 | Monte Carlo | O(1/√N) |
one need to solve a system of 10⁶ linear equations. The coefficient matrix is symmetric positive definite and sparse (nnz << n²). Which method?
Key Ideas
- **Catastrophic cancellation:** a − b when a ≈ b loses significant bits; fix with stable formulas - log-sum-exp, numerically stable quadratic root
- **κ(A) = σ_max/σ_min:** one lose log₁₀(κ) decimal digits; at κ > 10¹⁵ in fp64 the solution of Ax = b is meaningless
- **NaN debugging:** detect_anomaly + gradient norm monitoring; root causes - log(0), 0/0, fp16 overflow, gradient explosion
- **Method selection:** SPD + sparse → CG; stiff ODE → Radau/BDF; high dimension → Monte Carlo; no derivative → Brent
Related Topics
Interview problems draw on the full course:
- Iterative Methods — CG and GMRES are the correct answers to large sparse system questions-the theory was built in nm-07
- Numerical Methods in Machine Learning — NaN debugging, fp16 pitfalls, and autodiff are direct applications of the previous lesson
Вопросы для размышления
- Kahan summation reduces accumulation error from O(nε) to O(ε) regardless of n - how does it work, and why can compiler optimizations silently break it?
- Why is linalg.lstsq safer than linalg.solve(AᵀA, Aᵀb) for least-squares problems - what happens to the condition number when one form AᵀA?
- Under what circumstances does Monte Carlo beat Gaussian quadrature even for a one-dimensional integral - and at what N does it start losing to adaptive quadrature?