Statistics
Kalman Filter, ARCH/GARCH, and Cointegration
Financial markets shift in level, and their volatility shifts too. Engle's ARCH model (1982) and Bollerslev's GARCH (1986) capture this: volatility is itself time-varying, with calm and turbulent periods clustering together. A rocket cannot observe its own position directly; it must estimate it from noisy sensor readings. Advanced time series models describe exactly this dynamic, uncertain reality.
- GPS and aviation: the Kalman filter fuses noisy accelerometer and GPS signals to estimate position
- Options trading: GARCH forecasts volatility for derivative pricing
- Pairs trading (stat arb): cointegration identifies a stationary spread for mean-reversion strategies
Prerequisites
State Space Models and the Kalman Filter
A **State Space Model (SSM)** separates a hidden state xₜ from an observation yₜ: xₜ = Aₜxₜ₋₁ + Bₜuₜ + wₜ (state equation), yₜ = Cₜxₜ + Dₜuₜ + vₜ (observation equation), with wₜ ~ N(0,Q) and vₜ ~ N(0,R). The **Kalman filter** is the optimal Bayesian (minimum mean-squared-error) estimator of xₜ when the model is linear and the noise is Gaussian. The recursions below are written for time-invariant A and C with no control input (uₜ = 0). Prediction: x̂ₜ|ₜ₋₁ = Ax̂ₜ₋₁, Pₜ|ₜ₋₁ = APₜ₋₁Aᵀ + Q. Update: Kₜ = Pₜ|ₜ₋₁Cᵀ(CPₜ|ₜ₋₁Cᵀ+R)⁻¹, x̂ₜ = x̂ₜ|ₜ₋₁ + Kₜ(yₜ − Cx̂ₜ|ₜ₋₁), Pₜ = (I − KₜC)Pₜ|ₜ₋₁.
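The prediction and update equations above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production filter: it assumes time-invariant A and C, no control input, and adds the standard covariance update Pₜ = (I − KₜC)Pₜ|ₜ₋₁. The constant-velocity tracking setup, the function name `kalman_step`, and all numeric values are hypothetical choices made for the example.

```python
import numpy as np

def kalman_step(x_prev, P_prev, y, A, C, Q, R):
    """One predict/update cycle of the linear Kalman filter (no control input)."""
    # Prediction: propagate the state estimate and its covariance
    x_pred = A @ x_prev
    P_pred = A @ P_prev @ A.T + Q
    # Update: blend the prediction with the new observation
    S = C @ P_pred @ C.T + R                  # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x_prev)) - K @ C) @ P_pred
    return x_new, P_new, K

# Hypothetical setup: state = (position, velocity), only position is observed
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # constant-velocity dynamics, dt = 1
C = np.array([[1.0, 0.0]])               # observe position only
Q = 0.01 * np.eye(2)                     # small process noise
R = np.array([[4.0]])                    # noisy sensor (std = 2)

x_true = np.array([0.0, 1.0])
x_hat, P = np.zeros(2), 10.0 * np.eye(2)  # vague initial belief
err_filter, err_raw = [], []
for _ in range(200):
    x_true = A @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = C @ x_true + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)
    x_hat, P, K = kalman_step(x_hat, P, y, A, C, Q, R)
    err_filter.append(x_hat[0] - x_true[0])
    err_raw.append(y[0] - x_true[0])

rmse_filter = float(np.sqrt(np.mean(np.square(err_filter))))
rmse_raw = float(np.sqrt(np.mean(np.square(err_raw))))
print(f"position RMSE: filtered={rmse_filter:.2f}, raw sensor={rmse_raw:.2f}")
```

In this setup the filtered position error should come out smaller than the raw sensor error, because the filter pools information across time through the dynamics model.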
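**Kalman filter extensions:** the Extended Kalman Filter (EKF) linearizes nonlinear systems via the Jacobian; the Unscented KF (UKF) propagates a set of sigma-points through the nonlinearity for a better approximation; particle filters use Monte Carlo sampling for arbitrary nonlinear/non-Gaussian systems. SSMs underlie structural time series models (the trend-plus-seasonality decomposition idea behind Prophet; the unobserved-components models in statsmodels) and are closely related to hidden Markov models (HMMs), which use a discrete hidden state.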
In the Kalman filter, K = P·Cᵀ(C·P·Cᵀ + R)⁻¹. What happens when R → 0 (very precise measurements)?
ARCH/GARCH: Volatility Models
**ARCH(q)** (Autoregressive Conditional Heteroskedasticity): σ²ₜ = α₀ + ∑ᵢ αᵢ ε²ₜ₋ᵢ, so the conditional variance depends on past squared residuals. **GARCH(p,q):** σ²ₜ = α₀ + ∑αᵢε²ₜ₋ᵢ + ∑βⱼσ²ₜ₋ⱼ, which adds volatility persistence through lagged variances. GARCH(1,1) is ubiquitous in finance: σ²ₜ = ω + αε²ₜ₋₁ + βσ²ₜ₋₁, with ω > 0, α, β ≥ 0, and α+β < 1 for covariance stationarity. Unconditional variance: σ² = ω/(1−α−β).
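A short simulation can make the GARCH(1,1) recursion and its unconditional variance concrete. The sketch below assumes Gaussian innovations and uses illustrative parameter values (ω = 0.05, α = 0.10, β = 0.80) that are not taken from the text; the function name `simulate_garch11` is likewise hypothetical.

```python
import numpy as np

def simulate_garch11(omega, alpha, beta, n, rng):
    """Simulate eps_t = sigma_t * z_t with sigma2_t = omega + alpha*eps2_{t-1} + beta*sigma2_{t-1}."""
    sigma2 = np.empty(n)
    eps = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps, sigma2

omega, alpha, beta = 0.05, 0.10, 0.80          # illustrative values, alpha + beta < 1
uncond_var = omega / (1.0 - alpha - beta)      # theoretical unconditional variance

eps, sigma2 = simulate_garch11(omega, alpha, beta, n=100_000,
                               rng=np.random.default_rng(42))
sample_var = float(eps.var())

# Volatility clustering shows up as positive autocorrelation in squared residuals
sq = eps ** 2
acf1_sq = float(np.corrcoef(sq[:-1], sq[1:])[0, 1])

print(f"theoretical variance: {uncond_var:.3f}, sample variance: {sample_var:.3f}")
print(f"lag-1 autocorrelation of eps^2: {acf1_sq:.3f}")
```

Over a long sample the empirical variance should settle near ω/(1−α−β), while the squared residuals stay positively autocorrelated even though the residuals themselves are uncorrelated.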
**GARCH extensions:** EGARCH captures the asymmetry (leverage effect: bad news increases volatility more than good news). GJR-GARCH: σ²ₜ = ω + (α+γ·Iₜ₋₁)ε²ₜ₋₁ + βσ²ₜ₋₁, Iₜ₋₁=1 when εₜ₋₁<0. GARCH-M: risk premium - expected return depends on current volatility. DCC-GARCH (Dynamic Conditional Correlation): multivariate volatility for portfolios.
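The asymmetry in GJR-GARCH is easy to see with a one-step update. The snippet below is a minimal sketch of the formula above; the function name and the parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def gjr_garch_next_variance(eps_prev, sigma2_prev, omega, alpha, gamma, beta):
    """One-step GJR-GARCH(1,1) update; the gamma term switches on only after a negative shock."""
    indicator = 1.0 if eps_prev < 0 else 0.0
    return omega + (alpha + gamma * indicator) * eps_prev ** 2 + beta * sigma2_prev

# Illustrative parameters (not estimated from data)
params = dict(omega=0.05, alpha=0.05, gamma=0.10, beta=0.85)

# Same shock size, opposite sign, same starting variance
after_bad_news = gjr_garch_next_variance(-2.0, 1.0, **params)
after_good_news = gjr_garch_next_variance(+2.0, 1.0, **params)

print(f"variance after negative shock: {after_bad_news:.2f}")
print(f"variance after positive shock: {after_good_news:.2f}")
```

With γ > 0, the negative shock contributes (α+γ)ε² instead of αε², which is the leverage effect expressed in a single term.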
GARCH(1,1) with α=0.12, β=0.85. The unconditional volatility is σ = √(ω/(1−α−β)) = 1.5%. After a sharp move today, the conditional variance is σ²ₜ = 25 (in %², so σₜ = 5%). What is the expected variance tomorrow, σ²ₜ₊₁?
Unit Roots and Cointegration
The series yₜ = φyₜ₋₁ + εₜ has a **unit root** when φ=1: it is a random walk and therefore nonstationary. The **Dickey-Fuller test** has H₀: φ=1 against H₁: φ<1. The test statistic τ = (φ̂−1)/SE(φ̂) follows a non-standard distribution under H₀, so ordinary t-tables do not apply. Integration order I(d): the series must be differenced d times to achieve stationarity. **Cointegration:** nonstationary series Xₜ~I(1) and Yₜ~I(1) are cointegrated if ∃β: Yₜ − βXₜ ~ I(0). This is a long-run equilibrium relationship. The Johansen test determines the number of cointegrating vectors in a multivariate system.
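The τ statistic and the idea of a stationary spread can be illustrated on simulated data. This is a minimal sketch under simplifying assumptions: the regression has no constant and no lagged differences (so it is the plain Dickey-Fuller form, not the augmented one), the cointegrating coefficient is estimated by OLS without an intercept, and the data-generating process with β = 2 is invented for the example. Critical values are deliberately not hard-coded, since residual-based tests need their own tables.

```python
import numpy as np

def dickey_fuller_tau(y):
    """tau = (phi_hat - 1) / SE(phi_hat) from the regression y_t = phi * y_{t-1} + e_t."""
    y_lag, y_cur = y[:-1], y[1:]
    phi_hat = (y_lag @ y_cur) / (y_lag @ y_lag)
    resid = y_cur - phi_hat * y_lag
    s2 = (resid @ resid) / (len(y_cur) - 1)      # residual variance
    se = np.sqrt(s2 / (y_lag @ y_lag))           # standard error of phi_hat
    return float((phi_hat - 1.0) / se)

rng = np.random.default_rng(1)
n = 1000
x = np.cumsum(rng.standard_normal(n))            # I(1): a random walk
y = 2.0 * x + rng.standard_normal(n)             # cointegrated with x (true beta = 2)

# Step 1 (Engle-Granger style): estimate beta by regressing y on x
beta_hat = float((x @ y) / (x @ x))
spread = y - beta_hat * x                        # candidate stationary spread

# Step 2: unit-root statistic on the level series and on the spread
tau_x = dickey_fuller_tau(x)
tau_spread = dickey_fuller_tau(spread)

print(f"beta_hat = {beta_hat:.3f}")
print(f"tau for the random walk x: {tau_x:.2f}")
print(f"tau for the spread y - beta_hat*x: {tau_spread:.2f}")
```

The spread should produce a strongly negative τ (evidence against a unit root), while the random walk itself should not. In practice a library routine with proper critical values and lag selection would be used instead of this hand-rolled statistic.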
**Spurious regression:** regressing two independent random walks produces a high R² and significant t-statistics - this is an artifact of nonstationarity. The rule: if both series are I(1) and cointegration is not confirmed, do not interpret regression coefficients. First-differencing removes the unit root but destroys long-run relationships. The correct approach is an Error Correction Model (ECM) after cointegration has been confirmed.
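A small Monte Carlo experiment shows how often spurious significance appears. The sketch below regresses one independent random walk on another many times and counts how often the slope looks "significant" at the nominal 5% level, then repeats the exercise with stationary white noise for comparison. Sample size, number of replications, and helper names are arbitrary choices made for the illustration.

```python
import numpy as np

def ols_slope_t_and_r2(x, y):
    """OLS of y on a constant and x; returns the slope t-statistic and R^2."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = (resid @ resid) / (len(y) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    t_slope = coef[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return float(t_slope), float(r2)

rng = np.random.default_rng(7)
n, n_sims = 200, 500

def rejection_rate(make_series):
    """Share of simulations in which |t| > 1.96 for two independent series."""
    hits = 0
    for _ in range(n_sims):
        x, y = make_series(), make_series()
        t_slope, _ = ols_slope_t_and_r2(x, y)
        hits += int(abs(t_slope) > 1.96)
    return hits / n_sims

# Independent random walks (I(1)) versus independent white noise (I(0))
rate_walks = rejection_rate(lambda: np.cumsum(rng.standard_normal(n)))
rate_noise = rejection_rate(lambda: rng.standard_normal(n))

print(f"'significant' slopes, independent random walks: {rate_walks:.0%}")
print(f"'significant' slopes, independent white noise:  {rate_noise:.0%}")
```

For white noise the rejection rate should sit near the nominal 5%, whereas for independent random walks it is far higher, even though no relationship exists by construction.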
Regressing Yₜ on Xₜ gave R²=0.87, p<0.001. Both series are I(1); the Engle-Granger test on residuals gave τ=−1.5 (critical value −3.4). What is the correct conclusion?
Key ideas
- SSM: xₜ=Axₜ₋₁+wₜ, yₜ=Cxₜ+vₜ; Kalman filter is the optimal linear estimator of xₜ
- GARCH(1,1): σ²ₜ=ω+αε²ₜ₋₁+βσ²ₜ₋₁; unconditional σ²=ω/(1−α−β)
- ADF (augmented Dickey-Fuller) test: H₀: unit root; τ < critical value -> reject H₀, the series is I(0)
- Cointegration: Yₜ−βXₜ~I(0) when Xₜ,Yₜ~I(1) -> long-run equilibrium
- Spurious regression: nonstationary series without cointegration give false R² and p-values
Time series and the course
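State Space Models generalize many time series models: ARIMA, structural models, and dynamic linear models (DLMs) can all be written in state-space form. GARCH formalizes conditional heteroskedasticity and helps explain the heavy tails of returns. Cointegration extends the theory of integrated series to multivariate systems.
- Variational Bayesian Inference: Bayesian SSMs with nonlinear transitions can use variational inference to approximate the posterior over states and parameters
- Graphical Models: SSMs and HMMs share the same directed graphical structure (a Markov chain of hidden states, each emitting an observation); HMMs have discrete states, linear-Gaussian SSMs continuous ones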
Questions for reflection
- In the Kalman filter, the gain Kₜ is called the 'trust weight for the observation'. Under what conditions (ratio of Q to R) does the filter trust observations more, and when does it trust the model more? How does this relate to ridge regression?
- Why does volatility clustering (large price changes cluster together) violate the i.i.d. residual assumption in ordinary time series models? What are the consequences for Value at Risk?
- Cointegration enables trading the 'spread' between two nonstationary assets. How would a pairs trading strategy change if the cointegration parameter β is time-varying?