Statistics

Kalman Filter, ARCH/GARCH, and Cointegration

Financial markets shift in level, and their volatility shifts too. Engle's ARCH model (1982) and Bollerslev's GARCH (1986) capture this: conditional volatility changes over time and arrives in clusters. A rocket cannot observe its own position directly; it has to estimate it from noisy sensor readings. Advanced time series models describe exactly this dynamic, uncertain reality.

  • GPS and aviation: the Kalman filter fuses noisy accelerometer and GPS signals to estimate position
  • Options trading: GARCH forecasts volatility for derivative pricing
  • Pairs trading (stat arb): cointegration identifies a stationary spread for mean-reversion strategies

Prerequisites

  • Causal Inference

State Space Models and the Kalman Filter

A **State Space Model (SSM)** separates a hidden state xₜ from an observation yₜ: xₜ = Aₜxₜ₋₁ + Bₜuₜ + wₜ (state equation), yₜ = Cₜxₜ + Dₜuₜ + vₜ (observation equation), with wₜ ~ N(0,Q) and vₜ ~ N(0,R). Under linearity and Gaussian noise, the **Kalman filter** is the optimal Bayesian estimator of xₜ: it computes the exact posterior mean and covariance. For time-invariant A and C and no control input, the recursion is as follows. Prediction: x̂ₜ|ₜ₋₁ = Ax̂ₜ₋₁, Pₜ|ₜ₋₁ = APₜ₋₁Aᵀ + Q. Update: Kₜ = Pₜ|ₜ₋₁Cᵀ(CPₜ|ₜ₋₁Cᵀ+R)⁻¹, x̂ₜ = x̂ₜ|ₜ₋₁ + Kₜ(yₜ − Cx̂ₜ|ₜ₋₁), Pₜ = (I − KₜC)Pₜ|ₜ₋₁.
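The prediction and update steps can be sketched for the scalar case. This is a minimal illustration, not a production filter: the function name `kalman_1d`, its default parameters, and the simulated data are assumptions made for the example, and the covariance update P = (1 − KC)·P_pred is the standard companion to the gain formula.

```python
import random


def kalman_1d(ys, a=1.0, c=1.0, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x_t = a*x_{t-1} + w_t, y_t = c*x_t + v_t.

    q and r are the variances of w_t and v_t; x0 and p0 are the prior
    mean and variance of the state. All names are illustrative.
    """
    x_hat, p = x0, p0
    estimates, gains = [], []
    for y in ys:
        # Prediction: propagate the state estimate and its variance.
        x_pred = a * x_hat
        p_pred = a * p * a + q
        # Update: the gain weighs the observation against the prediction.
        k = p_pred * c / (c * p_pred * c + r)
        x_hat = x_pred + k * (y - c * x_pred)
        p = (1.0 - k * c) * p_pred
        estimates.append(x_hat)
        gains.append(k)
    return estimates, gains


# Assumed toy data: a constant hidden level observed with unit-variance noise.
rng = random.Random(0)
true_level = 5.0
ys = [true_level + rng.gauss(0.0, 1.0) for _ in range(200)]

# With q = 0 the state is constant, so the filter acts as a recursive average.
estimates, gains = kalman_1d(ys, q=0.0, r=1.0, x0=0.0, p0=100.0)
print(estimates[-1], gains[-1])
```

With q = 0 the gain shrinks roughly like 1/t, so each new observation moves the estimate less. A positive q keeps the gain away from zero, so the filter keeps adapting.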

**Kalman filter extensions:** the Extended Kalman Filter (EKF) linearizes nonlinear systems via the Jacobian; the Unscented KF (UKF) propagates sigma-points through the nonlinearity for a better approximation; particle filters use Monte Carlo sampling for arbitrary nonlinear/non-Gaussian systems. SSMs underlie structural time series models (for example, `UnobservedComponents` in statsmodels); hidden Markov models (HMMs) are their discrete-state counterpart.

In the Kalman filter, K = P·Cᵀ(C·P·Cᵀ + R)⁻¹. What happens when R → 0 (very precise measurements)?

ARCH/GARCH: Volatility Models

**ARCH(q)** (Autoregressive Conditional Heteroskedasticity): σ²ₜ = α₀ + ∑ᵢ αᵢ ε²ₜ₋ᵢ, so the conditional variance depends on the past q squared residuals. **GARCH(p,q):** σ²ₜ = α₀ + ∑ᵢ αᵢε²ₜ₋ᵢ + ∑ⱼ βⱼσ²ₜ₋ⱼ, which adds volatility persistence through p lagged variances. GARCH(1,1) is ubiquitous in finance: σ²ₜ = ω + αε²ₜ₋₁ + βσ²ₜ₋₁ (ω plays the role of α₀), where ω > 0, α, β ≥ 0, and α+β < 1 for covariance stationarity. Unconditional variance: σ² = ω/(1−α−β).
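The GARCH(1,1) recursion and its mean reversion toward ω/(1−α−β) can be sketched as below. The function names and parameter values are illustrative assumptions, not estimates from data. The multi-step forecast uses the standard identity E[σ²ₜ₊ₕ] = σ² + (α+β)ʰ⁻¹(σ²ₜ₊₁ − σ²), which follows from taking expectations of the recursion.

```python
def garch11_next_variance(omega, alpha, beta, eps_prev, var_prev):
    """One-step conditional variance: omega + alpha*eps^2 + beta*sigma^2."""
    return omega + alpha * eps_prev ** 2 + beta * var_prev


def garch11_forecast(omega, alpha, beta, var_next, horizon):
    """Expected variances for h = 1..horizon, given sigma^2_{t+1} = var_next."""
    long_run = omega / (1.0 - alpha - beta)
    persistence = alpha + beta
    return [
        long_run + persistence ** (h - 1) * (var_next - long_run)
        for h in range(1, horizon + 1)
    ]


# Illustrative parameters (assumed): long-run variance = 0.1 / 0.1 = 1.0.
omega, alpha, beta = 0.1, 0.1, 0.8
long_run_var = omega / (1.0 - alpha - beta)

# A shock of size 3 arrives when the conditional variance is 2.
var_next = garch11_next_variance(omega, alpha, beta, eps_prev=3.0, var_prev=2.0)
path = garch11_forecast(omega, alpha, beta, var_next, horizon=50)

print(var_next)           # variance one step after the shock
print(path[1], path[-1])  # forecasts decay geometrically toward long_run_var
```

The closer α+β is to 1, the more slowly the forecast decays back to the long-run level.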

**GARCH extensions:** EGARCH models the logarithm of the variance and captures asymmetry (the leverage effect: bad news increases volatility more than good news). GJR-GARCH: σ²ₜ = ω + (α+γ·Iₜ₋₁)ε²ₜ₋₁ + βσ²ₜ₋₁, where Iₜ₋₁=1 when εₜ₋₁<0 and 0 otherwise, so γ > 0 gives negative shocks extra weight. GARCH-M adds a risk premium: the expected return depends on the current conditional volatility. DCC-GARCH (Dynamic Conditional Correlation) models multivariate volatility for portfolios.
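The asymmetry in GJR-GARCH is easiest to see by feeding the update a negative and a positive shock of equal size. A small sketch follows; the function name and parameter values are assumptions chosen for illustration.

```python
def gjr_next_variance(omega, alpha, gamma, beta, eps_prev, var_prev):
    """GJR-GARCH(1,1) update: negative shocks get the extra weight gamma."""
    indicator = 1.0 if eps_prev < 0 else 0.0
    return omega + (alpha + gamma * indicator) * eps_prev ** 2 + beta * var_prev


# Illustrative parameters (assumed, not estimated from data).
params = dict(omega=0.05, alpha=0.05, gamma=0.10, beta=0.85)

after_bad_news = gjr_next_variance(eps_prev=-2.0, var_prev=1.0, **params)
after_good_news = gjr_next_variance(eps_prev=2.0, var_prev=1.0, **params)

# The gap between the two equals gamma * eps^2.
print(after_bad_news, after_good_news)
```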

GARCH(1,1): α=0.12, β=0.85. Unconditional volatility σ = √(ω/(1−α−β)) = 1.5%. Today there was a sharp move: σ²ₜ = 25 (σₜ = 5%, with variance measured in %²). What is the expected variance tomorrow, σ²ₜ₊₁?

Unit Roots and Cointegration

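The AR(1) series yₜ = φyₜ₋₁ + εₜ has a **unit root** when φ=1 (a random walk) and is then nonstationary. The **Dickey-Fuller test** has H₀: φ=1 against H₁: |φ|<1. Under H₀ the test statistic τ = (φ̂−1)/SE(φ̂) has a non-standard distribution rather than a Student t, so it is compared with Dickey-Fuller critical values. Integration order I(d): the series must be differenced d times to achieve stationarity. **Cointegration:** nonstationary series Xₜ~I(1) and Yₜ~I(1) are cointegrated if ∃β: Yₜ − βXₜ ~ I(0). This is a long-run equilibrium relationship. The Engle-Granger procedure checks for it by testing the residuals of the regression of Yₜ on Xₜ for a unit root; in multivariate systems, the Johansen test determines the number of cointegrating vectors.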
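The Dickey-Fuller statistic can be computed by hand for the simplest case: no constant, no trend, no lagged differences. The sketch below simulates a random walk and a stationary AR(1) and compares their τ values. The helper names, sample size, and seed are assumptions. In practice a library routine such as `adfuller` in statsmodels would be used, since it also handles lag selection and critical values.

```python
import math
import random


def simulate_ar1(phi, n, rng):
    """Simulate y_t = phi*y_{t-1} + eps_t with standard normal noise."""
    y, series = 0.0, []
    for _ in range(n):
        y = phi * y + rng.gauss(0.0, 1.0)
        series.append(y)
    return series


def dickey_fuller_tau(y):
    """tau = (phi_hat - 1) / SE(phi_hat) from OLS of y_t on y_{t-1}."""
    lagged, current = y[:-1], y[1:]
    sxx = sum(x * x for x in lagged)
    phi_hat = sum(x * z for x, z in zip(lagged, current)) / sxx
    residuals = [z - phi_hat * x for x, z in zip(lagged, current)]
    s2 = sum(e * e for e in residuals) / (len(residuals) - 1)
    return (phi_hat - 1.0) / math.sqrt(s2 / sxx)


rng = random.Random(42)
tau_random_walk = dickey_fuller_tau(simulate_ar1(1.0, 500, rng))
tau_stationary = dickey_fuller_tau(simulate_ar1(0.5, 500, rng))

# Compare with the Dickey-Fuller critical value, roughly -1.95 at the 5%
# level for this no-constant case, not with the usual -1.96 from a t-table.
print(tau_random_walk, tau_stationary)
```

The stationary series should give a strongly negative τ, while the random walk typically stays above the critical value, so H₀ is not rejected.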

**Spurious regression:** regressing two independent random walks on each other often produces a high R² and significant t-statistics; this is an artifact of nonstationarity, not evidence of a relationship. The rule: if both series are I(1) and cointegration is not confirmed, do not interpret the regression coefficients. First-differencing removes the unit root but discards the information about long-run relationships. When cointegration has been confirmed, the correct approach is an Error Correction Model (ECM), which combines short-run differences with the long-run equilibrium error.
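A small Monte Carlo experiment makes the spurious-regression effect concrete. The sketch below regresses pairs of independent random walks on each other and counts how often the slope looks "significant" at the nominal 5% level, first in levels and then in first differences. The function names, sample size, number of simulations, and seed are assumptions made for the illustration.

```python
import math
import random


def random_walk(n, rng):
    """Cumulative sum of n standard normal increments."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on x with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope / math.sqrt(rss / (n - 2) / sxx)


def diff(series):
    """First differences of a series."""
    return [b - a for a, b in zip(series, series[1:])]


def rejection_rates(n_obs=100, n_sims=300, seed=1):
    """Share of simulations with |t| > 1.96, in levels and in differences."""
    rng = random.Random(seed)
    levels = diffs = 0
    for _ in range(n_sims):
        x, y = random_walk(n_obs, rng), random_walk(n_obs, rng)
        levels += abs(slope_t_stat(x, y)) > 1.96
        diffs += abs(slope_t_stat(diff(x), diff(y))) > 1.96
    return levels / n_sims, diffs / n_sims


rate_levels, rate_diffs = rejection_rates()
print(rate_levels, rate_diffs)
```

The series are independent by construction, so a valid test should reject in about 5% of runs. The levels regression rejects far more often, while the differenced regression stays near the nominal rate.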

Regressing Yₜ on Xₜ gave R²=0.87, p<0.001. Both series are I(1); the Engle-Granger test on residuals gave τ=−1.5 (critical value −3.4). What is the correct conclusion?

Key ideas

  • SSM: xₜ=Axₜ₋₁+wₜ, yₜ=Cxₜ+vₜ; Kalman filter is the optimal linear estimator of xₜ
  • GARCH(1,1): σ²ₜ=ω+αε²ₜ₋₁+βσ²ₜ₋₁; unconditional σ²=ω/(1−α−β)
  • ADF test: H₀: unit root; τ below the critical value → reject H₀ (evidence that the series is I(0))
  • Cointegration: Yₜ−βXₜ~I(0) when Xₜ,Yₜ~I(1) → long-run equilibrium
  • Spurious regression: nonstationary series without cointegration give false R² and p-values

Time series and the course

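State Space Models generalize many time series models: ARIMA, structural models, and dynamic linear models (DLMs) can all be written in state space form. GARCH models conditional heteroskedasticity and produces heavy-tailed unconditional return distributions. Cointegration extends the theory of integrated processes to multivariate systems.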

  • Variational Bayesian Inference: Bayesian SSMs with nonlinear transitions often rely on variational inference to estimate parameters
  • Graphical Models: SSMs and HMMs share the same directed graphical structure (a latent Markov chain with emissions); the HMM is the discrete-state case

Questions for reflection

  • In the Kalman filter, the gain Kₜ is called the 'trust weight for the observation'. Under what conditions (ratio of Q to R) does the filter trust observations more, and when does it trust the model more? How does this relate to ridge regression?
  • Why does volatility clustering (large price changes cluster together) violate the i.i.d. residual assumption in ordinary time series models? What are the consequences for Value at Risk?
  • Cointegration enables trading the 'spread' between two nonstationary assets. How would a pairs trading strategy change if the cointegration parameter β is time-varying?

Related lessons

  • prob-17
Kalman Filter, ARCH/GARCH, and Cointegration
