Stochastic Processes

Rough Paths Theory

Lesson objectives

Explain why the Stieltjes integral fails for paths with infinite 1-variation
Construct a level-2 rough path and verify Chen's identity
State Lyons' continuity theorem and its consequences
Connect rough paths theory to Neural CDE and ML on time series

Prerequisites

SPDEs
Stochastic integrals
Brownian motion

A medical device measures heart rate every 30 seconds, but sometimes every 5 minutes. LSTM requires a regular grid. Neural CDE on rough paths does not. Accuracy difference: 3.5%. Lyons' 1998 mathematics inside a 2024 medical device.

DeepMind: Neural CDE for irregular medical time series (PhysioNet)
Hairer 2014: Fields Medal for rough paths applied to SPDEs (KPZ equation)
torchcde: Neural CDE library on PyTorch
Finance: irregular tick data via rough paths without resampling

Lyons, Hairer, and determinizing stochastics

Terry Lyons developed rough paths theory in the 1990s; his 1998 paper in Revista Matematica Iberoamericana is foundational. Peter Friz and Nicolas Victoir systematized the theory in their 2010 monograph. In 2013-14 Martin Hairer applied rough-path ideas to SPDEs and extended them into regularity structures, solving the KPZ equation, a nonlinear SPDE driven by white noise that is too rough for the classical solution theory. He received the Fields Medal in 2014. Patrick Kidger brought the ideas to ML in 2020-21.

p-Variation and the Integration Problem

1998. Terry Lyons publishes in Revista Matematica Iberoamericana. The problem: define an integral along a path with infinite 1-variation - for example, Brownian motion. Ito solved it probabilistically. Lyons solved it deterministically, by adding one number. That number changed everything.

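The Levy area A_{s,t} = (1/2) \int_s^t [ (W^1_r - W^1_s) dW^2_r - (W^2_r - W^2_s) dW^1_r ] is the antisymmetric part of the second-level iterated integral of a Brownian path and records its non-commutative structure: swapping the two components flips its sign. The Ito and Stratonovich lifts share the same Levy area; they differ only in the symmetric part of the second level, by (1/2)(t - s) times the identity matrix.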
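
As a small illustration of what a Levy area records beyond the increments, the sketch below computes the signed area 0.5 * sum[(x_k - x_0) dy_k - (y_k - y_0) dx_k] for a piecewise-linear planar path. The function name levy_area is hypothetical, and the test path is a deterministic circle rather than Brownian motion, chosen because the answer is known: a closed loop has zero total increment but area close to pi, and traversing it backwards flips the sign.

```python
import numpy as np

def levy_area(path):
    """Signed Levy area of a piecewise-linear path in the plane.

    path: array of shape (n_points, 2).
    The left-point sum below is exact for piecewise-linear paths,
    because a straight segment encloses no area of its own.
    """
    path = np.asarray(path, dtype=float)
    rel = path[:-1] - path[0]      # position relative to the start point
    inc = np.diff(path, axis=0)    # segment increments
    return 0.5 * np.sum(rel[:, 0] * inc[:, 1] - rel[:, 1] * inc[:, 0])

theta = np.linspace(0.0, 2.0 * np.pi, 2001)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)

increment = circle[-1] - circle[0]   # total increment of the closed loop
area_ccw = levy_area(circle)         # counterclockwise traversal
area_cw = levy_area(circle[::-1])    # same loop, traversed backwards

print(increment, area_ccw, area_cw)
```

Two paths with identical endpoints can therefore carry different second-level information, which is the extra datum a level-2 rough path stores.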

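Brownian motion has finite p-variation for every p > 2 and infinite p-variation for p <= 2. This matters because the regularity fixes how many levels of iterated integrals a rough path must carry. For 2 <= p < 3 the increments plus the second-level iterated integrals (a level-2 rough path) suffice, which covers Brownian motion. For 3 <= p < 4 a third level is needed. In general: N levels for N <= p < N+1.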

Why is a Riemann sum insufficient for integrating along a Brownian path?

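||W||_{1-var} = \infty almost surely for Brownian motion, so the Stieltjes integral breaks down: Riemann-Stieltjes sums have no pathwise limit, and the result depends on where the integrand is evaluated within each interval. Defining \int f(W) dW requires either a stochastic (Ito/Stratonovich) or a rough-path approach.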
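
The blow-up of the 1-variation can be seen numerically. The sketch below is a minimal simulation, assuming a Brownian path sampled on 2^16 points of [0, 1] with a fixed seed; the helper variation_sums is a hypothetical name. It evaluates the sum of absolute increments and the sum of squared increments on nested dyadic sub-grids.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fine, T = 2 ** 16, 1.0
dW = rng.normal(0.0, np.sqrt(T / n_fine), size=n_fine)
W = np.concatenate([[0.0], np.cumsum(dW)])   # sampled Brownian path, W_0 = 0

def variation_sums(path, step):
    """Sum of |increments| and of squared increments on every `step`-th point."""
    inc = np.diff(path[::step])
    return np.abs(inc).sum(), (inc ** 2).sum()

results = {}
for level in (8, 10, 12, 14, 16):
    step = n_fine // 2 ** level              # sub-grid with 2**level intervals
    results[level] = variation_sums(W, step)
    print(level, results[level])
```

The first column grows roughly like the square root of the number of intervals, while the second stays near T. Note that the second column is the quadratic variation along one sequence of partitions, not the 2-variation norm, which takes a supremum over all partitions and is infinite for Brownian motion.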

Rough Paths and Lyons' Continuity Theorem

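The main result of the theory: the Ito-Lyons map (rough path -> SDE solution) is continuous in the rough-path (p-variation) topology. In classical stochastic analysis this fails: the map from the driving Brownian path to the solution is only measurable, not continuous in the uniform topology. Lyons showed that once the path is enriched with its iterated integrals (for Brownian motion, the Levy area), the solution depends continuously on this enriched input.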

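Lyons' continuity theorem has a useful consequence: if smooth (for example piecewise-linear) approximations of Brownian motion converge as rough paths, the solutions of the corresponding ODEs converge to the solution of the SDE in the Stratonovich sense. This is the Wong-Zakai theorem, and it justifies numerical schemes that replace the noise by smooth approximations.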
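
A one-dimensional toy case makes the Wong-Zakai statement concrete. The sketch below assumes the linear equation dY = Y dW with Y_0 = 1, whose Stratonovich solution is exp(W_T) and whose Ito solution is exp(W_T - T/2). It drives the ODE dy = y dx with the piecewise-linear interpolation of a simulated path and, for contrast, runs Euler-Maruyama for the Ito equation on the same increments. The grid size, seed, and variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, T = 2 ** 12, 1.0
dW = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)
W_T = dW.sum()

# ODE dy = y dx driven by the piecewise-linear interpolation of W.
# On each segment x moves by dW[k]; (1 + a/m)**m is exactly the result
# of m explicit Euler sub-steps of dy/ds = a * y over s in [0, 1].
substeps = 200
y_smooth = np.prod((1.0 + dW / substeps) ** substeps)

# Euler-Maruyama for the Ito SDE dY = Y dW on the same grid.
y_ito = np.prod(1.0 + dW)

stratonovich_exact = np.exp(W_T)       # solution of dY = Y o dW
ito_exact = np.exp(W_T - 0.5 * T)      # solution of dY = Y dW

print(y_smooth, stratonovich_exact)
print(y_ito, ito_exact)
```

The smooth-approximation solution tracks the Stratonovich value and the Euler-Maruyama product tracks the Ito value, so the two limits differ by a factor of about exp(T/2).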

What does Chen's identity express for a rough path?

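Chen's identity: \mathbb{X}_{s,t} = \mathbb{X}_{s,u} + \mathbb{X}_{u,t} + X_{s,u} \otimes X_{u,t} for s <= u <= t, where X_{s,u} = X_u - X_s is the path increment and \mathbb{X} is the second level. It states how the iterated integrals over [s,u] and [u,t] combine into the one over [s,t], and it is the algebraic condition the second level of a rough path must satisfy for the integral to be well defined.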
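
Chen's identity can be checked numerically on a piecewise-linear path, where the second level is an ordinary iterated integral. The sketch below is a minimal check under that assumption; level2 is a hypothetical helper, and the random-walk path in three dimensions is only test data.

```python
import numpy as np

def level2(path):
    """Increment and second-level iterated integral of a piecewise-linear path.

    path: array of shape (n_points, d). Returns (X, XX) with
      X        = path[-1] - path[0]                        (shape (d,))
      XX[i, j] = integral of (x^i - x^i_start) dx^j        (shape (d, d)).
    """
    path = np.asarray(path, dtype=float)
    inc = np.diff(path, axis=0)
    rel = path[:-1] - path[0]
    # Left-point sum plus the exact within-segment term 0.5 * inc (x) inc.
    XX = rel.T @ inc + 0.5 * inc.T @ inc
    return path[-1] - path[0], XX

rng = np.random.default_rng(2)
path = np.cumsum(rng.normal(size=(301, 3)), axis=0)
u = 120                                   # index of the intermediate time

X_su, XX_su = level2(path[: u + 1])       # piece over [s, u]
X_ut, XX_ut = level2(path[u:])            # piece over [u, t]
X_st, XX_st = level2(path)                # whole interval [s, t]

chen_rhs = XX_su + XX_ut + np.outer(X_su, X_ut)
chen_gap = np.abs(XX_st - chen_rhs).max()

# For a smooth (geometric) path the symmetric part is fixed by the increment.
sym_gap = np.abs(0.5 * (XX_st + XX_st.T) - 0.5 * np.outer(X_st, X_st)).max()

print(chen_gap, sym_gap)
```

Both gaps are at the level of floating-point rounding. The second check shows that for smooth paths only the antisymmetric part of the second level, the Levy area, carries information beyond the increment.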

Rough Paths in Machine Learning

Hairer received the 2014 Fields Medal for applying rough paths to SPDEs. The ML community took notice later: DeepMind uses rough paths for learning on irregular time series, including medical data such as ECGs and accelerometer traces, and financial tick streams with gaps.

Patrick Kidger (Oxford, 2021) showed that a Neural CDE is the continuous-time analogue of a recurrent network such as an LSTM. The hidden state is updated not once per token but continuously along the path built from the data. This handles irregular time series without resampling onto a regular grid or zero-padding.

Neural CDE for medical time series

ECG classification with missing measurements

PhysioNet data: 36 physiological parameters at irregular intervals. An LSTM requires a fixed time step, so the data must be resampled onto a regular grid, and information about the time gaps is lost. A Neural CDE instead interpolates the observations into a continuous control path with a natural cubic spline and integrates the hidden state against that path, following Lyons' theory. Accuracy is 3.5% better than an LSTM with the same number of parameters.

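To implement a Neural CDE, use the torchcde library (Patrick Kidger, PyTorch). Key object: torchcde.NaturalCubicSpline, which builds the continuous control path from irregular observations. Newer releases expose the same interpolation through torchcde.CubicSpline with coefficients from torchcde.natural_cubic_coeffs, so check the version you have installed.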

What advantage does Neural CDE have over LSTM for irregular time series?

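A Neural CDE parameterizes the hidden state as the solution of a controlled differential equation: dZ_t = f_theta(Z_t) dX_t, where X is the cubic spline through the data. Because each update is driven by an increment of X rather than by a fixed time step, irregular time grids are handled naturally, and the continuity of the solution map in the driving path provides robustness.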
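
The mechanics of the controlled update can be sketched without any deep-learning library. The code below is a simplified stand-in, not the torchcde implementation: it uses explicit Euler steps along the raw observations (that is, piecewise-linear rather than cubic-spline interpolation), and the vector field f_theta with random weights A and b is hypothetical, in place of a trained network. Time is included as the first channel of the control, so the gaps between observations enter through the increments.

```python
import numpy as np

rng = np.random.default_rng(3)
hidden, channels = 4, 3               # channels = time + 2 observed features

# Hypothetical vector field: maps a hidden state to a (hidden x channels)
# matrix. A trained model would learn A and b; here they are random.
A = rng.normal(scale=0.5, size=(hidden * channels, hidden))
b = rng.normal(scale=0.1, size=hidden * channels)

def f_theta(z):
    return np.tanh(A @ z + b).reshape(hidden, channels)

def cde_euler(z0, control):
    """Explicit Euler for dZ = f_theta(Z) dX along the rows of `control`."""
    z = np.array(z0, dtype=float)
    for dx in np.diff(control, axis=0):
        z = z + f_theta(z) @ dx       # update driven by the increment of X
    return z

# Irregular timestamps (30 s and 5 min gaps), rescaled to [0, 1].
t = np.array([0.0, 30.0, 60.0, 360.0, 390.0, 690.0]) / 690.0
obs = rng.normal(size=(t.size, channels - 1))
control = np.column_stack([t, obs])

z0 = np.zeros(hidden)
z_final = cde_euler(z0, control)

# Repeating a row adds a zero increment, so the hidden state does not move.
padded = np.insert(control, 3, control[2], axis=0)
z_padded = cde_euler(z0, padded)

print(z_final)
print(np.allclose(z_final, z_padded))
```

The last check illustrates one design consequence: duplicated or padded samples contribute nothing, because only changes in the control path move the hidden state.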

Connections to other topics

Rough paths theory connects stochastic analysis, path topology, and ML

Signature methods
Neural ODE / CDE
SPDE (KPZ)
Wong-Zakai theorem

Summary

Brownian path: ||W||_{1-var} = \infty, so the classical Stieltjes integral does not work
Level-2 rough path: pair (X, \mathbb{X}), where \mathbb{X}_{s,t} is the second-level iterated integral; its antisymmetric part is the Levy area
Chen's identity: algebraic consistency condition between path levels
Lyons' continuity theorem: the map (X, \mathbb{X}) -> SDE solution is continuous in the p-variation topology

Questions for reflection

Why do different integral definitions (Ito vs Stratonovich) correspond to different rough paths?
How does the continuity theorem explain the robustness of Neural CDE to missing data?
What does 'lifting' Brownian motion to a rough path mean, and why is the lift non-unique?

Related lessons

sp-26-spde — Rough paths resolve the regularity problems of SPDEs
sp-28 — Signature methods are built on rough path theory
sp-20 — Brownian motion is the primary example of a rough path