Dynamical Systems

Chaos and Strange Attractors

1961. Edward Lorenz restarts a weather simulation from rounded numbers and accidentally discovers the butterfly effect. 1963: he publishes it as 3 equations. Lorenz attractor: initial conditions with error 10^(-6). After 40 time units - a completely different trajectory. Lyapunov exponent lambda1 ~ 0.9: error doubles every ~0.77 time units. 2018: reservoir computing forecasts a chaotic system about 8 Lyapunov times ahead. Echo State Networks tend to work best when spectral radius ~ 1 - at the edge of chaos. Not a metaphor. The maximum Lyapunov exponent literally -> 0. The same mathematics, 55 years later.

  • **Reservoir computing (Pathak et al. 2018):** ML forecasts a chaotic system (the Kuramoto-Sivashinsky equation) about 8 Lyapunov times ahead - the same order as the theoretical limit
  • **Echo State Networks:** spectral radius ~ 1 = edge of chaos = longest reservoir memory before the dynamics turn chaotic
  • **Meteorology:** ensemble forecasting (50-100 parallel runs) - direct application of Lyapunov theory; horizon ~ 10 days
  • **EEG and neuroscience:** Lyapunov exponents for epilepsy diagnosis - chaotic brain activity differs from normal

Prerequisites

  • Bifurcations

What is Chaos: Devaney's Three Conditions

**Lorenz attractor: initial conditions with error 10^(-6). After 40 time units - a completely different trajectory.** Lyapunov exponent lambda1 ~ 0.9: the error doubles every ~0.77 time units. That is the butterfly effect in mathematical form - not a metaphor, a number.

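**Chaos in Devaney's sense** requires three conditions simultaneously: 1) **topological transitivity** - some trajectory comes arbitrarily close to every point (equivalently, any open region eventually overlaps any other under the dynamics); 2) **density of periodic orbits** - periodic orbits exist arbitrarily close to every point; 3) **sensitivity to initial conditions** - arbitrarily close to any point there are trajectories that eventually separate from it by a fixed distance; in typical chaotic systems the separation grows exponentially. The third condition is what makes chaos practically important.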

**Sensitivity to initial conditions:** |delta_x(t)| ~ |delta_x(0)| * e^(lambda*t), where lambda > 0 is the Lyapunov exponent. **Prediction horizon:** T_pred ~ (1/lambda) * ln(L/delta_x0), where delta_x0 is the initial error and L is the largest error still tolerated. For the atmosphere, taking lambda ~ 0.35 day^(-1) and L/delta_x0 ~ 10^5 gives T_pred ~ 33 days - with these values, the theoretical maximum for weather forecasting, regardless of computing power.
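The horizon formula above is easy to evaluate directly. The following is a minimal sketch using only the numbers quoted in this lesson; the function names `prediction_horizon` and `doubling_time` are illustrative, not part of any library.

```python
import math


def prediction_horizon(lam, tolerance_ratio):
    """T_pred ~ (1/lambda) * ln(L / delta_x0).

    lam             -- largest Lyapunov exponent (per unit time)
    tolerance_ratio -- L / delta_x0, tolerated error over initial error
    """
    return math.log(tolerance_ratio) / lam


def doubling_time(lam):
    """Time for an error to double: ln(2) / lambda."""
    return math.log(2.0) / lam


# Values taken from the text; units follow the units of lambda.
atmosphere = prediction_horizon(0.35, 1e5)  # days
lorenz = prediction_horizon(0.9, 1e6)       # Lorenz time units

print(f"atmosphere horizon: {atmosphere:.1f} days")
print(f"Lorenz horizon:     {lorenz:.1f} time units")
print(f"Lorenz doubling:    {doubling_time(0.9):.2f} time units")
```

Because the accuracy ratio enters only through a logarithm, the horizon responds slowly to better measurements, while it is inversely proportional to lambda.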

| Behavior type | Divergence | Predictability |
|---|---|---|
| Stable equilibrium | Decays exponentially | Absolute |
| Limit cycle | Decays transverse to the cycle | Long-term |
| Quasiperiodic | Linear growth | Limited |
| Chaos | Exponential growth (lambda > 0) | Horizon T ~ 1/lambda |

If measurement accuracy improves by a factor of 1000, the prediction horizon of a chaotic system:

The Lorenz Attractor: Three Equations, One Fractal

**1961. Edward Lorenz restarted a 12-variable weather simulation from a printout with rounded numbers - and after a few simulated 'months' the forecast diverged completely.** The computer was fine, the equations were correct. The problem was in the nature of the system itself. Deterministic chaos had been discovered. In 1963 he distilled the effect into three equations.

**The Lorenz system** - a simplified model of atmospheric convection: **dx/dt = sigma*(y-x), dy/dt = x*(rho-z)-y, dz/dt = x*y - beta*z**. Classic parameters: sigma=10, rho=28, beta=8/3. The attractor is a fractal 'butterfly' structure on which trajectories never repeat. **Attractor dimension ~ 2.06 - non-integer.** Lyapunov spectrum: lambda1 ~ 0.906, lambda2 = 0, lambda3 ~ -14.57. The sum is negative (it equals -(sigma + 1 + beta) ~ -13.67), so phase-space volume contracts - the system is dissipative.
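To see the sensitivity in numbers, the sketch below integrates the three equations from two starting points that differ by 10^(-6) and records their separation. It is a minimal illustration, assuming a fixed-step fourth-order Runge-Kutta scheme with dt = 0.01 and the starting point (1, 1, 1); both choices are assumptions for the demo, not taken from Lorenz's paper.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic parameters from the text


def lorenz(state):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_history(a, b, t_end=40.0, dt=0.01):
    """Distance between two trajectories at every step up to t_end."""
    steps = int(round(t_end / dt))
    history = [math.dist(a, b)]
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        history.append(math.dist(a, b))
    return history


start = (1.0, 1.0, 1.0)
nudged = (1.0 + 1e-6, 1.0, 1.0)  # initial error of 10^(-6)
history = separation_history(start, nudged)
for t in (0, 10, 20, 30, 40):
    print(f"t = {t:2d}  separation = {history[t * 100]:.3e}")
```

Running the same function with two identical starting points returns a separation of exactly zero at every step, which is the determinism half of deterministic chaos.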

| Parameter rho | Behavior |
|---|---|
| rho < 1 | All trajectories -> 0 |
| 1 < rho < 24.74 | Two stable equilibria C+ and C- |
| rho = 24.74 | Bifurcation: equilibria lose stability |
| rho > 24.74 | Chaotic Lorenz attractor (at sigma=10, beta=8/3) |
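The threshold 24.74 in the table can be recovered from the parameters. The sketch below uses two standard textbook results that this lesson does not derive, so treat them as assumptions here: the equilibria C+ and C- sit at (±sqrt(beta*(rho-1)), ±sqrt(beta*(rho-1)), rho-1), and they lose stability at rho_H = sigma*(sigma+beta+3)/(sigma-beta-1), valid when sigma > beta + 1. The function names are illustrative.

```python
import math


def nontrivial_equilibria(rho, beta=8.0 / 3.0):
    """Equilibria C+ and C- of the Lorenz system; they exist only for rho > 1."""
    if rho <= 1.0:
        return []
    r = math.sqrt(beta * (rho - 1.0))
    return [(r, r, rho - 1.0), (-r, -r, rho - 1.0)]


def hopf_threshold(sigma=10.0, beta=8.0 / 3.0):
    """Value of rho at which C+ and C- lose stability (needs sigma > beta + 1)."""
    return sigma * (sigma + beta + 3.0) / (sigma - beta - 1.0)


print("C+ and C- at rho = 28:", nontrivial_equilibria(28.0))
print(f"stability threshold: rho = {hopf_threshold():.2f}")
```

With sigma = 10 and beta = 8/3 the threshold evaluates to 470/19, which rounds to the 24.74 used in the table.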

Lorenz and the Birth of Chaos Science

Edward Lorenz published 'Deterministic Nonperiodic Flow' in 1963. The paper was ignored by most physicists - the idea of unpredictability in deterministic systems seemed too revolutionary. Recognition came in the 1970s, when Ruelle and Takens introduced the term 'strange attractor'. In 2018, Pathak et al. showed that reservoir computing (Echo State Networks) can forecast a chaotic system (the Kuramoto-Sivashinsky equation) about 8 Lyapunov times ahead - a record at the time, and the same order as the theoretical limit.

Why is the Lorenz attractor called 'strange'?

Routes to Chaos: From Order to Fractal

Chaos does not appear suddenly - it emerges through a sequence of bifurcations. Three main routes from order to chaos:

| Route | Mechanism | Example |
|---|---|---|
| Period-doubling (Feigenbaum) | Cycle-2 -> cycle-4 -> cycle-8 -> ... -> chaos | Logistic map as r -> 3.57 |
| Quasiperiodic (Ruelle-Takens) | Torus breaks down into a strange attractor | Hydrodynamic turbulence |
| Intermittency (Pomeau-Manneville) | Long near-regular phases interrupted by irregular chaotic bursts | Neural spikes, economic crises |

**Feigenbaum constant delta ~ 4.669** - the limiting ratio of successive period-doubling interval lengths. Universal for all one-dimensional maps with a quadratic maximum. Discovered by Feigenbaum in 1975 and published in 1978 - the first example of universality in chaos theory.
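The period-doubling cascade can be observed directly in the logistic map x -> r*x*(1-x). The sketch below detects the period of the attractor at a few values of r and then forms ratios of interval lengths. The period detector is a simple illustrative heuristic, and the bifurcation points in `ONSETS` are commonly quoted literature values, not computed here.

```python
def attractor_period(r, x0=0.4, transient=20000, max_period=64, tol=1e-9):
    """Smallest p with x_(n+p) ~ x_n after a transient; None if none is found."""
    x = x0
    for _ in range(transient):  # let the orbit settle onto the attractor
        x = r * x * (1.0 - x)
    orbit = [x]
    for _ in range(max_period):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    for p in range(1, max_period + 1):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None


for r in (2.8, 3.2, 3.5, 3.55):
    print(f"r = {r}: period {attractor_period(r)}")

# Literature values of r where periods 2, 4, 8 and 16 first appear.
ONSETS = [3.0, 3.449490, 3.544090, 3.564407]
ratios = [
    (ONSETS[i + 1] - ONSETS[i]) / (ONSETS[i + 2] - ONSETS[i + 1])
    for i in range(len(ONSETS) - 2)
]
print("interval ratios:", [round(q, 3) for q in ratios])
```

Only two ratios are available from four bifurcation points, so they are rough estimates; the constant 4.669 is the limit as the cascade continues.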

**Stretching and folding mechanism** - the universal mechanism of chaos. The attractor is created by two processes: stretching (nearby points diverge, lambda > 0) and folding (the attractor remains bounded). Together they create fractal structure. Smale's horseshoe map (1960) is the mathematically clean example.
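A one-dimensional stand-in for stretching and folding is the logistic map at r = 4: it stretches the unit interval and folds it back onto itself. The sketch below estimates its Lyapunov exponent as the orbit average of ln|f'(x)|, where f'(x) = r*(1 - 2x). The known value for r = 4 is ln 2; the starting point, transient length, and sample count are arbitrary assumptions.

```python
import math


def logistic_lyapunov(r, x0=0.3, transient=1000, n=50000):
    """Average of ln|f'(x)| along an orbit of x -> r*x*(1-x)."""
    x = x0
    for _ in range(transient):  # discard the approach to the attractor
        x = r * x * (1.0 - x)
    total, counted = 0.0, 0
    for _ in range(n):
        slope = abs(r * (1.0 - 2.0 * x))  # local stretching factor
        if slope > 0.0:                   # skip the exact fold point x = 0.5
            total += math.log(slope)
            counted += 1
        x = r * x * (1.0 - x)
    return total / counted


print(f"r = 4.0: lambda ~ {logistic_lyapunov(4.0):.3f} (ln 2 ~ {math.log(2):.3f})")
print(f"r = 3.2: lambda ~ {logistic_lyapunov(3.2):.3f}")
```

A positive result means net stretching (chaos); a negative result, as on the stable 2-cycle at r = 3.2, means nearby points converge.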

A chaotic system is deterministic. Why can't it be predicted over long timescales?

ML and Chaos: Reservoir Computing, Edge of Chaos

**Echo State Networks (ESN, Jaeger 2001)** - recurrent networks with a fixed random reservoir. Only the output layer is trained. Key parameter: **spectral radius rho of the reservoir weight matrix** (not the Lorenz parameter rho). When rho is well below 1 - inputs fade quickly, memory is short. When rho >> 1 - chaos, information is lost. Typically best: **rho ~ 1, edge of chaos.** This is not a metaphor. Literally: the maximum Lyapunov exponent -> 0.
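The link between spectral radius and Lyapunov exponent can be checked on a toy reservoir. The sketch below makes a strong simplifying assumption: it treats the reservoir as the linear map v -> W v (the small-state, zero-input approximation of a tanh reservoir), for which the largest Lyapunov exponent equals ln(spectral radius). The radius is estimated from the average growth of |W^n v| rather than from eigenvalues, so it is approximate; all names are illustrative.

```python
import math
import random


def growth_rate(W, steps=2000, burn_in=500, seed=0):
    """Average log growth per step of |W^n v|; tends to ln(spectral radius)."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(len(W))]
    total = 0.0
    for step in range(steps):
        v = [sum(w * x for w, x in zip(row, v)) for row in W]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]  # renormalize to avoid overflow
        if step >= burn_in:
            total += math.log(norm)
    return total / (steps - burn_in)


def random_reservoir(n, target_radius, seed=1):
    """Random Gaussian matrix rescaled to an (estimated) spectral radius."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    scale = target_radius / math.exp(growth_rate(W))
    return [[scale * w for w in row] for row in W]


for target in (0.5, 1.0, 1.5):
    lam = growth_rate(random_reservoir(16, target))
    print(f"spectral radius ~ {target}: lambda_max ~ {lam:+.3f}")
```

In this linear picture the exponent crosses zero exactly when the radius crosses 1. With the tanh nonlinearity and a driving input the boundary shifts, which is one reason the radius is tuned empirically in practice.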

**Pathak et al. 2018 (Physical Review Letters):** reservoir computing forecasts a chaotic system (the Kuramoto-Sivashinsky equation) about **8 Lyapunov times** ahead - several times better than previous methods. For scale on the Lorenz attractor: Lyapunov time = 1/lambda1 ~ 1/0.9 ~ 1.1 Lorenz time units, so 8 Lyapunov times ~ 9 units. With initial error 10^(-6) the formula T_pred = (1/lambda)*ln(1/delta) gives ~ 1.1 * 14 ~ 15 units - a horizon of 8 Lyapunov times is the same order as the theoretical limit.

**Connection to neural networks:** gradient vanishing/exploding in RNNs is the same problem. During backpropagation-through-time, the gradient is multiplied by the Jacobian of the recurrent layer at each step. If the spectral radius of the Jacobian is > 1 - gradient explodes (chaotic regime). If < 1 - vanishes (decaying). LSTM and GRU are engineering solutions that hold the effective spectral radius near 1.
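The multiplication by the Jacobian at each step can be made concrete with the smallest possible recurrent model. The sketch below assumes a scalar linear RNN h_t = w * h_(t-1), so the Jacobian is just w and its spectral radius is |w|; this is a deliberately stripped-down illustration, not a real training loop.

```python
def bptt_gradient(w, steps):
    """|d h_T / d h_0| for the scalar linear RNN h_t = w * h_(t-1)."""
    grad = 1.0
    for _ in range(steps):
        grad *= abs(w)  # one Jacobian factor per time step
    return grad


for w in (0.9, 1.0, 1.1):
    print(f"|w| = {w}: gradient after 50 steps = {bptt_gradient(w, 50):.3e}")
```

The gradient behaves like e^(T * ln|w|), which is the same exponential law as |delta_x(t)| ~ |delta_x(0)| * e^(lambda*t) with lambda = ln|w|.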

Common misconception: chaotic systems are unpredictable because they are random

Chaotic systems are fully deterministic but exhibit exponential sensitivity to initial conditions (lambda > 0). Randomness is not needed - unpredictability arises from the structure of the nonlinear equations.

The key distinction: a random system would give a different result if rerun with the same initial conditions. A chaotic system would not. Running the Lorenz system twice from x(0) = 1.000000 gives identical results. But from x(0) = 1.000001, after 40 time units the results diverge completely.

Why does an Echo State Network perform best at spectral radius ~ 1?

Key Takeaways

  • **Chaos = determinism + exponential sensitivity:** |delta_x(t)| ~ |delta_x(0)| * e^(lambda*t), lambda > 0
  • **Lorenz attractor:** 3 equations, fractal dimension ~ 2.06, lambda1 ~ 0.9, prediction horizon ~ (1/lambda) * ln(L/delta)
  • **Routes to chaos:** period-doubling (Feigenbaum constant 4.669), quasiperiodic, intermittency
  • **Edge of chaos in ML:** ESN spectral radius ~ 1 = lambda_max -> 0. LSTM/GRU - engineering approach to holding this boundary

Related Topics

Chaos connects to several branches of modern mathematics:

  • Bifurcations — Chaos typically arises through a cascade of bifurcations - the route from order to chaos via period-doubling
  • Fractals: Mandelbrot and Julia — Strange attractors have fractal structure; the Julia set is the boundary between chaotic and regular dynamics
  • Ergodic Theory — Chaotic systems are often ergodic - a time average is enough to study the statistics

Questions for Reflection

  • Meteorologists run 50-100 parallel forecasts (ensemble forecasting). The spread between forecasts is itself a forecast variable. What exactly does it predict in terms of Lyapunov exponents?
  • Echo State Networks work best at spectral radius ~ 1. LSTMs solve the same problem differently - through gates. Which architecture explicitly uses knowledge from dynamical systems theory, and which is engineering empiricism?
  • Can one distinguish deterministic chaos from true random noise in a time series? What method uses Lyapunov exponents for this purpose?

Related Lessons

  • dyn-04 — Bifurcations - the path from order to chaos; Feigenbaum period-doubling
  • dyn-06 — Strange attractors have fractal structure; Julia sets mark the boundary of chaos
  • dyn-07 — Chaotic systems are often ergodic - time average equals ensemble average
  • dyn-03 — Lyapunov exponents build on Lyapunov stability theory
  • de-03 — Systems of ODEs are the mathematical language of Lorenz and Rossler attractors
  • nm-01