Data Science

Time Series Analysis

Walmart stores 10 years of sales data for every product in every store - 400 million rows. Three days before a hurricane, Pop-Tart sales spike 7x. Predicting this in advance means stocking shelves correctly and not losing USD 50 million in revenue. This is what time series models do at production scale.

**Amazon Forecast** uses DeepAR (a global LSTM model) for demand forecasting across 30 million products - accuracy improved 36% over ARIMA baselines
**Uber** forecasts driver demand in 30-minute slots for each city zone using Prophet with custom seasonality components
**ECG monitoring**: cardiac arrhythmia detection through time series anomaly analysis is the foundation of the Apple Watch ECG feature

ARIMA

In 1970, Box and Jenkins published a book that changed forecasting forever. Their ARIMA model (AutoRegressive Integrated Moving Average) describes a time series with three parameters: **AR(p)** - dependence on p previous values, **I(d)** - differencing order to achieve stationarity, **MA(q)** - moving average of q past forecast errors. Amazon uses ARIMA for short-term demand forecasting across 30 million SKUs every day.

Key concepts: stationarity (mean, variance and autocovariance don't change over time - tested with the Augmented Dickey-Fuller test), ACF/PACF (autocorrelation / partial autocorrelation - the PACF suggests p, the ACF suggests q), AIC/BIC (model selection criteria). Parameter selection: auto_arima from pmdarima picks d with a unit-root test and then searches (p, q) combinations by AIC. ARIMA limitation: it is a linear method, so it poorly captures complex nonlinear patterns and multiple seasonalities.
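
To make the I(d) and AR(p) parts concrete, here is a minimal dependency-free sketch. The helper names `difference` and `fit_ar1` and the toy series are hypothetical, chosen only for illustration. A real workflow would use a library such as statsmodels or pmdarima instead of hand-rolled estimation.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'[t] = y[t] - y[t-1]."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def fit_ar1(series):
    """Least-squares estimate of phi in y[t] = phi * y[t-1] + error (no intercept)."""
    prev, curr = series[:-1], series[1:]
    return sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)


# Toy series with a linear trend: the mean keeps rising, so it is not stationary.
linear = [10 + 3 * t for t in range(8)]
# Toy series with a quadratic trend: one difference still leaves a linear trend.
quadratic = [t * t for t in range(8)]
# Toy noise-free AR(1) series where each value is half the previous one.
decaying = [16.0, 8.0, 4.0, 2.0, 1.0]

print(difference(linear, d=1))     # constant values: d=1 removes a linear trend
print(difference(quadratic, d=1))  # still trending upward
print(difference(quadratic, d=2))  # constant values: d=2 removes a quadratic trend
print(fit_ar1(decaying))           # estimated dependence on the previous value
```

Each round of differencing shortens the series by one observation, which is one reason to keep d as small as the stationarity test allows.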

What does the parameter 'd' in ARIMA(p,d,q) represent?

Facebook Prophet

In 2017, Facebook open-sourced Prophet - a time series model that an analyst without a statistics background can apply in 10 lines of code. The approach: instead of ARIMA parameters, Prophet decomposes the series into **trend** (piecewise linear or logistic growth), **seasonality** (Fourier series for yearly, weekly, daily patterns) and **holidays** (custom user-defined events). It is robust to missing data and automatically detects trend change points.

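Prophet model: y(t) = trend(t) + seasonality(t) + holidays(t) + error. Trend: piecewise linear (with changepoints) or logistic (for capacity-limited growth). Seasonality: Fourier series up to order N (defaults: N=10 for yearly, N=3 for weekly). Changepoints: candidate points where the trend rate may change (default: 25, spread evenly over the first 80% of the series), with a sparsity-inducing prior that shrinks most rate changes toward zero. Estimation runs in Stan: MAP by default, or full MCMC sampling when uncertainty in the seasonal components is also needed. Either way, the forecast comes with uncertainty intervals.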
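
The two building blocks named above, Fourier seasonality and a trend with changepoints, can be sketched without any dependencies. This is an illustration of the math only, not Prophet's actual code. The function names, parameter names and numbers are assumptions made for the example.

```python
import math


def fourier_features(t, period, order):
    """Return [sin(2*pi*n*t/P), cos(2*pi*n*t/P)] for n = 1..order."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.append(math.sin(angle))
        feats.append(math.cos(angle))
    return feats


def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Continuous trend with base slope k and offset m.

    After each changepoint s_j the slope changes by deltas[j].
    """
    value = m + k * t
    for s, delta in zip(changepoints, deltas):
        if t > s:
            value += delta * (t - s)
    return value


# Weekly seasonality of order 3 yields 6 regressors per time point.
weekly = fourier_features(t=3, period=7, order=3)
print(len(weekly))

# Hypothetical trend: slope 1.0 until t=5, then slope 0.5.
for t in (0, 5, 10):
    print(t, piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[5], deltas=[-0.5]))
```

The seasonal component is then a weighted sum of these regressors, with the weights estimated from data. A higher order N allows sharper seasonal shapes at the cost of a higher risk of overfitting.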

How does Prophet handle automatically detected changepoints (trend change points)?

Seasonality and Decomposition

Ice cream sales peak in summer. Website traffic drops on Sundays. A heartbeat repeats every 0.8 seconds. All of these are **seasonality**: predictable repeating patterns with a fixed period. Classical decomposition splits a series into three components: trend (T), seasonality (S) and remainder (R). Additive model: y = T + S + R (seasonal amplitude is constant). Multiplicative model: y = T * S * R (amplitude grows with the trend - typical for business data).

Decomposition methods: (1) STL (Seasonal and Trend decomposition using Loess) - robust to outliers, lets the seasonal component change over time, handles one seasonal period, recommended by default; (2) X-13ARIMA-SEATS - standard used by government statistical agencies; (3) MSTL (Multiple STL) - for series with several seasonal periods, such as hourly data with simultaneous daily and weekly seasonality. statsmodels includes seasonal_decompose and STL, and recent versions add MSTL. Common seasonal periods: 12 (monthly data, yearly cycle), 7 (daily data, weekly cycle), 24 (hourly data, daily cycle), 52 (weekly data, yearly cycle).
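
As an illustration of the additive model y = T + S + R, the sketch below implements the classical moving-average decomposition in plain Python. It is deliberately simplified: it only handles odd periods, and the function name and toy data are hypothetical. It is not a substitute for STL.

```python
def decompose_additive(y, period):
    """Classical additive decomposition for an odd period: y = T + S + R."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    half = period // 2
    n = len(y)

    # Trend: centered moving average over one full period; undefined at the edges.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(y[i - half:i + half + 1]) / period

    # Seasonal: mean detrended value per position in the cycle, centered to sum to zero.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(y[i] - trend[i])
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(n)]

    # Remainder: what is left after removing trend and seasonality.
    remainder = [
        None if trend[i] is None else y[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, remainder


# Toy data: linear trend plus a repeating pattern with period 3 and no noise.
pattern = [-2.0, 0.0, 2.0]
y = [10.0 + 0.5 * t + pattern[t % 3] for t in range(12)]

trend, seasonal, remainder = decompose_additive(y, period=3)
print(trend[1:4])     # recovered trend values
print(seasonal[:3])   # recovered seasonal pattern
print(remainder[1:4]) # remainder on noise-free data
```

For a multiplicative series, one common shortcut is to take logarithms first, since log(T * S * R) = log T + log S + log R, and then decompose additively.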

When should a multiplicative decomposition model be used instead of additive?

Forecasting and Model Evaluation

Time series forecasting is not just model fitting. The critical error: evaluating quality on the same data the model was trained on. Standard k-fold cross-validation is not a fix: random folds let the model train on observations that come after the test points, and because neighboring observations are autocorrelated, this leaks information and gives overly optimistic estimates. The correct approach: **walk-forward validation** (an expanding or rolling training window followed by a test window). The model is always trained on the past and tested on the future, exactly as in production deployment.
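
A walk-forward splitter can be sketched in a few lines. The generator below is a hypothetical helper, not a library API. With `max_train=None` the training window expands, and with a number it rolls.

```python
def walk_forward_splits(n_obs, initial_train, horizon, step=None, max_train=None):
    """Yield (train_indices, test_indices); the test block always follows the training block."""
    step = step or horizon
    end_train = initial_train
    while end_train + horizon <= n_obs:
        # Expanding window starts at 0; rolling window keeps only the last max_train points.
        start_train = 0 if max_train is None else max(0, end_train - max_train)
        yield range(start_train, end_train), range(end_train, end_train + horizon)
        end_train += step


# Expanding window: every fold trains on all history up to the cutoff.
for train, test in walk_forward_splits(n_obs=10, initial_train=4, horizon=2):
    print(list(train), "->", list(test))

# Rolling window: every fold trains on at most the 4 most recent observations.
for train, test in walk_forward_splits(n_obs=10, initial_train=4, horizon=2, max_train=4):
    print(list(train), "->", list(test))
```

scikit-learn offers a similar utility, `TimeSeriesSplit`, which may be preferable in an existing scikit-learn pipeline.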

Forecasting quality metrics: MAE (mean absolute error, interpretable in original units), RMSE (penalizes large errors more heavily), MAPE (percentage error, undefined when actual values are zero), SMAPE (symmetric MAPE, range 0-200%). Baseline models for comparison: Naive (forecast = last observation), Seasonal Naive (forecast = same period from previous season), Simple Exponential Smoothing. If a complex model can't beat the baseline, suspect data quality or feature engineering first, or accept that the series may have little predictable structure beyond what the baseline captures.
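
The metrics and the two naive baselines are short enough to write out directly. The sketch below uses hypothetical function names and made-up toy numbers. SMAPE follows the 0-200% convention mentioned above, and a term where both actual and forecast are zero is counted as zero error, which is an assumption of this sketch.

```python
import math


def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        total += 0.0 if denom == 0 else abs(a - f) / denom
    return 200.0 * total / len(actual)


def naive_forecast(history, horizon):
    """Repeat the last observation."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, period):
    """Repeat the value from the same position in the last observed season."""
    return [history[-period + (h % period)] for h in range(horizon)]


# Toy data: two seasons of period 3, then one season to forecast.
history = [10.0, 20.0, 30.0, 12.0, 22.0, 32.0]
actual = [14.0, 24.0, 34.0]

naive = naive_forecast(history, horizon=3)
seasonal = seasonal_naive_forecast(history, horizon=3, period=3)

print("naive MAE:", mae(actual, naive))
print("seasonal naive MAE:", mae(actual, seasonal))
print("seasonal naive RMSE:", rmse(actual, seasonal))
print("seasonal naive MAPE:", mape(actual, seasonal))
print("seasonal naive SMAPE:", smape(actual, seasonal))
```

On strongly seasonal data like this toy series, the seasonal naive baseline is far harder to beat than the plain naive one, so it is the more honest benchmark.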

Myth: more complex models always produce better time series forecasts.

Reality: simple models (Exponential Smoothing, ARIMA) frequently outperform deep neural networks on short series with well-defined seasonality.

Many real-world series are short (a few hundred points), and their seasonality is easily captured parametrically. Neural networks (LSTM, Transformer and similar) win when trained globally on thousands of series simultaneously (for example Amazon's DeepAR, an autoregressive RNN, or N-BEATS, a fully connected architecture from Element AI).

Why is standard k-fold cross-validation inappropriate for time series?

Key Ideas

**ARIMA(p,d,q)** is the statistical workhorse: AR captures dependence on past values, I differences to stationarity, MA accounts for past forecast errors
**Prophet** decomposes series into trend + seasonality + holidays via Fourier series and Bayesian changepoint detection - interpretable and robust to missing data
**Walk-forward validation** is mandatory for honest evaluation: the model is always trained on the past and tested on the future, mirroring real production conditions

Questions for Reflection

If two analysts fit ARIMA to the same series and choose different (p,d,q) parameters, who is right? How do you objectively compare the models?
Prophet automatically detects changepoints. What happens if training data includes a single anomalous month due to a pandemic?
Walk-forward validation with an expanding window trains on all available history. When can data from three years ago harm forecast quality rather than help it?

Related Lessons

stat-13-time-series