Topology
The Mapper Algorithm
Lesson objectives
- Implement the Mapper algorithm: filter function, cover, clustering, nerve complex
- Understand the nerve lemma as the theoretical justification for Mapper
- Interpret the Mapper graph and tune its hyperparameters
- Apply Mapper to biomedicine, finance, and neuroscience case studies
Prerequisites
- Persistent Homology (Advanced)
- Topological Data Analysis
- Clustering algorithms
How can a million data points be turned into a readable map that preserves the topology? Mapper builds a graph where each node is a cluster of points and edges show how clusters overlap, turning high-dimensional clouds into interpretable topology.
- Oncology: Ayasdi found the c5 cancer subtype with Mapper applied to genomic data from 271 patients
- Neuroscience: topological map of neural activations under different stimuli reveals representational geometry
- Finance: market regime detection (normal vs crisis) via Mapper on return time series
- Immunology: Mapper of T-cell receptor space identified rare immune cell subtypes
From Eurographics to Clinical Discovery
Gunnar Carlsson, Vin de Silva, and Afra Zomorodian built the mathematical foundations of TDA in the 2000s. Singh, Mémoli, and Carlsson published Mapper in 2007 at Eurographics. In 2008 the team founded Ayasdi to commercialize TDA. The first landmark result, the discovery of cancer subtype c5, appeared in the Proceedings of the National Academy of Sciences (PNAS) in 2011. By the 2020s, KeplerMapper, giotto-tda, and GUDHI had become standard open-source tools.
Mapper Algorithm and Topological Visualization
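Singh, Mémoli, and Carlsson proposed the Mapper algorithm at Eurographics 2007. The pipeline has four steps: compute a filter function on the data, cover the filter's range with overlapping intervals, cluster the points in each preimage, and connect clusters that share points, which gives the nerve graph. Ayasdi applied it to breast cancer genomic data (n=271) in 2011, revealing the previously unknown subtype c5 with a markedly better prognosis. The same pipeline is now used in immunology, neuroscience, and financial risk.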
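The four steps of the pipeline (filter, cover, clustering, nerve graph) can be sketched in a few dozen lines of standard-library Python. This is a minimal illustration, not the reference implementation: the function names (`interval_cover`, `single_linkage`, `mapper_graph`) are made up for this example, the filter is one-dimensional, and clustering is done by connected components of an eps-neighbourhood graph, a simple stand-in for a clusterer such as DBSCAN.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; `overlap` is the shared fraction."""
    # Solve n*L - (n-1)*overlap*L = hi - lo for the interval length L.
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1.0 - overlap)
    starts = [lo + i * step for i in range(n_intervals)]
    # Pin the last endpoint to `hi` so rounding cannot drop the maximum value.
    return [(s, hi if i == n_intervals - 1 else s + length)
            for i, s in enumerate(starts)]


def single_linkage(points, indices, eps):
    """Cluster `indices` as connected components of the eps-neighbourhood graph."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_values, n_intervals, overlap, eps):
    """Return (nodes, edges): nodes are sets of point indices, edges join nodes that share points."""
    lo, hi = min(filter_values), max(filter_values)
    nodes = []
    for a, b in interval_cover(lo, hi, n_intervals, overlap):
        preimage = [i for i, v in enumerate(filter_values) if a <= v <= b]
        nodes.extend(single_linkage(points, preimage, eps))
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges


# Toy data (synthetic): 200 points on the unit circle, filtered by x-coordinate.
circle = [(math.cos(2 * math.pi * k / 200), math.sin(2 * math.pi * k / 200))
          for k in range(200)]
nodes, edges = mapper_graph(circle, [p[0] for p in circle],
                            n_intervals=5, overlap=0.3, eps=0.1)
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy circle, the middle intervals each split into an upper and a lower arc while the two end intervals stay connected, so the resulting graph closes up into a single loop.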
How does the overlap parameter affect the Mapper graph?
Without overlap, cover elements share no points, so clusters from different intervals never intersect and no edges form. Too much overlap can merge distinct topological features into one component.
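The effect of overlap can be seen directly on the cover. Two clusters from different intervals can share a point only if that point's filter value lies in both intervals. The sketch below counts such values for two overlap settings; `interval_cover` and `count_shared` are illustrative names, and the filter values are synthetic.

```python
def interval_cover(lo, hi, n_intervals, overlap):
    """Equal-length intervals covering [lo, hi]; `overlap` is the shared fraction."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1.0 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def count_shared(values, intervals):
    """Count values that fall into at least two intervals of the cover."""
    return sum(1 for v in values
               if sum(a <= v <= b for a, b in intervals) >= 2)


# Synthetic filter values 0.005, 0.015, ..., 0.995 (chosen to avoid interval endpoints).
values = [(i + 0.5) / 100 for i in range(100)]

shared_none = count_shared(values, interval_cover(0.0, 1.0, 4, overlap=0.0))
shared_half = count_shared(values, interval_cover(0.0, 1.0, 4, overlap=0.5))
print(shared_none, shared_half)
```

With zero overlap no value belongs to two intervals, so a Mapper graph built on this cover could not contain any edge. With 50% overlap most values are shared by two neighbouring intervals.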
Nerve Lemma and Theoretical Foundations
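The nerve lemma is the theoretical justification for Mapper. For an open cover in which every nonempty finite intersection of cover elements is contractible (a good cover), the nerve complex is homotopy equivalent to the union of the cover. Mapper relies on this heuristically: if the clusters behave like contractible pieces with contractible overlaps, the nerve graph, which is the 1-skeleton of the nerve, approximates the topology of the data. On real data this condition is assumed rather than verified.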
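A small discrete example may help make the nerve concrete. The sketch below builds the nerve of three overlapping arcs that cover a circle, using 12 sample points as a stand-in for the continuous arcs. The helper `nerve` is an illustrative name, not a library function.

```python
from itertools import combinations


def nerve(cover, max_dim=2):
    """Return simplices (tuples of cover indices) whose sets have a common point."""
    simplices = []
    for k in range(1, max_dim + 2):          # k = number of vertices in the simplex
        for combo in combinations(range(len(cover)), k):
            if set.intersection(*(cover[i] for i in combo)):
                simplices.append(combo)
    return simplices


# Three arcs on a circle sampled at points 0..11; neighbouring arcs share one point.
arcs = [{0, 1, 2, 3, 4}, {4, 5, 6, 7, 8}, {8, 9, 10, 11, 0}]

simplices = nerve(arcs)
# Euler characteristic: vertices minus edges plus triangles.
euler = sum((-1) ** (len(s) - 1) for s in simplices)
print(simplices, euler)
```

Each arc and each pairwise overlap is contractible, and the three arcs have no common point, so the nerve is a hollow triangle: three vertices, three edges, no filled face. Its Euler characteristic is 0, the same as that of the circle, as the nerve lemma predicts for a good cover.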
Mapper for financial time series
Detecting market regimes via Mapper
For a stock price time series, build a sliding-window embedding: each day becomes a vector of the last 30 daily returns. The filter function is volatility, the standard deviation of the returns in the window. On data with distinct calm and turbulent periods, Mapper tends to separate two regimes: low volatility (normal market) and high volatility (crisis). Transitions between the regimes appear as edges connecting the two groups of nodes.
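The embedding and filter described above can be sketched as follows. The return series is synthetic (Gaussian noise with two hand-picked volatility levels), and the helper names `sliding_windows` and `volatility_filter` are assumptions made for this illustration. The resulting windows and filter values would then be passed to a Mapper implementation.

```python
import random
import statistics


def sliding_windows(returns, width):
    """One vector per day: the `width` most recent returns up to that day."""
    return [returns[t - width:t] for t in range(width, len(returns) + 1)]


def volatility_filter(windows):
    """Filter value of each window: population standard deviation of its returns."""
    return [statistics.pstdev(w) for w in windows]


# Synthetic daily returns: a calm period followed by a turbulent one.
random.seed(0)
calm = [random.gauss(0.0, 0.005) for _ in range(200)]
crisis = [random.gauss(0.0, 0.03) for _ in range(100)]
returns = calm + crisis

windows = sliding_windows(returns, width=30)   # points in R^30
vol = volatility_filter(windows)               # one filter value per point

print(len(windows), "windows of length", len(windows[0]))
print("first-window volatility:", round(vol[0], 4))
print("last-window volatility: ", round(vol[-1], 4))
```

Windows that straddle the change point mix calm and turbulent returns, so their volatility lies between the two levels. These are the points that would form the connecting edges between regimes in the Mapper graph.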
| Mapper parameter | Small value | Large value |
|---|---|---|
| Number of intervals | Coarse topology, few nodes | Detailed topology, many nodes |
| Overlap % | Few edges, fragmentation | Many edges, component merging |
| Clustering eps | Many small clusters | Few large clusters |
What does the nerve lemma guarantee for the Mapper algorithm?
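The nerve lemma states that when all nonempty intersections of cover elements are contractible, the nerve has the homotopy type of the covered space. Overlap alone does not ensure contractibility; it only makes neighbouring cover elements intersect so that edges can form. For Mapper, the guarantee therefore holds only to the extent that the clusters and their overlaps are close to contractible.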
Mapper Applications in Data Science
Mapper is not just an algorithm; it is a paradigm. It converts a point cloud into a graph that can be visualized, analyzed, and interpreted. Color-coding nodes by clinical variables helps analysts spot candidate subgroups that standard clustering can miss; their statistical significance still has to be tested separately.
Interpreting the Mapper graph: branches suggest potential subgroups, loops suggest cyclic or periodic structure, and isolated components suggest anomalous clusters. Color-coding nodes by an external variable (survival, cell type) can reveal biologically meaningful patterns, and reading these colorings is the key practical skill in TDA applications.
What does a loop in the Mapper graph represent?
A loop in the Mapper graph corresponds to a one-dimensional homology (H1) class of the graph: the data contain cyclic structure, such as a periodic process or a cycle of changes, that cannot be flattened onto a single branch.
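These reading rules can be checked numerically on any Mapper graph. For a graph with V nodes, E edges, and C connected components, the number of independent loops (the rank of H1) is E - V + C. The sketch below computes it with a small union-find. The function name `graph_summary` and the example graph are illustrative, and the edge list is assumed to contain no duplicates or self-loops.

```python
def graph_summary(n_nodes, edges):
    """Count components, independent loops, branch tips, and isolated nodes."""
    parent = list(range(n_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    degree = [0] * n_nodes
    for u, v in edges:
        parent[find(u)] = find(v)
        degree[u] += 1
        degree[v] += 1

    components = len({find(x) for x in range(n_nodes)})
    return {
        "components": components,
        "loops": len(edges) - n_nodes + components,   # rank of H1
        "branch_tips": [x for x in range(n_nodes) if degree[x] == 1],
        "isolated": [x for x in range(n_nodes) if degree[x] == 0],
    }


# Example: a 4-cycle (0-1-2-3-0), a branch hanging off node 3, and a lone node 5.
summary = graph_summary(6, [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)])
print(summary)
```

In this example the summary reports two components, one loop, node 4 as a branch tip, and node 5 as an isolated node, matching the three structures named in the interpretation rules.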
Connections to other topics
Mapper bridges topological theory and practical high-dimensional data analysis.
- Unsupervised learning
- Genomics and bioinformatics
- Data visualization
- TQFT
Summary
- Mapper: f: X -> R^k (filter), U_alpha (cover of f(X)), cluster preimages, nerve graph
- Three hyperparameters: n_intervals (resolution), overlap, clustering eps
- Nerve lemma: Nerve(U) is homotopy equivalent to the union of the U_alpha when all nonempty finite intersections are contractible
- Loops in graph = H1 features; branches = potential subgroups; isolated components = anomalies
- Applications: oncology (subtype c5), neuroscience (activation maps), finance (market regimes)
- KeplerMapper: Python library with sklearn-compatible clusterers and d3.js visualization
Questions for reflection
- Why is the choice of filter function critical for interpreting the Mapper graph?
- How does the nerve lemma justify that the Mapper graph reflects the real topology of the data?
- What is the fundamental difference between Mapper and standard clustering methods like k-means or DBSCAN?
Related lessons
- top-28 — Stability theorem and persistence modules justify Mapper
- top-27 — Persistent homology is the main TDA tool preceding Mapper
- top-23 — The nerve lemma uses homology of covers to justify Mapper
- aa-14-representations