Topology

The Mapper Algorithm

Lesson objectives

Implement the Mapper algorithm: filter function, cover, clustering, nerve complex
Understand the nerve lemma as the theoretical justification for Mapper
Interpret the Mapper graph and tune its hyperparameters
Apply Mapper to biomedicine, finance, and neuroscience case studies

Prerequisites

Persistent Homology (Advanced)
Topological Data Analysis
Clustering algorithms

How can a million data points be turned into a readable map that preserves the topology? Mapper builds a graph where each node is a cluster of points and edges show how clusters overlap, turning high-dimensional clouds into interpretable topology.

Oncology: Ayasdi found the c5 cancer subtype with Mapper applied to genomic data from 271 patients
Neuroscience: topological map of neural activations under different stimuli reveals representational geometry
Finance: market regime detection (normal vs crisis) via Mapper on return time series
Immunology: Mapper of T-cell receptor space identified rare immune cell subtypes

From Eurographics to Clinical Discovery

Gunnar Carlsson, Vin de Silva, and Afra Zomorodian built the mathematical foundations of TDA in the 2000s. Singh, Memoli, and Carlsson published Mapper in 2007 at the Eurographics Symposium on Point-Based Graphics. In 2008 the team founded Ayasdi to commercialize TDA. The first landmark result, the discovery of cancer subtype c5, appeared in PNAS in 2011. By the 2020s, KeplerMapper, giotto-tda, and GUDHI had become standard open-source tools.

Mapper Algorithm and Topological Visualization

Singh, Memoli, and Carlsson proposed the Mapper algorithm at Eurographics 2007. Ayasdi applied it to breast cancer genomic data (n=271) in 2011, revealing the previously unknown subtype c5 with markedly improved prognosis. The same pipeline is now used in immunology, neuroscience, and financial risk.

How does the overlap parameter affect the Mapper graph?

Overlap controls connectivity. Without overlap, cover elements share no points, so no two clusters intersect and no edges form. Too much overlap creates spurious edges and can merge distinct topological features into one component.
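The effect of overlap can be checked on a toy example. The sketch below is a minimal pure-Python Mapper, not a reference implementation. The helper names (interval_cover, single_linkage, mapper), the definition of overlap as the fraction of an interval's base width shared with its neighbour, and the single-linkage clusterer are all assumptions made for illustration. It runs Mapper on 60 points of a circle with the x-coordinate as the filter.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, n_intervals, overlap):
    """Closed intervals covering [lo, hi]; neighbours share overlap * width."""
    width = (hi - lo) / n_intervals
    pad = overlap * width / 2  # each interval grows by pad on both sides
    cuts = [lo + i * width for i in range(n_intervals)] + [hi]
    return [(cuts[i] - pad, cuts[i + 1] + pad) for i in range(n_intervals)]


def single_linkage(points, idx, eps):
    """Group indices so that points within eps of each other share a cluster."""
    parent = {i: i for i in idx}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(idx, 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)
    clusters = {}
    for i in idx:
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


def mapper(points, filt, n_intervals, overlap, eps):
    """Return (nodes, edges): nodes are sets of point indices,
    edges link nodes that share at least one point."""
    values = [filt(p) for p in points]
    nodes = []
    for a, b in interval_cover(min(values), max(values), n_intervals, overlap):
        preimage = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, preimage, eps))
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges


# 60 points on the unit circle, offset so none sits on an interval boundary.
circle = [(math.cos(t), math.sin(t))
          for t in (2 * math.pi * (k + 0.5) / 60 for k in range(60))]

for overlap in (0.0, 0.3):
    nodes, edges = mapper(circle, filt=lambda p: p[0],
                          n_intervals=4, overlap=overlap, eps=0.15)
    print(f"overlap={overlap}: {len(nodes)} nodes, {len(edges)} edges")
```

On this sample both runs should produce six nodes. With zero overlap there are no edges, and with overlap 0.3 the six nodes are joined by six edges into a single ring that mirrors the circle.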

Nerve Lemma and Theoretical Foundations

The nerve lemma is the theoretical justification for Mapper. When every nonempty finite intersection of cover elements is contractible, the nerve complex is homotopy equivalent to the union of the cover. Mapper exploits this heuristically: clusters stand in for the connected components of each preimage, and the Mapper graph is the 1-skeleton of the resulting nerve, so it approximates the topology of the data and does not reproduce it exactly.
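For reference, one standard formulation of the nerve lemma is written out below. The notation (the index set A, the symbol N for the nerve) is chosen here for illustration and is not taken from the lesson.

```latex
Let $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ be an open cover of a
paracompact space $X$. The nerve $N(\mathcal{U})$ is the simplicial complex
on the vertex set $A$ in which
\[
  \{\alpha_0, \dots, \alpha_k\} \in N(\mathcal{U})
  \iff
  U_{\alpha_0} \cap \dots \cap U_{\alpha_k} \neq \emptyset .
\]
If every nonempty finite intersection
$U_{\alpha_0} \cap \dots \cap U_{\alpha_k}$ is contractible, then
\[
  N(\mathcal{U}) \simeq X .
\]
```

Mapper keeps only the vertices and edges of this complex (the cases k = 0 and k = 1), and its cover elements are clusters of sampled points, so the hypothesis holds at best approximately.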

Mapper for financial time series

Detecting market regimes via Mapper

For a stock price time series, build a sliding-window embedding: each day becomes a vector of the last 30 daily returns. The filter function is volatility, the standard deviation of each window. Mapper then tends to separate two regimes: low volatility (normal market) and high volatility (crisis). Transitions between regimes appear as edges connecting the two groups of nodes.
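The preprocessing step can be sketched as follows. The return series here is synthetic (a calm stretch followed by a turbulent one, generated with a fixed seed) and is only an assumption used to make the example self-contained. With real data, the list returns would come from actual prices.

```python
import random
import statistics

WINDOW = 30  # days per sliding window, as in the text

# Synthetic daily returns (an assumption, not market data):
# 200 calm days followed by 100 turbulent days.
rng = random.Random(0)
returns = ([rng.gauss(0.0, 0.005) for _ in range(200)]
           + [rng.gauss(0.0, 0.03) for _ in range(100)])

# Sliding-window embedding: one 30-dimensional point per full window.
windows = [returns[t:t + WINDOW] for t in range(len(returns) - WINDOW + 1)]

# Filter function: volatility of each window (sample standard deviation).
volatility = [statistics.stdev(w) for w in windows]

calm = volatility[:171]    # windows that lie entirely in the calm period
crisis = volatility[200:]  # windows that lie entirely in the turbulent period
print(f"{len(windows)} windows of length {WINDOW}")
print(f"largest calm-window volatility:    {max(calm):.4f}")
print(f"smallest crisis-window volatility: {min(crisis):.4f}")
```

The windows and their volatility values are the point cloud and filter values that a Mapper implementation would then cover and cluster. Windows that straddle the regime change take intermediate filter values, which is where the connecting edges come from.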

Mapper parameter	Small value	Large value
Number of intervals	Coarse topology, few nodes	Detailed topology, many nodes
Overlap %	Few edges, fragmentation	Many edges, component merging
Clustering eps	Many small clusters	Few large clusters

What does the nerve lemma guarantee for the Mapper algorithm?

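The nerve lemma states that when all nonempty intersections of cover elements are contractible, the nerve captures the homotopy type of the covered space. Overlap makes neighboring cover elements intersect, but it does not by itself make those intersections contractible, so for Mapper the lemma is a guiding principle and not a guarantee.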

Mapper Applications in Data Science

Mapper is not just an algorithm; it is a paradigm. It converts a point cloud into a graph that can be visualized, analyzed, and interpreted. Color-coding nodes by clinical variables can expose subgroups that standard clustering misses, though their statistical significance still has to be tested separately.

Interpreting the Mapper graph: branches = potential subgroups. Loops = cyclic or periodic structure. Isolated components = anomalous clusters. Color-coding nodes by an external variable (survival, cell type) reveals biologically meaningful patterns, which is the key practical skill in TDA applications.
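A small sketch of these interpretation steps is given below. The graph, the survival values, and all variable names are hypothetical and serve only to show the mechanics: a node's colour is the mean of an external variable over its member points, and node degree flags isolated nodes and branch tips.

```python
from statistics import mean

# Hypothetical Mapper output: each node is the set of point indices in a cluster.
nodes = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {6, 7}]
edges = [(0, 1), (1, 2)]  # node 3 has no edges

# Hypothetical external variable per data point (e.g. survival in months).
survival = [60, 58, 55, 40, 22, 20, 5, 7]

# Node colour = mean of the external variable over the node's members.
colour = [mean(survival[i] for i in sorted(members)) for members in nodes]

# Degree of each node: degree 0 = isolated node, degree 1 = tip of a branch.
degree = [0] * len(nodes)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

isolated = [n for n, d in enumerate(degree) if d == 0]
branch_tips = [n for n, d in enumerate(degree) if d == 1]

for n, c in enumerate(colour):
    print(f"node {n}: mean survival {c:.1f}, degree {degree[n]}")
print("isolated nodes:", isolated)
print("branch tips:", branch_tips)
```

In this toy graph the colour falls steadily along the chain of nodes 0, 1, 2, and the isolated node 3 has by far the lowest value, which is the kind of pattern an analyst would then examine in the original data.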

What does a loop in the Mapper graph represent?

A loop in the Mapper graph is an H1 class of the graph, that is, of the 1-skeleton of the nerve complex. It points to cyclic structure in the data, such as a periodic process or a cycle of changes.
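The number of independent loops in a graph can be computed directly from its node, edge, and component counts as b1 = E - V + C. The helper below is an illustrative sketch (the name count_loops is an assumption) and expects a simple undirected graph with no repeated edges.

```python
def count_loops(n_nodes, edges):
    """Number of independent loops b1 = E - V + C of an undirected graph."""
    parent = list(range(n_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Union-find merges the endpoints of every edge into one component.
    for u, v in edges:
        parent[find(u)] = find(v)
    components = len({find(i) for i in range(n_nodes)})
    return len(edges) - n_nodes + components


hexagon = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
path = [(0, 1), (1, 2), (2, 3)]
print(count_loops(6, hexagon))  # 1: one ring
print(count_loops(4, path))     # 0: a chain has no loop
```

A nonzero count only says that a loop exists in the graph. Whether it reflects real cyclic structure or an artifact of the cover and clustering settings has to be checked by varying the hyperparameters.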

Connections to other topics

Mapper bridges topological theory and practical high-dimensional data analysis.

Unsupervised learning
Genomics and bioinformatics
Data visualization
TQFT

Summary

Mapper: f: X -> R^k (filter), U_alpha (cover of f(X)), cluster preimages, nerve graph
Three hyperparameters: n_intervals (resolution), overlap, clustering eps
Nerve lemma: Nerve(U) ~ union of U_alpha (homotopy equivalence) when all nonempty finite intersections are contractible
Loops in graph = H1 features; branches = potential subgroups; isolated components = anomalies
Applications: oncology (subtype c5), neuroscience (activation maps), finance (market regimes)
KeplerMapper: Python library with sklearn-compatible clusterers and d3.js visualization

Questions for reflection

Why is the choice of filter function critical for interpreting the Mapper graph?
How does the nerve lemma justify that the Mapper graph reflects the real topology of the data?
What is the fundamental difference between Mapper and standard clustering methods like k-means or DBSCAN?

Related lessons

top-28 — Stability theorem and persistence modules justify Mapper
top-27 — Persistent homology is the main TDA tool preceding Mapper
top-23 — The nerve lemma uses homology of covers to justify Mapper
aa-14-representations