Topology
Mapper Algorithm: Topological Skeleton of High-Dimensional Data
Цели урока
- Understand the four-step Mapper pipeline
- Choose an appropriate lens function for an exploratory task
- Tune resolution and gain with stability in mind
- Apply Mapper to neural-network interpretation and biomedical data
Предварительные знания
- Basic Mapper (top-29)
- Bottleneck and Wasserstein distances (top-33)
- Clustering (DBSCAN, single-linkage)
- Dimensionality reduction (PCA, UMAP)
Mapper found a new breast-cancer subtype that no clusterer could see - 9 patients with 100 percent survival.
- **Genomics**: breast-cancer subtype discovery with 100 percent survival in the identified flare group
- **Neuroscience**: fMRI brain dynamics via cyclic Mapper structures
- **Drug discovery**: mapping 100 thousand molecules in chemical space
- **Explainable AI**: topological interpretation of neural-network activations
- **Finance**: correlation-matrix analysis of financial instruments
Mapper from Stanford to Ayasdi
The Mapper algorithm was presented by Gurjeet Singh, Facundo Memoli, and Gunnar Carlsson at the Symposium on Point-Based Graphics in 2007. In 2008 Carlsson, Singh, and Dara Entekhabi founded Ayasdi to commercialise TDA tools including Mapper. The turning point was the Nicolau, Levine, Carlsson paper in PNAS 2012: Mapper revealed a breast-cancer subtype with superior survival in 9 patients. This work became a landmark of TDA in biomedicine and drew wide attention. Today Mapper is implemented in open libraries (kepler-mapper, scikit-tda, Giotto-TDA) and widely used in computational biology and neuroscience.
Mapper Pipeline: From Data to Graph
2012. Monica Nicolau and Stanford colleagues run Mapper on 295 gene-expression profiles of breast-cancer patients. Standard k-means clustering finds the expected 3 subtypes. But Mapper reveals something different - a flare, a small connected branch of the topological skeleton, detached from the main network. The flare contains 9 patients. This group shows 100 percent five-year survival while the rest of the cohort shows 60 to 70 percent. Mapper found a new breast-cancer subtype no clusterer could see. The paper appeared in PNAS and became a flagship demonstration of TDA in biomedicine.
Mapper proceeds in four clear steps. The lens compresses data into a low-dimensional representation, the cover slices the image into overlapping pieces, clustering in preimages turns pieces into graph nodes, and the nerve from nodes and intersections yields the final topological skeleton. The output is a simplicial complex, typically visualised as a graph.
Mapper is neither classical statistics nor machine learning. It is a topological exploration tool: it visualises data connectivity making no distributional assumptions. Clustering produces nodes; Mapper produces a connectivity graph among them.
A ring in 100 dimensions
Mapper recovers topology
10000 points on a thin ring in R^100 with Gaussian noise sigma = 0.05. Lens = first PCA component. Cover = 20 intervals with 50 percent overlap. Clustering in one interval preimage finds about 2 clusters (two ring arcs at each lens level). The Mapper graph is a cyclic structure of 40 nodes. Ring topology is reconstructed correctly, noise is suppressed by the cover.
Mapper visualises the activation space of a neural-network final layer. Each node is a cluster of test examples with similar activations. Node colour is the dominant class. The Mapper graph shows the topological structure of decisions: which classes neighbour each other, where the decision boundary lies, which examples fit no cluster.
Unlike persistence diagrams, the Mapper graph depends on three arbitrary choices: lens, cover, and clustering. The same dataset yields different graphs under different parameters. This is both a strength (multiple aspects visible) and a weakness (no canonical answer).
Which Mapper pipeline step creates the graph edges?
Graph nodes come from preimage clustering. Edges arise later when building the nerve: if two clusters share at least one point (possible thanks to cover overlap), an edge connects them. Without overlap there would be no edges.
Choosing the Lens Function
The lens is the most influential Mapper parameter. The same dataset, different lenses, different graphs, different discoveries. The original 2012 paper used L2-eccentricity: the distance of each point to the cloud centre of mass. This lens highlighted peripheral patients - the flare group lived in the farthest eccentricity region.
Standard lens options: (1) L2-eccentricity - distance to centre, (2) density - count of neighbours in a fixed radius, (3) PCA projection - first or first two principal components, (4) UMAP/t-SNE embedding in R^2, (5) domain-specific - survival time, time to event, expression of one gene.
Genomics and lens choice
Different lenses, different discoveries
Dataset of 295 tumour expression profiles. With L2-eccentricity lens, Mapper finds a flare of 9 patients with extreme survival. With density lens, three dense cancer-subtype cores become visible. With expression-of-ESR1 (oestrogen receptor) lens, a gradient structure from ER-positive to ER-negative emerges. Each lens answers a different biological question.
An autoencoder bottleneck trained on the same data Mapper will run on provides an optimal lens for nonlinear structure. For 20000-gene genomic data, embedding to 2-3 dimensions. Mapper on this embedding reveals cell-differentiation trajectories.
If the lens itself contains errors (e.g. a poorly trained autoencoder), the Mapper graph inherits all those artefacts. Validate the lens separately: for instance check stability under bootstrap resampling of data.
No lens is universally best. Lens choice is part of the research process. Practice: run Mapper with multiple lenses, compare graphs, look for common stable structures. If every lens shows the same branch - that is a strong signal. If a structure vanishes under a lens swap - it is an artefact of a specific choice.
Why can the same dataset produce radically different Mapper graphs?
Three key parameters make Mapper an exploratory rather than canonical tool: lens determines which coordinate slices the data, cover sets interval size and overlap, clustering controls how many nodes appear in each piece. This flexibility powers discovery but demands interpretive care.
Cover Parameters and Stability
Cover is the second half of Mapper. After the lens, one decides how to slice the one-dimensional (or k-dimensional) image into pieces. Two parameters govern this: resolution r - number of intervals, gain g - overlap fraction between adjacent intervals. Small r yields a coarse graph, large r a detailed one. Small g leaves the graph disconnected, large g produces a tightly connected one.
Typical practical parameter values: r from 10 to 50 (depending on dataset size), g from 0.3 to 0.5 (standard). Larger gain yields a more connected graph, smaller gain produces more separate components. Parameter tuning is iterative with visual feedback.
Parameter sensitivity
Ring under varying r and g
Same dataset of 10000 ring points. r = 5, g = 0.3: graph collapses to 8 nodes with almost no cycles - structure lost. r = 20, g = 0.4: clear cycle of 40 nodes, topology recovered. r = 50, g = 0.6: 100 nodes with tangled edges, noise structure visible. The sweet spot usually sits in the middle range.
Running Mapper over a grid of (r, g) values yields a graph ensemble. Stable structures appearing over a wide grid region are deemed reliable. MultiNerve and Mapper Interleaving techniques formalise this: they select parameters under which Mapper output is stable.
Unlike persistence diagrams, Mapper lacks a universal 2007-style stability theorem. Carriere and Oudot in 2018 proved limited stability under careful parameter choice, but generally Mapper is unstable. Conclusions need care: one Mapper graph alone is not proof.
Modern libraries (kepler-mapper, scikit-tda, Giotto-TDA) automate parameter tuning: they run grid search and measure output similarity. If neighbouring grid cells produce graphs with small Wasserstein distance between Mapper diagrams, the configuration is deemed stable.
What happens to the Mapper graph when gain g grows from 0.3 to 0.7?
Gain controls cover overlap between adjacent intervals. More overlap means more shared preimage points, more cluster intersections, more edges. Node count stays roughly constant (driven by clustering within each piece).
Applications: Genomics, Neuroscience, Drug Discovery
Over fifteen years Mapper has become a standard topological-exploration tool across several scientific fields. Genomics uses it to find new disease subtypes. Neuroscience uses it to analyse brain functional connectivity. Pharma uses it to map the chemical space of potential drugs. Finance uses it to analyse market correlation structures. Mapper does not replace statistics or ML - it complements them with a topological perspective.
Five flagship Mapper applications: (1) breast-cancer subtype discovery (Nicolau 2012), (2) T-cell differentiation (Rizvi 2017), (3) fMRI brain dynamics (Saggar 2018), (4) chemical-space mapping in drug discovery, (5) market-correlation analysis in finance.
fMRI brain dynamics
Mapper beyond clustering
Saggar et al. 2018: 1500 seconds of fMRI data, split into 5-second time windows. Each window is an activity vector across 268 regions. Lens = L2-eccentricity (highlights unusual moments). Mapper exposes a loop - a cycle of recurring brain states invisible to classical clustering. This loop correlates with attention level and lets predicting cognitive load.
A Mapper graph of ResNet50 final-layer activations on ImageNet reveals: classes organise not discretely but as a continuous manifold. Cats and dogs neighbour topologically through breeds. This explains why the network confuses specific class pairs - they lie close on the graph.
Mapper generates hypotheses; it does not confirm them. Every discovery via Mapper (such as the flare in breast cancer) needs independent validation via statistical methods on held-out data. A graph alone is not a statistically significant statement.
Drug discovery
Chemical space mapping
A library of 100000 small molecules, each represented by a 1024-dimensional fingerprint (Morgan/ECFP). Lens = centrality in the chemical-similarity graph. Cover with 30 intervals, gain 0.4. Mapper exposes major structural families as graph branches, plus flares - molecules of unique topology that may be synthesis candidates. The picture helps prioritise candidates for biological testing.
Modern Mapper implementations: kepler-mapper (Python, actively maintained), scikit-tda (part of the scikit-learn ecosystem), Giotto-TDA (industrial TDA stack), TDAMapper (R). All support export to visualisation tools (NetworkX, Graphviz, D3.js) and integrate with pandas/sklearn pipelines.
What makes Mapper unique that clustering and dimensionality reduction separately cannot do?
Clustering partitions data into groups, losing connectivity information. Dimensionality reduction provides coordinates but no connectivity. Mapper combines both: lens for coordinates, clustering for nodes, nerve for edges. The result is a topological skeleton visible in the original space.
Extensions and applications of Mapper
Mapper integrates into the modern data-science stack as a data-exploration tool.
- TDA in neural networks — Related topic
- Witness complexes — Related topic
- TDA for time series — Related topic
- Persistent homology (advanced) — Related topic
Key ideas
- Mapper is a four-step pipeline: lens, cover, clustering, nerve
- The lens chooses which aspect of the data to study - L2-eccentricity, density, learned embedding
- Resolution and gain control graph detail and connectivity
- Mapper has no universal stability - robustness comes from grid search and output comparison
- Applications: breast-cancer subtypes, brain dynamics, drug discovery, explainable AI
- Mapper generates hypotheses; it does not statistically confirm them
Вопросы для размышления
- Which data properties make Mapper more informative than k-means clustering?
- Why does production deployment demand parameter stability, and how is it assessed for Mapper?
- Which limitations does Mapper carry compared to persistence diagrams?
Связанные уроки
- top-29 — basic version of the Mapper algorithm
- top-33 — metrics for comparing Mapper outputs
- top-32 — stability for Mapper graphs
- top-37 — Mapper for neural network interpretability
- alg-12-bfs