Causal Calculus

Causal Discovery: PC, FCI, NOTEARS

In 2000, Peter Spirtes, Clark Glymour, and Richard Scheines published "Causation, Prediction, and Search". For the first time, an algorithm automatically recovered a causal graph from correlations - no experiments, only observational data. Science gained a tool for finding causality where experiments are impossible: epidemiology, economics, neuroscience.

  • Epidemiology: reconstructing disease transmission pathways from observational data
  • Neuroscience: causal connectivity between brain regions from fMRI without stimulation
  • Economics: direction of causality between macroeconomic variables (GDP, inflation, employment)
  • ML: discovering causal structure among features for robust out-of-distribution generalization

Предварительные знания

  • DAGs and d-separation
  • Conditional independence
  • Markov condition and faithfulness
  • Transportability (previous lesson)
  • Transportability and Selection Diagrams

Constraint-based: PC and FCI algorithms

In 2000, Peter Spirtes, Clark Glymour, and Richard Scheines published "Causation, Prediction, and Search". For the first time, an algorithm automatically recovered a causal graph from correlations - no experiments required. The core insight: conditional independence encodes DAG structure.

The **PC algorithm** operates in three steps: (1) start with a complete graph, (2) remove edges between conditionally independent variables, (3) orient colliders X → Z ← Y wherever X and Y are non-adjacent but both connected to Z.

**PC assumes:** 1. faithfulness - all conditional independences in the data reflect d-separations in the DAG 2. causal sufficiency - no hidden common causes. **FCI relaxes assumption (2)**, but returns a PAG (Partial Ancestral Graph) with ambiguous edge marks rather than a DAG.

The PC algorithm found that X and Y are conditionally independent given Z. What does this imply for the DAG structure?

Score-based: GES and NOTEARS

Instead of testing conditional independences, score-based methods pose an optimization problem: find the DAG that maximizes a scoring criterion. Two main approaches: GES (combinatorial) and NOTEARS (continuous optimization).

**GES (Greedy Equivalence Search)** operates in the space of Markov equivalence classes (CPDAGs). Phase 1: greedily add edges while BIC improves. Phase 2: greedily remove edges while BIC improves. Guaranteed to find the correct CPDAG under faithfulness with sufficient data.

**NOTEARS turns discrete DAG search into continuous optimization.** The space of DAGs is discrete and exponentially large, making combinatorial search intractable. The constraint h(W) = tr(e^(W⊙W)) - p = 0 is differentiable and enables gradient descent with Lagrangian augmentation.

GES operates in the space of CPDAGs rather than individual DAGs. Why does this matter?

Markov equivalence and identifiability limits

Causal discovery from observational data has a fundamental limit: the Markov Equivalence Class (MEC) contains multiple DAGs that are indistinguishable by their distributions. Without additional assumptions, the direction of causality cannot be determined.

**Two DAGs are Markov equivalent** (belong to the same MEC) if and only if they have the same skeleton and the same v-structures (colliders). For example, X → Y and X ← Y are indistinguishable from observations - both produce the same joint distribution.

**Assumptions required for causal discovery:** 1. **Faithfulness** - all conditional independences in P reflect d-separations in the DAG (no accidental cancellations between paths) 2. **Causal sufficiency** - no hidden common causes 3. A correctly specified functional class (linear/nonlinear, Gaussian/non-Gaussian). Violating any of these breaks the guarantees.

Two DAGs share the same skeleton and the same v-structures. What can be said about their joint distributions?

Summary

  • **PC algorithm:** conditional independences → skeleton → collider orientation → CPDAG
  • **FCI:** extends PC to hidden confounders, returns a PAG instead of a CPDAG
  • **GES:** optimizes BIC score in the space of CPDAGs, provably correct under faithfulness
  • **NOTEARS:** continuous formulation via h(W)=0 - enables gradient descent for DAG search
  • **MEC:** equivalent DAGs are indistinguishable from observations; LiNGAM identifies direction under non-Gaussian noise

Related topics

Causal discovery bridges statistics and causal inference.

  • Do-calculus and interventions — Causal discovery is the prerequisite: find the graph, then compute do(X=x)
  • Transportability — The recovered graph is used to transport results across populations
  • Counterfactuals — An identified DAG enables answering counterfactual queries

Вопросы для размышления

  • PC assumes faithfulness. Construct a scenario where faithfulness fails (two paths with effects that cancel). How would that affect the algorithm's output?
  • NOTEARS converts discrete search into continuous optimization. What new problems does this introduce (local minima, choice of λ, constraint satisfaction)?
  • If you receive a CPDAG with several undirected edges, what experiment would you run to orient them?

Связанные уроки

  • stat-27-graphical-models
Causal Discovery: PC, FCI, NOTEARS

0

1

Sign In