Causal Calculus
Structural Causal Models (SCMs)
The FDA approves drugs based on randomized controlled trials. But what do you do when randomization is impossible, as with smoking, surgery, or policy interventions? Structural causal models provide a mathematical language for estimating causal effects from observational data.
- Medicine: estimating treatment effects without randomization via the backdoor criterion
- Epidemiology: confounder structure in studies of smoking and cancer
- Advertising: causal effect of showing an ad on conversion (vs. correlation)
- Education policy: estimating program effects when participants self-select
- Technology: A/B testing breaks under user-level interference (network effects)
Lesson objectives
- Build structural causal models (SCMs) as DAGs and write structural equations
- Apply Pearl's do-calculus rules to identify causal effects
- Use the backdoor and front-door criteria to select the correct adjustment set
Prerequisites
- Conditional probability and Bayes' theorem
- Directed acyclic graphs (DAGs) and d-separation
- Statistics basics: sample mean, linear regression
Structural causal models
An SCM assigns each variable $X_i$ a structural equation $X_i = f_i(\mathrm{Pa}_i, U_i)$, where $\mathrm{Pa}_i$ are the parents of $X_i$ in the DAG and $U_i$ is exogenous noise. The DAG encodes causal structure: an arrow $X \to Y$ means $X$ is a direct cause of $Y$. This is a stronger claim than correlation: the arrow says how $Y$ responds when $X$ is changed, not merely that the two vary together.
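As a minimal sketch of these definitions, the Python snippet below encodes a toy SCM over the DAG $Z \to X$, $Z \to Y$, $X \to Y$ and samples from it. The variable names, the probabilities, and the helper `sample` are illustrative assumptions, not part of the lesson.

```python
import random

# Toy SCM (hypothetical numbers). Each entry maps a variable to
# (parents, structural function f_i(parent_values, u_i)).
SCM = {
    "Z": ((), lambda pa, u: int(u < 0.5)),
    "X": (("Z",), lambda pa, u: int(u < (0.8 if pa["Z"] else 0.2))),
    "Y": (("X", "Z"), lambda pa, u: int(u < 0.1 + 0.3 * pa["X"] + 0.4 * pa["Z"])),
}


def sample(scm, rng):
    """Evaluate the structural equations in topological (insertion) order."""
    values = {}
    for var, (parents, f) in scm.items():
        u = rng.random()  # exogenous noise U_i ~ Uniform(0, 1), drawn independently
        values[var] = f({p: values[p] for p in parents}, u)
    return values


rng = random.Random(0)
data = [sample(SCM, rng) for _ in range(10_000)]
mean_y = sum(d["Y"] for d in data) / len(data)
print(mean_y)  # empirical P(Y=1); the exact value for these numbers is 0.45
```

The dictionary must list variables in a topological order of the DAG, so that every parent has a value before its children are evaluated.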
Do-calculus and identification
The operator $\mathrm{do}(X=x)$ models an intervention: remove all incoming arrows to $X$ in the DAG and set $X$ to the value $x$. In the presence of confounders, $P(Y|\mathrm{do}(X=x))$ generally differs from $P(Y|X=x)$. Pearl's do-calculus consists of three inference rules that have been proven complete: they suffice to derive every identifiable causal effect.
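One standard way to write the effect of deleting the incoming arrows is the truncated factorization. The version below assumes a Markovian model (no hidden variables), which goes beyond what the text above states. Before the intervention, the joint distribution factorizes along the DAG; after it, the factor for $X$ is dropped:

$$
P(v_1,\dots,v_n) = \prod_{i} P(v_i \mid \mathrm{pa}_i),
\qquad
P(v_1,\dots,v_n \mid \mathrm{do}(X=x)) = \prod_{i:\, V_i \neq X} P(v_i \mid \mathrm{pa}_i)
$$

The second expression holds for value assignments consistent with $X=x$; all other assignments have probability zero.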
Structural Equations and SCMs
Google DeepMind uses SCMs for drug discovery; distinguishing correlation from causation reportedly reduced phase II trial failures by 40%. A structural causal model specifies the data-generating mechanism: each variable is a function of its causes and independent noise.
In observational data, E[Y|X=x] mixes the causal effect X→Y with the association carried by the backdoor path X←Z→Y. Only do(X=x) isolates the causal effect of X on Y.
What does the operator do(X=x) mean in an SCM?
do(X=x) models a physical intervention: X is forced to x and all its structural causes (incoming arrows) are deleted from the DAG, breaking confounding paths.
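The difference between conditioning and intervening can be computed exactly on a small example. The sketch below assumes a hypothetical binary model $Z \to X$, $Z \to Y$, $X \to Y$ with made-up probabilities; passing `do_x` replaces the mechanism for $X$ with a point mass, which is one way to mimic deleting the incoming arrow.

```python
from itertools import product

# Hypothetical mechanisms for binary Z, X, Y (all numbers are assumptions).
P_Z1 = 0.5


def p_x1_given_z(z):
    return 0.8 if z else 0.2


def p_y1_given_xz(x, z):
    return 0.1 + 0.3 * x + 0.4 * z


def bern(p1, value):
    """Probability that a Bernoulli(p1) variable equals `value`."""
    return p1 if value == 1 else 1.0 - p1


def joint(do_x=None):
    """Return {(z, x, y): probability}; with do_x set, X ignores Z."""
    table = {}
    for z, x, y in product((0, 1), repeat=3):
        if do_x is None:
            px = bern(p_x1_given_z(z), x)      # natural mechanism P(X | Z)
        else:
            px = 1.0 if x == do_x else 0.0     # intervention: X forced to do_x
        table[(z, x, y)] = bern(P_Z1, z) * px * bern(p_y1_given_xz(x, z), y)
    return table


def p_y1_given_x(table, x):
    num = sum(p for (z, xv, y), p in table.items() if xv == x and y == 1)
    den = sum(p for (z, xv, y), p in table.items() if xv == x)
    return num / den


observational = p_y1_given_x(joint(), 1)          # P(Y=1 | X=1), about 0.72
interventional = p_y1_given_x(joint(do_x=1), 1)   # P(Y=1 | do(X=1)), about 0.60
print(observational, interventional)
```

With these numbers, conditioning overstates the effect because observing $X=1$ also makes $Z=1$ more likely, and $Z$ raises $Y$ on its own.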
Do-calculus and the Backdoor Criterion
Pearl's three rules of do-calculus form a complete system for deriving interventional distributions from observational data. The backdoor criterion provides a practical recipe: find a set Z that blocks all backdoor paths from X to Y.
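For reference, the three rules can be stated as follows. The notation is the standard one and is not introduced in the lesson itself: $G_{\overline{X}}$ is the graph with all arrows into $X$ removed, and $G_{\underline{Z}}$ is the graph with all arrows out of $Z$ removed.

$$
\begin{aligned}
\text{Rule 1 (insert/delete observations):}\quad & P(y \mid \mathrm{do}(x), z, w) = P(y \mid \mathrm{do}(x), w) && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}} \\
\text{Rule 2 (exchange action and observation):}\quad & P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), z, w) && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}} \\
\text{Rule 3 (insert/delete actions):}\quad & P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), w) && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
\end{aligned}
$$

In Rule 3, $Z(W)$ is the set of nodes in $Z$ that are not ancestors of any node in $W$ in $G_{\overline{X}}$.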
Backdoor adjustment requires every variable in the adjustment set to be observed. When the confounder is hidden, the front-door criterion or instrumental variables are needed.
When is backdoor adjustment valid?
Two conditions: Z blocks all backdoor paths (no open confounding paths) and Z contains no descendants of X (otherwise we partially block the causal effect).
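Both conditions can be checked mechanically on a small graph. The sketch below is one possible stdlib-only implementation, not a reference algorithm: it tests d-separation with the moralized ancestral graph method, applied to the graph with the arrows out of X removed so that only backdoor paths remain. The function names and the edge-list representation are assumptions made for illustration.

```python
def ancestors(edges, nodes):
    """Return `nodes` together with all of their ancestors."""
    result, frontier = set(nodes), list(nodes)
    while frontier:
        child = frontier.pop()
        for a, b in edges:
            if b == child and a not in result:
                result.add(a)
                frontier.append(a)
    return result


def descendants(edges, node):
    """Return all strict descendants of `node`."""
    result, frontier = set(), [node]
    while frontier:
        parent = frontier.pop()
        for a, b in edges:
            if a == parent and b not in result:
                result.add(b)
                frontier.append(b)
    return result


def d_separated(edges, x, y, z):
    """True if x and y are d-separated given the set z (x, y not in z)."""
    keep = ancestors(edges, {x, y} | set(z))
    sub = [(a, b) for a, b in edges if a in keep and b in keep]
    # Moralize: drop directions and connect parents that share a child.
    undirected = {frozenset(e) for e in sub}
    for node in keep:
        parents = [a for a, b in sub if b == node]
        undirected |= {frozenset((p, q)) for p in parents for q in parents if p != q}
    # Search for a path from x to y that avoids every node in z.
    seen, frontier = {x}, [x]
    while frontier:
        cur = frontier.pop()
        for e in undirected:
            if cur in e:
                (other,) = e - {cur}
                if other not in z and other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return y not in seen


def is_backdoor_set(edges, x, y, z):
    z = set(z)
    if z & descendants(edges, x):                       # condition 2 violated
        return False
    backdoor_graph = [(a, b) for a, b in edges if a != x]  # drop arrows out of x
    return d_separated(backdoor_graph, x, y, z)         # condition 1


confounded = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
print(is_backdoor_set(confounded, "X", "Y", {"Z"}))  # True
print(is_backdoor_set(confounded, "X", "Y", set()))  # False
```

Note that the empty set can be a valid adjustment set, and that adding a variable can open a path: conditioning on a collider is the classic case.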
Identifiability of Causal Effects
A causal effect is identifiable if it can be expressed in terms of the observational distribution P(V). When no backdoor set exists, the front-door criterion offers an alternative via a mediator variable.
Not every causal effect is identifiable from a given DAG. The ID algorithm by Shpitser & Pearl (2006) is complete: if it cannot derive P(Y|do(X)), the effect is fundamentally non-identifiable from observational data under that graph.
When does the front-door criterion apply instead of backdoor?
The front-door criterion handles a hidden confounder via a mediator M such that: (1) M intercepts all directed paths from X to Y, (2) there is no unblocked backdoor path from X to M, and (3) all backdoor paths from M to Y are blocked by X.
Confounder: smoking, tar, and cancer
DAG: $Z$ (smoking) $\to X$ (lung tar) $\to Y$ (cancer), $Z \to Y$. The observed correlation $X \sim Y$ includes the path through the confounder $Z$. Backdoor criterion: controlling for $Z$ blocks the only backdoor path from $X$ to $Y$, namely $X \leftarrow Z \to Y$, giving $P(Y|\mathrm{do}(X=x)) = \sum_z P(Y|X=x,Z=z)P(Z=z)$.
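The adjustment formula uses only quantities available in observational data. As a hedged illustration, the snippet below applies it to a made-up joint distribution over binary $Z$ (smoking), $X$ (tar), and $Y$ (cancer); the numbers in `JOINT` are hypothetical and carry no epidemiological meaning.

```python
# Hypothetical observational joint P(Z, X, Y), keyed by (z, x, y).
JOINT = {
    (0, 0, 0): 0.36, (0, 0, 1): 0.04,
    (0, 1, 0): 0.06, (0, 1, 1): 0.04,
    (1, 0, 0): 0.05, (1, 0, 1): 0.05,
    (1, 1, 0): 0.08, (1, 1, 1): 0.32,
}

_INDEX = {"z": 0, "x": 1, "y": 2}


def prob(joint, **fixed):
    """Marginal probability of the event described by keyword values."""
    return sum(
        p for key, p in joint.items()
        if all(key[_INDEX[name]] == val for name, val in fixed.items())
    )


def backdoor_adjust(joint, x, y=1):
    """P(Y=y | do(X=x)) = sum_z P(Y=y | X=x, Z=z) P(Z=z)."""
    total = 0.0
    for z in (0, 1):
        p_y_given_xz = prob(joint, z=z, x=x, y=y) / prob(joint, z=z, x=x)
        total += p_y_given_xz * prob(joint, z=z)
    return total


naive_1 = prob(JOINT, x=1, y=1) / prob(JOINT, x=1)   # P(Y=1 | X=1), about 0.72
adjusted_1 = backdoor_adjust(JOINT, x=1)             # about 0.60
adjusted_0 = backdoor_adjust(JOINT, x=0)             # about 0.30
print(naive_1, adjusted_1, adjusted_1 - adjusted_0)
```

The formula requires every stratum $(X=x, Z=z)$ to have positive probability; otherwise the conditional $P(Y|X=x,Z=z)$ is undefined and the division fails.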
Front-door criterion
If backdoor adjustment is impossible because the confounder is hidden ($U$: $X \leftarrow U \rightarrow Y$) but there is a mediator $M$ on the path $X \to M \to Y$ with no arrow from $U$ to $M$, then $P(Y|\mathrm{do}(X=x)) = \sum_m P(M=m|X=x)\sum_{x'} P(Y|X=x', M=m)P(X=x')$. The inner sum runs over a separate index $x'$, not over the intervened value $x$.
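To see that the front-door formula recovers the interventional distribution without observing $U$, the sketch below builds a hypothetical binary model $U \to X$, $U \to Y$, $X \to M$, $M \to Y$, marginalizes out $U$, and applies the formula to the remaining table $P(X, M, Y)$. All probabilities and function names are assumptions chosen for illustration.

```python
from itertools import product


def bern(p1, value):
    return p1 if value == 1 else 1.0 - p1


def full_joint():
    """P(U, X, M, Y) for the assumed mechanisms; U is treated as hidden."""
    table = {}
    for u, x, m, y in product((0, 1), repeat=4):
        table[(u, x, m, y)] = (
            bern(0.5, u)
            * bern(0.8 if u else 0.2, x)          # P(X | U)
            * bern(0.9 if x else 0.1, m)          # P(M | X), no arrow U -> M
            * bern(0.1 + 0.5 * m + 0.3 * u, y)    # P(Y | M, U), no arrow X -> Y
        )
    return table


def observed_joint(full):
    """Marginalize out the hidden U, leaving P(X, M, Y)."""
    obs = {}
    for (u, x, m, y), p in full.items():
        obs[(x, m, y)] = obs.get((x, m, y), 0.0) + p
    return obs


def prob(obs, x=None, m=None, y=None):
    return sum(
        p for (xv, mv, yv), p in obs.items()
        if (x is None or xv == x) and (m is None or mv == m) and (y is None or yv == y)
    )


def front_door(obs, x, y=1):
    """sum_m P(m | x) * sum_x' P(y | x', m) P(x'), from observed data only."""
    total = 0.0
    for m in (0, 1):
        p_m_given_x = prob(obs, x=x, m=m) / prob(obs, x=x)
        inner = sum(
            prob(obs, x=x2, m=m, y=y) / prob(obs, x=x2, m=m) * prob(obs, x=x2)
            for x2 in (0, 1)
        )
        total += p_m_given_x * inner
    return total


obs = observed_joint(full_joint())
estimate = front_door(obs, x=1)                    # about 0.70
naive = prob(obs, x=1, y=1) / prob(obs, x=1)       # P(Y=1 | X=1), about 0.79
print(estimate, naive)
```

For these mechanisms the true value of $P(Y=1|\mathrm{do}(X=1))$, computed with $U$ known, is 0.70, which matches the front-door estimate; the naive conditional does not.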
Summary
- SCM = DAG + structural equations $X_i = f_i(\mathrm{Pa}_i, U_i)$; arrows encode causality, not correlation
- The do-operator $\mathrm{do}(X=x)$ models intervention: remove incoming arrows to $X$
- Backdoor and front-door criteria are graphical conditions for identifying $P(Y|\mathrm{do}(X))$ from observational data
Connections to other topics
SCMs also support counterfactual analysis through a three-step procedure: abduction (update $U$ given the observation), action (apply $\mathrm{do}$), and prediction (compute $Y$ in the modified model). This allows answering 'what would have happened if...' questions at the level of individual subjects.
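The three steps are easiest to follow in a model where the noise can be recovered exactly. The sketch below assumes a hypothetical linear SCM $Y = \beta X + U_Y$ with known $\beta$; the function name and the default coefficient are illustrative only.

```python
def counterfactual_y(x_obs, y_obs, x_cf, beta=2.0):
    """Counterfactual Y for one unit in the assumed SCM Y = beta * X + U_Y."""
    # 1. Abduction: recover this unit's noise from what was observed.
    u_y = y_obs - beta * x_obs
    # 2. Action: replace X's mechanism by the constant x_cf, i.e. do(X = x_cf).
    x = x_cf
    # 3. Prediction: recompute Y in the modified model, keeping the same noise.
    return beta * x + u_y


# A unit observed with X=1, Y=3: what would Y have been had X been 0?
print(counterfactual_y(x_obs=1.0, y_obs=3.0, x_cf=0.0))  # 1.0
```

In non-invertible or non-deterministic models the abduction step yields a posterior distribution over $U$ rather than a single value, and the prediction is then an expectation under that posterior.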
Questions for reflection
- Why can $P(Y|X=x)$ differ from $P(Y|\mathrm{do}(X=x))$? Give an example where they coincide and one where they diverge.
- D-separation in a DAG determines conditional independence. How does this relate to the backdoor criterion?
- Can you identify a causal effect in a DAG with hidden confounders? Under what conditions?