Optimal Transport

Unbalanced OT and Biology

Stable Diffusion generates images via flow matching, which is optimal transport in continuous time. Waddington OT reconstructed the full cell genealogical tree of reprogramming from 65000 single-cell measurements. Unbalanced OT is the tool when classical OT fails: when mass is not conserved, when cells divide and die, when sentences have different lengths.

  • **Waddington OT (Cell 2019):** 65000 cells, 40 time points, reconstruction of iPSC reprogramming trajectories. The largest OT study in biology.
  • **MOSCOT (2023):** Theislab Helmholtz Munich: multi-omics alignment via OT. Matching RNA + protein + ATAC measurements from different cells.
  • **PyTorch Geometric + POT:** Sinkhorn Divergence as a differentiable loss for 3D point clouds, standard in shape analysis and molecule generation.

When Masses Do Not Match: Cells Die and Divide

Stable Diffusion generates 10M+ images per day via flow matching, which is OT in continuous time. Classical OT requires equal total mass. Unbalanced OT removes this restriction through KL soft marginal constraints: cells divide and die, sentences have different lengths - all of these are unbalanced transport.

Classical OT: min over P >= 0 of <C, P> subject to P * 1 = mu, P^T * 1 = nu (hard boundary conditions). Unbalanced OT: min over P >= 0 of <C, P> + tau * KL(P * 1 | mu) + tau * KL(P^T * 1 | nu). The parameter tau > 0 controls 'hardness': as tau goes to infinity, classical OT is recovered.

**Unbalanced OT:** UOT(mu, nu) = min over P >= 0 of [sum_{ij} C_{ij} * P_{ij} + tau * KL(P_{i+} | mu_i) + tau * KL(P_{+j} | nu_j)] where KL(a | b) = sum_i a_i * log(a_i / b_i) - a_i + b_i **Sinkhorn version with regularization epsilon:** P = diag(u) * exp(-C / epsilon) * diag(v) Updates (generalized Sinkhorn): u_i <- (mu_i / a_i)^(tau / (tau + epsilon)) where a_i = sum_j K_{ij} * v_j v_j <- (nu_j / b_j)^(tau / (tau + epsilon)) where b_j = sum_i K_{ij} * u_i As tau -> infinity: (.)^(tau/(tau+epsilon)) -> (.)^1, the ordinary Sinkhorn. At tau = epsilon: (.)^0.5, a soft constraint.

In single-cell RNA sequencing, cells at the start of an experiment (t = 0) and at the end (t = T) are different cells. You cannot track a specific cell. Unbalanced OT reconstructs probabilistic trajectories: with what probability does a type-A cell at t = 0 become type B at t = T? Cell division (proliferation) and apoptosis are accounted for by allowing imbalance.

The parameter tau in unbalanced OT: what happens to the transport plan as tau -> 0 and tau -> infinity?

tau penalizes deviation from the marginals via KL divergence with weight 1/tau. As tau -> 0 the penalty becomes infinite and the constraints become hard (classical OT). As tau -> infinity the penalty vanishes and mass is freely created or destroyed.

Sinkhorn Divergence: A Fair Metric Without Bias

Regularized OT with parameter epsilon has a well-known flaw: W_epsilon(mu, mu) > 0. A distribution does not have zero distance to itself. This is entropic bias and it breaks the use of OT as a loss in ML. Sinkhorn Divergence fixes it.

**Sinkhorn Divergence:** S_epsilon(mu, nu) = OT_epsilon(mu, nu) - (1/2) * OT_epsilon(mu, mu) - (1/2) * OT_epsilon(nu, nu) Properties: - S_epsilon(mu, mu) = 0 (no bias) - S_epsilon(mu, nu) >= 0 (non-negative) - S_epsilon(mu, nu) = 0 if and only if mu = nu - Differentiable in mu, so backprop works Sits between W_1 (Wasserstein without regularization) and MMD (Maximum Mean Discrepancy). As epsilon -> 0: S_epsilon -> W (classical Wasserstein) As epsilon -> infinity: S_epsilon -> MMD with kernel k(x, y) = -C(x, y) Applications: geom_loss in Python, loss for 3D point clouds in shape analysis, GAN training.

Why is OT_epsilon(mu, mu) > 0 a problem, and how does Sinkhorn Divergence fix it?

The regularizer epsilon * H(P) minimizes the entropy term even when mu = nu. Sinkhorn Divergence subtracts the self-terms and produces a symmetric function that is zero on the diagonal.

scRNA-seq: Optimal Transport for Cell Trajectories

**Waddington OT (Schiebinger et al., Cell 2019):** cell reprogramming traced via OT. 65000 single cells across 40 time points. Unbalanced OT reconstructed the full cell 'genealogical tree' of fibroblast reprogramming into iPSCs. A fundamentally new method in developmental biology.

**OT applications in biology and chemistry:** - Waddington OT: cell trajectories in single-cell data (Schiebinger 2019) - MOSCOT: multi-omics single-cell OT, Theislab (Helmholtz Munich, 2023) - FoldingDiff: generating protein structures via OT flow matching - Drug discovery: OT for aligning molecular fingerprints

Why is unbalanced OT used for cell trajectories in scRNA-seq?

Classical OT requires mass conservation: each cell at t_0 must map to a cell at t_1. But cells divide (1 -> 2), differentiate, or undergo apoptosis. Unbalanced OT models this through penalty terms.

Key ideas

  • **Unbalanced OT:** min_P [<C, P> + tau * KL(row marginals | mu) + tau * KL(col marginals | nu)]. Parameter tau controls constraint softness.
  • **Generalized Sinkhorn:** u <- (mu / Kv)^rho, v <- (nu / K^T u)^rho, where rho = tau / (tau + epsilon). As tau -> infinity rho -> 1 and we recover classical Sinkhorn.
  • **Sinkhorn Divergence:** S_eps = OT_eps(mu, nu) - OT_eps(mu, mu) / 2 - OT_eps(nu, nu) / 2. Removes entropic bias and is differentiable.
  • **scRNA-seq:** unbalanced OT reconstructs cell trajectories with proliferation and apoptosis.
  • **tau -> 0:** only transport without respecting marginals. **tau -> infinity:** strict marginals, classical OT.

Related topics

Unbalanced OT in the broader theory:

  • Sinkhorn algorithm — Unbalanced Sinkhorn is a direct generalization of the classical one
  • Gromov-Wasserstein — Next lesson: OT between spaces of different nature

Связанные уроки

  • ot-04-sinkhorn — The Sinkhorn algorithm generalizes to the unbalanced case
  • ot-13-schrodinger — Both use KL regularization, with different meanings
  • ot-14-gradient-flows — Unbalanced OT is a gradient flow under a generalized metric
Unbalanced OT and Biology

0

1

Sign In