Game Theory

Evolutionary Game Theory

Why do both aggressive and peaceful strategies coexist in nature, with neither driving out the other? Why does altruism evolve at all if selfishness pays better? Why don't trading algorithms on a market converge to a single strategy? Evolutionary game theory answers all of this without assuming rationality: successful strategies just reproduce more.

  • **Behavioral biology:** aggression, territoriality, mating displays, altruism - all predicted by ESS without assuming "conscious choice" in animals
  • **Multi-agent RL:** training multiple RL agents in a shared environment is a non-stationary problem. Understanding replicator dynamics helps design algorithms that actually converge
  • **Financial markets:** the population of trading strategies (momentum, value, arbitrage) follows evolutionary dynamics - profitable ones get copied, losers vanish

Предварительные знания

  • Auction Theory

Evolutionarily Stable Strategy (ESS)

Consider a population of birds. Most are "doves" (peaceful). A mutant "hawk" (aggressive) appears. Can it take over? It depends on whether being a hawk is profitable in a world of doves, and being a dove is profitable in a world of hawks. An ESS (Evolutionarily Stable Strategy) is a strategy that mutants cannot invade: if most of the population plays ESS, a rare mutant loses.

Strategy I is an ESS if for every mutant strategy J ≠ I, at least one holds: 1. u(I, I) > u(J, I) - I beats J against a population of I-players 2. u(I, I) = u(J, I) AND u(I, J) > u(J, J) - tie against I, but I beats J against J ESS ⊆ NE: every ESS is a Nash Equilibrium, but not every NE is an ESS.

In the Hawk-Dove game, neither pure Hawk nor pure Dove is an ESS. The ESS is a mixed strategy: hawk fraction p* = v/c. With v=4, c=6: p* = 2/3. This is **stable polymorphism**: hawks and doves coexist in the population. Evolution "discovers" the same equilibrium as rational calculation!

John Maynard Smith and the revolution in biology

John Maynard Smith and George Price introduced ESS in 1973 as a way to analyze animal behavior without assuming rationality. Instead of "agents choose strategies," strategies reproduce or die out depending on success. This unified game theory and evolutionary biology, transforming the understanding of ritual conflicts, mating displays, and cooperation.

Evolutionary game theory only applies to biology

ESS and replicator dynamics apply to economics (evolution of norms), CS (reinforcement learning), sociology (cultural practices), and finance (trading strategies).

Evolutionary dynamics describe any process where successful strategies "reproduce." This includes firms (survivors copy profitable ones), algorithms (RL), and cultures (imitation).

Strategy I is an ESS when:

Replicator Dynamics

ESS indicates where the system will settle. Replicator dynamics describes how it gets there. The idea is simple: a strategy reproduces faster if its fitness exceeds the population average. This gives a differential equation for the share of each strategy - the evolutionary analog of gradient ascent.

For strategies {1,...,k} with shares x = (x₁,...,xₖ): dxᵢ/dt = xᵢ · [f(i, x) - φ(x)] where f(i, x) = Σⱼ xⱼ · u(i, j) - fitness of strategy i, φ(x) = Σᵢ xᵢ · f(i, x) - average fitness. Property: equilibria of the replicator = Nash Equilibria. ESS = asymptotically stable fixed points.

Replicator dynamics is a continuous approximation of evolutionary selection. It has a direct connection to machine learning algorithms: Multiplicative Weights Update (MWU) and several reinforcement learning algorithms are discrete versions of replicator dynamics. Online learning "rediscovered" game theory from the other direction.

Replicator dynamics always converges to an ESS

Replicator dynamics converges to Nash Equilibria, not necessarily to an ESS. Some NE are unstable fixed points; dynamics diverges away from them.

ESS are the asymptotically stable equilibria of the replicator. A stable NE that isn't an ESS will only be reached if the starting state is exactly at that point.

What does the replicator equation dxᵢ/dt = xᵢ · [f(i,x) - φ(x)] describe?

Hawk-Dove Game and Polymorphism

The Hawk-Dove game is the central example of evolutionary game theory. A hawk always attacks and risks injury. A dove retreats from a hawk, but peacefully shares a resource with another dove. When the resource is worth more than the risk of conflict, only hawks; when conflict is too costly, only doves. Usually, both coexist in equilibrium.

Payoff matrix (for one player): Resource v, injury cost c H D H (v-c)/2 v D 0 v/2 If v < c (conflict is costly): the only ESS is a mixed strategy p* = v/c hawks. If v > c (conflict is cheap): ESS = all hawks (AllH). At p*, hawk fitness equals dove fitness - marginal selection is zero.

Polymorphism (coexistence of strategies) is one of the key predictions of evolutionary game theory. In biology: different mating strategies in male fish (aggressive alpha and "sneaker" beta). In behavioral economics: varying levels of trust in economic experiments. One strategy rarely "wins" - stable polymorphism is common.

Evolution always leads to the best outcome for the group

Evolution optimizes individual fitness, not group welfare. The ESS in Hawk-Dove is not socially optimal: all doves would yield a higher total payoff.

Group selection (selection at the group level) is weak compared to individual selection. That's why altruism, cooperation, and reduced competition require special mechanisms (kin selection, reputation).

In the Hawk-Dove game with v < c (conflict costs more than the resource), what is the ESS?

Population Dynamics and Multi-Agent Systems

Evolutionary game theory describes more than biological populations. Any system where strategies "reproduce" based on success - a market of firms, a pool of trading algorithms, A/B tests in a product, reinforcement learning - follows the same dynamics. This makes ESS and replicator equations useful tools for analyzing multi-agent systems.

Multiplicative Weights Update (MWU): wᵢ(t+1) = wᵢ(t) · exp(η · rᵢ(t)) This is a discrete version of replicator dynamics. As η → 0, continuous time → replicator equation. Q-learning, Cross learning, and many online algorithms are variants of this dynamic. In multi-agent systems, their interactions are analyzed through evolutionary game theory.

DomainESS AnalogReplicator Dynamics
BiologyStable phenotypesGenetic drift + selection
EconomicsDominant market strategiesImitating profitable firms
ML (Multi-Agent RL)Nash policy profilesPolicy gradient, Q-learning
CultureStable behavioral normsSpread through imitation

The key question for multi-agent systems: does the population converge to an ESS, cycle periodically, or behave chaotically? In zero-sum games (Rock-Paper-Scissors), replicator dynamics produces persistent cycles. In coordination games, it converges quickly to an ESS. This matters for designing RL algorithms in multi-agent environments.

In multi-agent learning, algorithms always converge to Nash Equilibrium

Most standard RL algorithms in multi-agent games do NOT converge to NE - they cycle, diverge, or converge to other points.

Only special algorithms (CFR, modified MCTS, PSRO) guarantee convergence to NE. Standard Q-learning or Policy Gradient in a multi-agent environment treats it as a non-stationary environment, with no convergence guarantees.

Multiplicative Weights Update (MWU) in machine learning is:

Key Ideas

  • **ESS:** a strategy resistant to mutant invasion - an evolutionary refinement of NE that's strictly stronger
  • **Replicator dynamics:** dx/dt = x·(f - φ) - successful strategies grow faster than average
  • **Hawk-Dove game:** ESS is polymorphism p* = v/c hawks; evolution doesn't reach the social optimum
  • **MWU = replicator dynamics:** online ML algorithms are discrete versions of evolutionary dynamics

Related Topics

Evolutionary game theory connects biology, ML, and economics:

  • Prisoner's Dilemma — Evolution of cooperation through repeated games and replicator dynamics
  • Algorithmic Game Theory — Price of Anarchy in systems with evolving agents
  • Game Theory in ML — GAN training and multi-agent RL through the lens of evolutionary theory

Вопросы для размышления

  • If trading algorithms follow replicator dynamics (profitable ones get copied), what should the market converge to in theory? Why doesn't that happen in practice?
  • Engineering teams also undergo "evolution of practices": successful engineers copy the approaches of successful colleagues. Is this replicator dynamics? What would the ESS look like?
  • Multi-agent RL is often unstable and fails to converge to NE. Which architectural choices (centralized training, communication, shared reward) help? How does ESS connect to this?

Связанные уроки

  • prob-17
Evolutionary Game Theory

0

1

Sign In