Bayesian Statistics

'Candidate X has a 70% chance of winning the election' - this is a Bayesian statement. Frequentist statistics can't make such claims: an election happens once, there's no 'limiting frequency.' Bayesian statistics quantifies uncertainty where the frequentist approach falls silent.

  • Netflix uses Bayesian models for personalization.
  • Spam filtering with Naive Bayes was one of the first widely deployed applications.
  • Medical diagnosis, financial risk models, autonomous vehicles - anywhere uncertainty needs to be expressed quantitatively.

Prerequisites

  • Hypothesis Testing: How p-values Killed 64,000 Studies

Bayesian vs. Frequentist Approach

**Two views on probability:** Frequentist: probability = limiting frequency over infinite repetitions. Cannot talk about the 'probability of a hypothesis' (it is either true or not). Bayesian: probability = degree of belief in the truth of a statement, updated as data arrive.

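**The counterintuitive medical test result!** Even with a good test (95% sensitivity), for a rare disease (1% prevalence) a positive result means only about a 9% probability of actually having the disease. This figure also requires the false-positive rate, taken here to be about 10%: of 1,000 people tested, 10 are sick and roughly 9.5 of them test positive, while about 99 of the 990 healthy people also test positive, so 9.5 / (9.5 + 99) ≈ 9%. That's why confirmatory tests are recommended. Base rate neglect, the failure to account for the prior, is a classic cognitive bias.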
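The arithmetic behind the medical test result can be checked with a few lines of Python. This is a minimal sketch: the function name `posterior_given_positive` is made up for illustration, and the 10% false-positive rate is an assumption (the text states only sensitivity and prevalence), chosen because it reproduces the roughly 9% figure.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prevalence                 # P(+ | sick) * P(sick)
    false_pos = false_positive_rate * (1 - prevalence)  # P(+ | healthy) * P(healthy)
    return true_pos / (true_pos + false_pos)

# The 10% false-positive rate is an assumed value, not given in the text.
p = posterior_given_positive(prevalence=0.01, sensitivity=0.95,
                             false_positive_rate=0.10)
print(f"P(disease | positive) = {p:.3f}")  # prints: P(disease | positive) = 0.088
```

Raising the prevalence argument shows how strongly the answer depends on the prior: the same test applied to a population where the disease is common gives a much higher posterior.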

In the Bayesian approach, what is the 'prior'?

Bayesian Updating: From Prior to Posterior

**Bayes' Theorem:** P(θ|X) = P(X|θ) × P(θ) / P(X), where θ is the parameter or hypothesis, X is the data, P(θ) is the prior, P(X|θ) is the likelihood, P(θ|X) is the posterior, and P(X) is the normalizing constant (the marginal probability of the data). Each posterior can serve as the prior for the next batch of data; this repeated updating is the central strength of the Bayesian approach.
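Repeated updating can be illustrated on a small discrete grid, where the theorem reduces to multiply-and-normalize. The sketch below is illustrative only: the coin setup, the uniform prior, the 11-point grid, and the flip sequence are all assumptions made for the example.

```python
def update(prior, likelihood):
    """One Bayes update on a discrete grid: posterior ∝ likelihood × prior."""
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(unnormalized)  # P(X), the normalizing constant
    return [u / evidence for u in unnormalized]

# Hypothetical setup: theta is a coin's probability of heads on a coarse grid.
thetas = [i / 10 for i in range(11)]       # 0.0, 0.1, ..., 1.0
belief = [1 / len(thetas)] * len(thetas)   # uniform prior (an assumption)

for flip in "HHTHH":                       # made-up observations
    likelihood = [t if flip == "H" else 1 - t for t in thetas]
    belief = update(belief, likelihood)    # the old posterior becomes the new prior

best = max(zip(belief, thetas))[1]
print(f"most probable theta on the grid: {best}")
# prints: most probable theta on the grid: 0.8
```

Updating one flip at a time gives the same result as a single update with all five flips, which is why the approach suits data that arrive incrementally.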

**Conjugate priors:** a prior is conjugate to a likelihood when the posterior belongs to the same distribution family as the prior. Examples (prior/likelihood): Beta/Binomial, Normal/Normal (with known variance), Gamma/Poisson. Conjugacy yields an analytical posterior with no need for MCMC. In production systems, it is used for online updating (A/B tests, recommender systems).

Prior: Beta(2, 2) for a coin's p. We observe 7 heads and 3 tails. What is the posterior?
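The question above can be worked through with the Beta/Binomial conjugate rule: a Beta(α, β) prior combined with h heads and t tails gives a Beta(α + h, β + t) posterior. A minimal sketch follows; the helper name `beta_binomial_update` is made up for illustration.

```python
def beta_binomial_update(alpha, beta, heads, tails):
    """Conjugate update: Beta(alpha, beta) prior + coin flips -> Beta posterior."""
    return alpha + heads, beta + tails

alpha_post, beta_post = beta_binomial_update(2, 2, heads=7, tails=3)
posterior_mean = alpha_post / (alpha_post + beta_post)  # mean of Beta(a, b) is a / (a + b)
print(f"posterior: Beta({alpha_post}, {beta_post}), mean = {posterior_mean:.3f}")
# prints: posterior: Beta(9, 5), mean = 0.643
```

The posterior mean (about 0.643) sits between the prior mean (0.5) and the observed frequency (0.7), which is the usual behavior of a weak prior combined with a small sample.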

Bayesian Inference in Practice: MCMC and A/B Tests

For complex models, an analytical posterior is usually out of reach. **MCMC (Markov Chain Monte Carlo)** is a family of algorithms that draw samples from the posterior without computing the normalizing constant P(X); the samples are then used to estimate means, intervals, and other summaries. PyMC and Stan are two widely used tools. For A/B tests, the Bayesian approach answers the question directly, for example as P(B > A), without p-values.
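To make the idea concrete, here is a hand-written random-walk Metropolis sampler, one of the simplest MCMC algorithms. It is a teaching sketch using only the standard library, not how PyMC or Stan work internally (both use more advanced samplers). The target, a coin with a Beta(2, 2) prior and 7 heads and 3 tails, along with the step size, burn-in length, and seed, are assumptions chosen for illustration.

```python
import math
import random

def log_unnormalized_posterior(theta, heads=7, tails=3, a=2, b=2):
    """log(likelihood × prior) for a coin; the constant P(X) is never needed."""
    if not 0 < theta < 1:
        return -math.inf  # zero posterior density outside (0, 1)
    return (heads + a - 1) * math.log(theta) + (tails + b - 1) * math.log(1 - theta)

def metropolis(n_samples, step=0.2, seed=0):
    """Random-walk Metropolis: propose a nearby value, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    theta = 0.5  # arbitrary starting point
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0, step)
        log_ratio = (log_unnormalized_posterior(proposal)
                     - log_unnormalized_posterior(theta))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        samples.append(theta)  # a rejected proposal repeats the current value
    return samples

samples = metropolis(20_000)[2_000:]  # discard the first 2,000 draws as burn-in
print(f"posterior mean ≈ {sum(samples) / len(samples):.2f}")
```

Because this particular posterior is known analytically (Beta(9, 5), mean 9/14 ≈ 0.643), the sampler's estimate can be checked against it. In real use, MCMC is reserved for models where no such closed form exists.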

**Credible interval vs Confidence interval:** a 95% credible interval [a, b] means 'given the data and the model, there is a 95% probability that the true parameter lies in [a, b]' - which is what most people naively assume a confidence interval means! A 95% confidence interval means: 'if the experiment were repeated many times and an interval computed each time, about 95% of those intervals would contain the true value.' Bayesian inference gives the more intuitive interpretation.
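A credible interval can be read directly off posterior samples. The sketch below computes an equal-tailed 95% interval for a hypothetical Beta(9, 5) posterior (what a Beta(2, 2) prior gives after 7 heads and 3 tails); the number of draws and the seed are arbitrary choices.

```python
import random

rng = random.Random(42)

# Draw from the posterior and sort so that quantiles can be read by index.
draws = sorted(rng.betavariate(9, 5) for _ in range(100_000))

lower = draws[int(0.025 * len(draws))]  # 2.5% quantile
upper = draws[int(0.975 * len(draws))]  # 97.5% quantile
print(f"95% credible interval for p: [{lower:.2f}, {upper:.2f}]")
```

The statement "p lies between `lower` and `upper` with 95% probability" is valid here because the probability refers to the posterior distribution of p itself, conditional on the prior and the observed data.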

A Bayesian A/B test shows P(B > A) = 0.92. What does this mean?
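A quantity like P(B > A) is typically estimated by sampling from each variant's posterior and counting how often B's draw exceeds A's. The sketch below assumes uniform Beta(1, 1) priors and uses made-up conversion counts; the function name `prob_b_beats_a` is hypothetical.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=50_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each variant: Beta(1 + conversions, 1 + non-conversions)
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

# Made-up data: 120/1000 conversions for A, 140/1000 for B.
p_b_better = prob_b_beats_a(120, 1000, 140, 1000)
print(f"P(B > A) ≈ {p_b_better:.2f}")
```

A result such as 0.92 reads as: given the data and the priors, there is a 92% probability that B's true conversion rate is higher than A's. It says nothing about how much higher, so the size of the difference is worth examining as well.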

Key Ideas

  • Bayesian approach: probability = degree of belief, updated as data arrive
  • Bayes' Theorem: P(θ|X) ∝ P(X|θ) × P(θ), that is, posterior ∝ likelihood × prior
  • Prior → Posterior: each observation refines beliefs
  • Conjugate priors (Beta/Binomial, Normal/Normal) yield analytical posteriors
  • Credible interval: P(θ ∈ [a,b] | data) = 0.95, a direct probabilistic interpretation
  • MCMC (PyMC, Stan) - for complex models without analytical posteriors

What's Next

Non-parametric tests are an alternative for data that violate the assumptions of parametric methods. Bayesian non-parametric models (e.g., Gaussian processes) combine both approaches.

  • Non-Parametric Tests: non-parametric methods make no assumption about the form of the distribution; many of them work with ranks instead of raw values

Questions for Reflection

  • How is a prior chosen when expert knowledge about a parameter is available? How does this change when a lot of data is available?
  • Why is 'base rate neglect' (ignoring the prior) so common? Give a real-life example where ignoring the base rate leads to incorrect conclusions.
  • Compare the interpretation of a 95% confidence interval and a 95% credible interval. Why is the latter more intuitive for most people?

Related Lessons

  • aie-36-fine-tuning