Section 1.5: Bayes' Theorem

Update probabilities with new evidence — the foundation of statistical inference.

Law of Total Probability (Review)

Before Bayes' Theorem, we need the Law of Total Probability. If B₁, B₂, ..., Bₖ partition the sample space:

P(A) = Σᵢ P(Bᵢ) × P(A|Bᵢ)

Two-Event Case

P(A) = P(B) × P(A|B) + P(B') × P(A|B')

Most Bayes problems use this two-scenario version.
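The Law of Total Probability can be sketched in a few lines of Python. This is only an illustration: the function name total_probability and the numbers in the usage lines are assumptions made for this example, not values from the section.

```python
def total_probability(priors, likelihoods):
    """Return P(A) = sum_i P(B_i) * P(A|B_i) for a partition B_1..B_k.

    priors[i] is P(B_i); likelihoods[i] is P(A|B_i).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood per partition event")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))


# Two-event case with made-up numbers: P(B) = 0.3, P(A|B) = 0.5, P(A|B') = 0.2
p_a = total_probability([0.3, 0.7], [0.5, 0.2])
print(p_a)  # approximately 0.29 (0.15 + 0.14)
```

The sum-to-one check is there because the formula only holds when the Bᵢ really do partition the sample space.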

Bayes' Theorem

Bayes' Theorem tells us how to "reverse" conditional probabilities:

P(A|B) = P(A) × P(B|A) / P(B)

Posterior = Prior × Likelihood / Evidence

Expanded Form

P(A|B) = P(A) × P(B|A) / [P(A) × P(B|A) + P(A') × P(B|A')]

Terminology

Prior: P(A) before evidence
Likelihood: P(B|A)
Evidence: P(B)
Posterior: P(A|B) after evidence

Classic Example: Medical Testing

Problem: A disease affects 1% of the population. A test has 95% sensitivity (correctly identifies sick patients) and 90% specificity (correctly identifies healthy patients). If a person tests positive, what's the probability they have the disease?

Given Information

P(D) = 0.01

P(D') = 0.99

P(+|D) = 0.95

P(+|D') = 1 − 0.90 = 0.10 (false-positive rate = 1 − specificity)

Solution

Step 1: Find P(+) using Total Probability

P(+) = P(D)P(+|D) + P(D')P(+|D')

= 0.01(0.95) + 0.99(0.10) = 0.0095 + 0.099 = 0.1085

Step 2: Apply Bayes' Theorem

P(D|+) = P(D)P(+|D) / P(+)

= 0.01(0.95) / 0.1085 = 0.0095 / 0.1085 ≈ 0.0876
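The two steps above can be checked numerically. The sketch below assumes a helper named bayes_posterior (a name chosen for this example) that implements the expanded two-event form; the inputs are the prevalence, sensitivity, and specificity from the problem statement.

```python
def bayes_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) using the expanded two-event form of Bayes' Theorem."""
    # Step 1: evidence P(E) via the Law of Total Probability
    evidence = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    # Step 2: posterior = prior * likelihood / evidence
    return prior * p_e_given_h / evidence


prevalence = 0.01    # P(D)
sensitivity = 0.95   # P(+|D)
specificity = 0.90   # P(-|D'), so P(+|D') = 1 - specificity

p_d_given_pos = bayes_posterior(prevalence, sensitivity, 1 - specificity)
print(round(p_d_given_pos, 4))  # 0.0876
```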

The Surprising Result

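Even though the test detects 95% of true cases, a positive result means only about an 8.8% chance of disease. The low base rate drives this: because the disease is rare, the false positives among the healthy 99% (0.99 × 0.10 = 0.099) far outnumber the true positives (0.01 × 0.95 = 0.0095), so most positive tests are false positives. Overlooking this effect is known as the base rate fallacy.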

Multiple Hypotheses

Bayes' Theorem generalizes to multiple mutually exclusive and exhaustive hypotheses H₁, H₂, ..., Hₖ:

P(Hᵢ|E) = P(Hᵢ) × P(E|Hᵢ) / [Σⱼ P(Hⱼ) × P(E|Hⱼ)]

Example: Three Factories

Factory A produces 50%, B produces 30%, C produces 20% of items. Defect rates: A=2%, B=3%, C=5%. Given a defective item, which factory most likely made it?

P(A|D) ∝ 0.50 × 0.02 = 0.010
P(B|D) ∝ 0.30 × 0.03 = 0.009
P(C|D) ∝ 0.20 × 0.05 = 0.010
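Dividing each by the total, P(D) = 0.010 + 0.009 + 0.010 = 0.029, gives P(A|D) ≈ 0.345, P(B|D) ≈ 0.310, P(C|D) ≈ 0.345.
Most likely: Factory A or C (tied)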
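The factory example can be reproduced by computing each prior × likelihood product and normalizing by their sum. This is a minimal sketch; the dictionary names are illustrative choices, and the numbers are the ones given in the example.

```python
shares = {"A": 0.50, "B": 0.30, "C": 0.20}        # P(factory)
defect_rates = {"A": 0.02, "B": 0.03, "C": 0.05}  # P(defect | factory)

# Unnormalized posteriors: prior x likelihood for each factory
joint = {f: shares[f] * defect_rates[f] for f in shares}

# Normalizing constant P(defect), from the Law of Total Probability
p_defect = sum(joint.values())

posterior = {f: joint[f] / p_defect for f in joint}
for factory, p in posterior.items():
    print(factory, round(p, 3))
```

Note that the A and C products are equal on paper but can differ in the last floating-point digit, so a tie is better detected by comparing rounded values (or using a tolerance) than by calling max() on the raw results.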

Common Mistakes to Avoid

Confusing P(A|B) with P(B|A)

P(Disease|Positive) ≠ P(Positive|Disease). The test's sensitivity (95%) is P(Positive|Disease); it is NOT the probability of having the disease given a positive test.

Ignoring the Base Rate

The prior probability P(A) is crucial. For rare events, even highly accurate tests can have many false positives.
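To see how strongly the prior matters, the sketch below holds the test from the medical example fixed (95% sensitivity, 90% specificity) and varies only the prevalence. The function name ppv (short for positive predictive value, another name for P(Disease|Positive)) and the list of prevalences are assumptions made for illustration.

```python
def ppv(prevalence, sensitivity, specificity):
    """Return P(Disease | Positive) for a given base rate and test quality."""
    true_pos = prevalence * sensitivity               # P(D) * P(+|D)
    false_pos = (1 - prevalence) * (1 - specificity)  # P(D') * P(+|D')
    return true_pos / (true_pos + false_pos)


# Same test, different base rates (illustrative values)
for prevalence in (0.001, 0.01, 0.1, 0.5):
    print(prevalence, round(ppv(prevalence, 0.95, 0.90), 3))
```

With the same test, the posterior rises from under 1% at a prevalence of 0.001 to roughly 90% at a prevalence of 0.5.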

Forgetting the Denominator

P(B) = P(A)P(B|A) + P(A')P(B|A'). This total probability is essential for Bayes' Theorem.

Quick Reference

Bayes: P(A|B) = P(A)P(B|A)/P(B)

Total Prob: P(B) = ΣP(Aᵢ)P(B|Aᵢ)

Two-event: P(B) = P(A)P(B|A) + P(A')P(B|A')

Proportional form: Posterior ∝ Prior × Likelihood

Odds form: Posterior odds = Prior odds × Likelihood ratio
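As a hedged illustration of the odds form, Bayes' Theorem can be written as posterior odds = prior odds × likelihood ratio, where the likelihood ratio is P(B|A) / P(B|A'). The sketch below applies this to the medical testing example; the function name posterior_from_odds is an assumption made for this example.

```python
def posterior_from_odds(prior, likelihood_ratio):
    """Update a prior probability using the odds form of Bayes' Theorem."""
    prior_odds = prior / (1 - prior)                # P(A) / P(A')
    posterior_odds = prior_odds * likelihood_ratio  # multiply by P(B|A) / P(B|A')
    return posterior_odds / (1 + posterior_odds)    # convert odds back to probability


# Medical test: P(+|D) = 0.95, P(+|D') = 0.10, P(D) = 0.01
likelihood_ratio = 0.95 / 0.10
p_disease = posterior_from_odds(0.01, likelihood_ratio)
print(round(p_disease, 4))  # 0.0876
```

This gives the same posterior as the expanded form, without computing P(+) explicitly.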