Section 1.5: Bayes' Theorem
Update probabilities with new evidence — the foundation of statistical inference.
Law of Total Probability (Review)
Before Bayes' Theorem, we need the Law of Total Probability. If B₁, B₂, ..., Bₖ partition the sample space:
P(A) = P(B₁) × P(A|B₁) + P(B₂) × P(A|B₂) + ... + P(Bₖ) × P(A|Bₖ)
Two-Event Case
P(A) = P(B) × P(A|B) + P(B') × P(A|B')
Most Bayes problems use this two-scenario version.
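The sum above is easy to check numerically. The following is a minimal Python sketch, not part of the original lesson; the function name `total_probability` and the numbers in the usage line are illustrative assumptions.

```python
def total_probability(partition_probs, conditionals):
    """Return P(A) = sum of P(B_i) * P(A|B_i) over a partition B_1..B_k.

    partition_probs: [P(B_1), ..., P(B_k)], which must sum to 1
    conditionals:    [P(A|B_1), ..., P(A|B_k)] in the same order
    """
    if len(partition_probs) != len(conditionals):
        raise ValueError("need one conditional probability per partition event")
    if abs(sum(partition_probs) - 1.0) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    return sum(p * c for p, c in zip(partition_probs, conditionals))


# Two-event case with made-up numbers: P(B) = 0.3, P(A|B) = 0.5, P(A|B') = 0.2
p_a = total_probability([0.3, 0.7], [0.5, 0.2])
print(round(p_a, 4))  # 0.29
```

The check that the partition sums to 1 catches the most common slip in these problems: forgetting one of the scenarios.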
Bayes' Theorem
Bayes' Theorem tells us how to "reverse" conditional probabilities:
Posterior = Prior × Likelihood / Evidence
Expanded Form
P(A|B) = [P(A) × P(B|A)] / [P(A) × P(B|A) + P(A') × P(B|A')]
Terminology
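- Prior: P(A), the probability of A before seeing the evidence
- Likelihood: P(B|A), the probability of the evidence B if A is true
- Evidence: P(B), the total probability of observing B
- Posterior: P(A|B), the updated probability of A after seeing B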
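As a rough illustration, the expanded form can be written as a small Python function. This is a sketch under assumptions: the name `bayes_posterior`, its parameter names, and the numbers in the usage line are hypothetical and chosen only to mirror the terminology above.

```python
def bayes_posterior(prior, likelihood, likelihood_if_not):
    """Return P(A|B) from the expanded form of Bayes' Theorem.

    prior:             P(A)
    likelihood:        P(B|A)
    likelihood_if_not: P(B|A')
    """
    # Evidence P(B), computed with the two-event Law of Total Probability.
    evidence = prior * likelihood + (1 - prior) * likelihood_if_not
    if evidence == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined")
    return prior * likelihood / evidence


# Made-up numbers: P(A) = 0.5, P(B|A) = 0.8, P(B|A') = 0.2
posterior = bayes_posterior(prior=0.5, likelihood=0.8, likelihood_if_not=0.2)
print(round(posterior, 4))  # 0.8
```

Here the evidence is 0.5 × 0.8 + 0.5 × 0.2 = 0.5, so the posterior is 0.4 / 0.5 = 0.8.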
Classic Example: Medical Testing
Problem: A disease affects 1% of the population. A test has 95% sensitivity (correctly identifies sick patients) and 90% specificity (correctly identifies healthy patients). If a person tests positive, what's the probability they have the disease?
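Given Information
- P(D) = 0.01 (prevalence), so P(D') = 0.99
- P(+|D) = 0.95 (sensitivity)
- P(−|D') = 0.90 (specificity), so P(+|D') = 1 − 0.90 = 0.10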
Solution
Step 1: Find P(+) using Total Probability
P(+) = P(D)P(+|D) + P(D')P(+|D')
= 0.01(0.95) + 0.99(0.10) = 0.0095 + 0.099 = 0.1085
Step 2: Apply Bayes' Theorem
P(D|+) = P(D)P(+|D) / P(+)
= 0.01(0.95) / 0.1085 = 0.0095 / 0.1085 ≈ 0.0876
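One way to sanity-check this answer is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 100,000 people, a size picked only so that every count is a whole number; the variable names are illustrative.

```python
# Natural-frequency check of the medical testing example.
population = 100_000  # hypothetical cohort size (an assumption, not from the problem)

sick = population * 1 // 100            # 1% prevalence -> 1,000 people
healthy = population - sick             # 99,000 people

true_positives = sick * 95 // 100       # 95% sensitivity -> 950 people
false_positives = healthy * 10 // 100   # 10% false positive rate -> 9,900 people

all_positives = true_positives + false_positives
p_disease_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)  # 950 9900 10850
print(round(p_disease_given_positive, 4))              # 0.0876
```

Of the 10,850 people who test positive, only 950 are actually sick, which matches the result from Bayes' Theorem.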
The Surprising Result
Even with 95% sensitivity and 90% specificity, a positive result means only about an 8.8% chance of disease! When the disease is rare, most positive tests are false positives: here 0.099 of the 0.1085 total positive rate comes from healthy people. Overlooking this is known as the base rate fallacy.
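To see how strongly the base rate drives this result, the posterior can be recomputed for several prevalence values while the test stays the same. This is a minimal sketch; the function name `positive_predictive_value` and the list of prevalence values are assumptions chosen for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(D|+) for a test with the given sensitivity and specificity."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1 - prevalence) * (1 - specificity)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same test as in the example (95% sensitivity, 90% specificity),
# evaluated at several illustrative prevalence values.
for prevalence in [0.001, 0.01, 0.1, 0.5]:
    ppv = positive_predictive_value(prevalence, sensitivity=0.95, specificity=0.90)
    print(f"prevalence {prevalence:.3f} -> P(D|+) = {ppv:.3f}")
```

With this test, P(D|+) climbs from under 1% at a prevalence of 0.1% to roughly 90% at a prevalence of 50%, even though the test itself never changes.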
Multiple Hypotheses
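Bayes' Theorem generalizes to multiple mutually exclusive and exhaustive hypotheses H₁, H₂, ..., Hₖ. Given evidence E:
P(Hᵢ|E) = P(Hᵢ) × P(E|Hᵢ) / [P(H₁) × P(E|H₁) + ... + P(Hₖ) × P(E|Hₖ)]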
Example: Three Factories
Factory A produces 50%, B produces 30%, C produces 20% of items. Defect rates: A=2%, B=3%, C=5%. Given a defective item, which factory most likely made it?
P(A|D) ∝ 0.50 × 0.02 = 0.010
P(B|D) ∝ 0.30 × 0.03 = 0.009
P(C|D) ∝ 0.20 × 0.05 = 0.010
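Most likely: Factory A or C (tied). Dividing by the total 0.010 + 0.009 + 0.010 = 0.029 gives P(A|D) = P(C|D) ≈ 0.345 and P(B|D) ≈ 0.310.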
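The factory comparison can be reproduced with a short Python sketch that normalizes prior × likelihood across all hypotheses. The function name `posteriors` is hypothetical. Exact fractions from the standard `fractions` module are used here as a design choice, because floating-point rounding could otherwise make the A and C products differ in the last digit and hide the tie.

```python
from fractions import Fraction


def posteriors(priors, likelihoods):
    """Return P(H|E) for each hypothesis H by normalizing P(H) * P(E|H)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(E), by the Law of Total Probability
    return {h: joint[h] / evidence for h in joint}


# Production shares and defect rates from the three-factory example.
shares = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
defect_rates = {"A": Fraction(2, 100), "B": Fraction(3, 100), "C": Fraction(5, 100)}

result = posteriors(shares, defect_rates)
for factory, p in result.items():
    print(factory, p, round(float(p), 3))
# A 10/29 0.345
# B 9/29 0.31
# C 10/29 0.345
```

The exact values 10/29, 9/29, and 10/29 confirm that factories A and C are tied and that the three posteriors sum to 1.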
Common Mistakes to Avoid
Confusing P(A|B) with P(B|A)
P(Disease|Positive) ≠ P(Positive|Disease). The test's sensitivity, P(Positive|Disease) = 95%, is NOT the probability of having the disease given a positive test, which is only about 8.8% in the example above!
Ignoring the Base Rate
The prior probability P(A) is crucial. For rare events, even highly accurate tests can produce more false positives than true positives.
Forgetting the Denominator
P(B) = P(A)P(B|A) + P(A')P(B|A'). This total probability is essential for Bayes' Theorem.