Probability

Basic Probability Concepts

Sample Space and Events

  • Sample space ($S$): The set of all possible outcomes of an experiment
  • Event: Any subset of the sample space
  • Probability: A number between 0 and 1 that measures the likelihood of an event

Probability Rules

  1. Legitimate probability values: $0 \leq P(A) \leq 1$ for any event $A$
  2. Sum of probabilities: $P(S) = 1$ (the probability of the entire sample space is 1)
  3. Complement rule: $P(A^c) = 1 - P(A)$, where $A^c$ is the complement of $A$

Addition Rule

For any two events $A$ and $B$:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

If $A$ and $B$ are mutually exclusive (disjoint, cannot both occur): $P(A \cup B) = P(A) + P(B)$
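
As a quick check of the addition rule, the sketch below enumerates a small sample space. The experiment (one roll of a fair six-sided die) and the events `A` and `B` are hypothetical choices made for illustration, not part of the notes above.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # event: the roll is even
B = {4, 5, 6}  # event: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Left side: count the outcomes in the union directly.
p_union_direct = prob(A | B)

# Right side: apply the general addition rule.
p_union_rule = prob(A) + prob(B) - prob(A & B)

print(p_union_direct, p_union_rule)
```

Both values agree because subtracting $P(A \cap B)$ removes the outcomes 4 and 6, which would otherwise be counted twice.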

Multiplication Rule

$$P(A \cap B) = P(A) \cdot P(B | A)$$

If $A$ and $B$ are independent: $P(A \cap B) = P(A) \cdot P(B)$

Conditional Probability

$$P(B | A) = \frac{P(A \cap B)}{P(A)}$$

The probability of $B$ given that $A$ has occurred. Note that $P(A)$ must be greater than 0.
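
The multiplication rule and the conditional probability formula are two views of the same relationship. The sketch below illustrates this with a hypothetical example: drawing two cards without replacement from a standard 52-card deck, where $A$ is "first card is an ace" and $B$ is "second card is an ace."

```python
from fractions import Fraction

# Hypothetical setup: two cards drawn without replacement from 52 cards.
p_a = Fraction(4, 52)          # P(A): first card is an ace
p_b_given_a = Fraction(3, 51)  # P(B | A): 3 aces remain among 51 cards

# Multiplication rule: P(A and B) = P(A) * P(B | A)
p_a_and_b = p_a * p_b_given_a

# Conditional probability formula recovers P(B | A) from the joint probability.
recovered = p_a_and_b / p_a

print(p_a_and_b, recovered)
```

Dividing the joint probability by $P(A)$ returns the same $P(B | A)$ that went into the multiplication rule.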

Independence

Events $A$ and $B$ are independent if the occurrence of one does not affect the probability of the other:

$$P(B | A) = P(B) \quad \text{and} \quad P(A | B) = P(A)$$

Equivalently: $P(A \cap B) = P(A) \cdot P(B)$

Important: Independence is not the same as mutual exclusivity. In fact, if two events are mutually exclusive and each has non-zero probability, they cannot be independent: $P(A \cap B) = 0$, but $P(A) \cdot P(B) > 0$.

Checking Independence

To check independence on the AP exam:

  1. Calculate $P(A)$ and $P(B)$ separately
  2. Calculate $P(A \cap B)$
  3. Check whether $P(A \cap B) = P(A) \cdot P(B)$

Alternatively, check whether $P(B | A) = P(B)$.
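
The three-step check can be sketched as a small helper. The function name `is_independent` and the card-deck events are hypothetical, chosen only to illustrate the procedure with exact fractions.

```python
from fractions import Fraction

def is_independent(p_a, p_b, p_a_and_b):
    """Return True when P(A and B) equals P(A) * P(B)."""
    return p_a_and_b == p_a * p_b

# Hypothetical events for one card drawn from a standard 52-card deck.
p_heart = Fraction(13, 52)
p_king = Fraction(4, 52)
p_queen = Fraction(4, 52)

# Exactly one card is both a heart and a king.
heart_and_king = Fraction(1, 52)

# No card is both a king and a queen, so these events are disjoint.
king_and_queen = Fraction(0, 52)

print(is_independent(p_heart, p_king, heart_and_king))
print(is_independent(p_king, p_queen, king_and_queen))
```

The second check fails because the events are disjoint with non-zero probabilities, so the joint probability is 0 while the product is not.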

Disjoint vs Independent

| Property | Disjoint (Mutually Exclusive) | Independent |
|---|---|---|
| Definition | Cannot occur together | Occurrence of one does not affect the other |
| $P(A \cap B)$ | $0$ | $P(A) \cdot P(B)$ |
| $P(A \cup B)$ | $P(A) + P(B)$ | $P(A) + P(B) - P(A)P(B)$ |

Bayes’ Theorem

$$P(A | B) = \frac{P(B | A) \cdot P(A)}{P(B | A) \cdot P(A) + P(B | A^c) \cdot P(A^c)}$$

Used to find the probability of a “cause” given an observed “effect.” Useful for medical testing problems.
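
A medical testing calculation might look like the sketch below. All three input rates are made-up numbers for illustration, and the variable names are hypothetical.

```python
from fractions import Fraction

# Assumed (hypothetical) inputs for a screening test.
p_disease = Fraction(1, 100)                 # P(A): prevalence
p_pos_given_disease = Fraction(95, 100)      # P(B | A): sensitivity
p_pos_given_no_disease = Fraction(5, 100)    # P(B | A^c): false positive rate

# Complement rule gives P(A^c).
p_no_disease = 1 - p_disease

# Bayes' theorem: P(A | B) = P(B | A) P(A) / [P(B | A) P(A) + P(B | A^c) P(A^c)]
numerator = p_pos_given_disease * p_disease
denominator = numerator + p_pos_given_no_disease * p_no_disease
p_disease_given_pos = numerator / denominator

print(p_disease_given_pos, float(p_disease_given_pos))
```

With these assumed rates, a positive result still leaves the probability of disease well below one half, because false positives from the much larger healthy group outnumber true positives.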

Two-Way Tables and Probability

Given a two-way table:

  • Marginal probability: probability based on a single variable (row or column total / grand total)
  • Joint probability: probability of both events occurring (cell / grand total)
  • Conditional probability: probability given a condition (cell / row total or cell / column total)
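
The three kinds of probability can be read off a table as in the sketch below. The counts and the category labels (`junior`/`senior`, `yes`/`no`) are invented for illustration.

```python
from fractions import Fraction

# Hypothetical two-way table: (row category, column category) -> count.
counts = {
    ("junior", "yes"): 30, ("junior", "no"): 20,
    ("senior", "yes"): 10, ("senior", "no"): 40,
}
grand_total = sum(counts.values())

def row_total(row):
    """Sum of all cells in the given row."""
    return sum(c for (r, _), c in counts.items() if r == row)

def col_total(col):
    """Sum of all cells in the given column."""
    return sum(c for (_, k), c in counts.items() if k == col)

cell = counts[("junior", "yes")]

marginal = Fraction(row_total("junior"), grand_total)  # P(junior)
joint = Fraction(cell, grand_total)                    # P(junior and yes)
cond_on_row = Fraction(cell, row_total("junior"))      # P(yes | junior)
cond_on_col = Fraction(cell, col_total("yes"))         # P(junior | yes)

print(marginal, joint, cond_on_row, cond_on_col)
```

The two conditional probabilities use the same cell but different denominators, which is why $P(\text{yes} | \text{junior})$ and $P(\text{junior} | \text{yes})$ generally differ.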

Discrete Random Variables

A random variable assigns a numerical value to each outcome in the sample space.

Probability Distribution

A table, graph, or formula that gives the probability ($p_i$) for each value ($x_i$) of the random variable $X$.

Requirements:

  1. $\sum p_i = 1$
  2. $0 \leq p_i \leq 1$ for each $i$

Mean (Expected Value)

$$\mu_X = E(X) = \sum x_i \cdot p_i$$

The mean of a random variable is the long-run average of its values over many repetitions.

Variance and Standard Deviation

$$\sigma_X^2 = \sum (x_i - \mu_X)^2 \cdot p_i = \sum x_i^2 \cdot p_i - \mu_X^2$$

$$\sigma_X = \sqrt{\sigma_X^2}$$
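
The mean, variance, and standard deviation formulas can be applied directly to a probability table. The distribution below is a hypothetical example (the number of heads in two fair coin flips), and the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical distribution: value x_i -> probability p_i.
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# A valid distribution must have probabilities that sum to 1.
assert sum(dist.values()) == 1

# Mean: sum of x_i * p_i
mu = sum(x * p for x, p in dist.items())

# Variance, definition form: sum of (x_i - mu)^2 * p_i
var_definition = sum((x - mu) ** 2 * p for x, p in dist.items())

# Variance, shortcut form: sum of x_i^2 * p_i, minus mu^2
var_shortcut = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2

sigma = sqrt(var_definition)

print(mu, var_definition, var_shortcut, sigma)
```

The two variance forms give the same value, so either can be used; the shortcut form is often quicker by hand.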

Rules for Means and Variances

For random variables $X$ and $Y$, and constants $a$ and $b$:

  • $\mu_{a + bX} = a + b\mu_X$
  • $\mu_{X + Y} = \mu_X + \mu_Y$ (always)
  • $\sigma_{a + bX}^2 = b^2\sigma_X^2$
  • If $X$ and $Y$ are independent: $\sigma_{X + Y}^2 = \sigma_X^2 + \sigma_Y^2$ and $\sigma_{X - Y}^2 = \sigma_X^2 + \sigma_Y^2$
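
These rules can be verified by brute force on small distributions. In the sketch below, the distributions `X` and `Y`, the constants $a = 3$ and $b = 2$, and the helper names are all hypothetical; the `combine` helper builds the distribution of a function of $X$ and $Y$ under the assumption that they are independent.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def combine(dist_x, dist_y, func):
    """Distribution of func(X, Y), assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = func(x, y)
        # Independence: joint probability is the product of the marginals.
        out[value] = out.get(value, 0) + px * py
    return out

X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

total = combine(X, Y, lambda x, y: x + y)
diff = combine(X, Y, lambda x, y: x - y)
linear = {3 + 2 * x: p for x, p in X.items()}  # a + bX with a = 3, b = 2

print(variance(total), variance(diff), variance(X) + variance(Y))
print(mean(linear), variance(linear))
```

Note that the variance of the difference matches the variance of the sum: variances add even when the variables are subtracted.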

Binomial Distributions

A binomial setting has four conditions:

  1. Binary: Each observation has two possible outcomes (success/failure)
  2. Independent: Observations are independent (or approximately independent when sampling without replacement, provided the population is at least 10 times the sample size)
  3. n trials: A fixed number of trials $n$
  4. p probability: The probability of success $p$ is the same for each trial

Binomial Probability

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

Mean and Standard Deviation

$$\mu = np, \quad \sigma = \sqrt{np(1-p)}$$
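
The binomial formulas translate directly into code. The sketch below uses hypothetical values ($n = 10$ fair coin flips, $k = 3$ successes), and `binomial_pmf` is an illustrative helper name.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical setting: 10 flips of a fair coin, counting heads.
n, p = 10, 0.5

p_exactly_3 = binomial_pmf(3, n, p)

# "At most 3" sums the individual probabilities for k = 0, 1, 2, 3.
p_at_most_3 = sum(binomial_pmf(k, n, p) for k in range(4))

mu = n * p
sigma = sqrt(n * p * (1 - p))

print(p_exactly_3, p_at_most_3, mu, sigma)
```

Cumulative probabilities such as $P(X \leq 3)$ come from summing the individual terms, which is what a calculator's cumulative binomial function does internally.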

Geometric Distributions

A geometric setting has three conditions:

  1. Binary: Each trial has two outcomes
  2. Independent: Trials are independent
  3. p probability: The probability of success is the same for each trial

The random variable $X$ counts the number of trials up to and including the first success.

Geometric Probability

$$P(X = k) = (1-p)^{k-1} \cdot p$$

Mean

$$\mu = \frac{1}{p}$$
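
As an illustration of the geometric formulas, the sketch below uses a hypothetical setting: rolling a fair die until the first six, so $p = 1/6$. The helper name `geometric_pmf` is illustrative.

```python
from fractions import Fraction

def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k - 1 failures, then a success)."""
    return (1 - p) ** (k - 1) * p

# Hypothetical setting: roll a fair die until the first six appears.
p = Fraction(1, 6)

p_first_six_on_third = geometric_pmf(3, p)

# P(X <= 3) by summing, and by the complement "no six in the first 3 rolls".
p_within_three = sum(geometric_pmf(k, p) for k in (1, 2, 3))
p_within_three_complement = 1 - (1 - p) ** 3

mu = 1 / p

print(p_first_six_on_third, p_within_three, mu)
```

The complement form is usually the faster route for "within $k$ trials" questions, since it avoids summing several terms.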

Normal Distributions and the Central Limit Theorem

Central Limit Theorem (CLT)

For a sufficiently large sample size $n$ (typically $n \geq 30$), the sampling distribution of $\bar{x}$ is approximately normal, regardless of the shape of the population distribution.

$$\mu_{\bar{x}} = \mu, \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

The CLT allows us to use normal probability calculations for sample means even when the population is not normally distributed.
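
A small simulation can make the CLT concrete. The sketch below assumes a strongly right-skewed population (exponential with rate 1, so $\mu = 1$ and $\sigma = 1$); the sample size, number of repetitions, and seed are arbitrary choices for illustration.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

n = 40       # sample size (hypothetical)
reps = 5000  # number of simulated samples (hypothetical)

# Exponential population with rate 1: right-skewed, mean 1, standard deviation 1.
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = statistics.mean(sample_means)   # should be close to mu = 1
spread = statistics.stdev(sample_means)  # should be close to sigma / sqrt(n)

print(center, spread, 1 / sqrt(n))
```

The simulated center and spread should land near $\mu$ and $\sigma / \sqrt{n}$, and a histogram of `sample_means` would look roughly bell-shaped even though the population is skewed.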

Sampling Distribution of $\hat{p}$

For a sample proportion $\hat{p}$ with population proportion $p$ and sample size $n$:

$$\mu_{\hat{p}} = p, \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

Approximately normal when $np \geq 10$ and $n(1-p) \geq 10$.
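
A typical calculation with the sampling distribution of $\hat{p}$ might be sketched as follows. The values $p = 0.6$, $n = 100$, and the cutoff $0.65$ are hypothetical; `NormalDist` comes from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical population proportion and sample size.
p, n = 0.6, 100

# Large counts condition: both expected counts must be at least 10.
normal_ok = n * p >= 10 and n * (1 - p) >= 10

mu_p_hat = p
sigma_p_hat = sqrt(p * (1 - p) / n)

# Normal approximation for P(p_hat >= 0.65).
sampling_dist = NormalDist(mu_p_hat, sigma_p_hat)
p_at_least_065 = 1 - sampling_dist.cdf(0.65)

print(normal_ok, sigma_p_hat, p_at_least_065)
```

If `normal_ok` were false, the normal calculation in the last step would not be justified and an exact binomial calculation would be the safer choice.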

Common Pitfalls

  • Confusing $P(A | B)$ with $P(B | A)$
  • Assuming events are independent without justification
  • Confusing disjoint with independent
  • Forgetting to check the 10% condition when sampling without replacement in a binomial setting
  • Misapplying the central limit theorem with small sample sizes
  • Confusing the mean of a random variable with its most probable value