Probability and Statistics
Probability and statistics appear on both papers of the Leaving Certificate Mathematics examination. This topic covers counting principles, probability rules, distributions, hypothesis testing, and statistical measures.
Counting Principles
Fundamental Counting Principle (OL/HL)
If task A can be done in $m$ ways and task B in $n$ ways, then A followed by B can be done in $m \times n$ ways. This extends to any finite sequence of tasks: multiply the number of choices at each step.
Example (OL): How many ways can 3 different books be arranged on a shelf?
$$3 \times 2 \times 1 = 3! = 6$$
Permutations (OL/HL)
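The number of arrangements of $r$ objects from $n$ distinct objects:
$${}^nP_r = \frac{n!}{(n-r)!}$$
Example (HL): How many ways can a committee chair, secretary, and treasurer be chosen from 10 people?
$${}^{10}P_3 = \frac{10!}{7!} = 10 \times 9 \times 8 = 720$$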
Combinations (OL/HL)
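The number of ways to choose $r$ objects from $n$ distinct objects (order does not matter):
$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
Key identity: $\binom{n}{r} = \binom{n}{n-r}$ (choosing $r$ objects to include equals choosing $n-r$ objects to exclude).
Proof of $\binom{n}{r} = \binom{n}{n-r}$:
$$\binom{n}{n-r} = \frac{n!}{(n-r)!\,\bigl(n-(n-r)\bigr)!} = \frac{n!}{(n-r)!\,r!} = \binom{n}{r}$$
Example (OL): How many ways to choose 4 students from a class of 15?
$$\binom{15}{4} = \frac{15 \times 14 \times 13 \times 12}{4!} = 1365$$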
Arrangements with Repetition (HL)
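The number of arrangements of $n$ objects where some objects are identical:
$$\frac{n!}{n_1!\,n_2!\cdots n_k!}$$
where $n_1, n_2, \ldots, n_k$ are the counts of each identical group.
Example (HL): How many arrangements of the word “STATISTICS”?
Total letters: 10. S appears 3 times, T appears 3 times, I appears 2 times; A and C appear once each.
$$\frac{10!}{3!\,3!\,2!} = \frac{3628800}{72} = 50400$$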
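The counting results in this section can be checked with Python's standard library. This is a minimal sketch: `multiset_arrangements` is a helper name chosen here for illustration, while `factorial`, `perm` and `comb` are the standard `math` functions.

```python
from collections import Counter
from math import comb, factorial, perm


def multiset_arrangements(word: str) -> int:
    """Arrangements of the letters of `word`, treating repeated letters as identical."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # divide out the orderings of each identical group
    return total


print(factorial(3))                         # 3 different books on a shelf
print(perm(10, 3))                          # chair, secretary, treasurer from 10 people
print(comb(15, 4))                          # 4 students chosen from 15
print(multiset_arrangements("STATISTICS"))  # letters with repeats
```

Using `perm` when order matters and `comb` when it does not mirrors the distinction between permutations and combinations above.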
Pascal’s Triangle and Binomial Coefficients (HL)
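Each entry in Pascal’s triangle is the sum of the two entries above it. The $r$th entry in row $n$ is $\binom{n}{r}$ (counting rows and entries from 0).
Pascal’s identity: $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$.
Proof of Pascal’s identity. Consider choosing $r$ people from $n$ people. Fix one particular person, say Alice. Either Alice is chosen (leaving $\binom{n-1}{r-1}$ ways to choose the remaining $r-1$ from $n-1$ others) or Alice is not chosen (leaving $\binom{n-1}{r}$ ways to choose all $r$ from $n-1$ others). These cases are mutually exclusive and exhaustive, so the two counts add to $\binom{n}{r}$.
Binomial theorem: $(a+b)^n = \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r$.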
Example (HL): Find the coefficient of in the expansion of .
So the coefficient is .
Probability
Basic Probability (OL/HL)
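For an event $A$:
$$P(A) = \frac{\text{number of favourable outcomes}}{\text{total number of outcomes}}$$
Properties: $0 \le P(A) \le 1$ and $P(A') = 1 - P(A)$.
Axioms of probability (Kolmogorov):
- $P(A) \ge 0$ for any event $A$
- $P(S) = 1$ where $S$ is the sample space
- For mutually exclusive events $A_1, A_2, \ldots$: $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$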
Rules of Probability (OL/HL)
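Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Multiplication rule (independent events): $P(A \cap B) = P(A)\,P(B)$
Two events are independent if and only if $P(A \cap B) = P(A)\,P(B)$. This is not the same as mutually exclusive (which means $P(A \cap B) = 0$).
Conditional probability: $P(A|B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$.
Law of Total Probability (HL): If $B_1, B_2, \ldots, B_n$ partition the sample space:
$$P(A) = \sum_{i=1}^{n} P(A|B_i)\,P(B_i)$$
Bayes’ Theorem (HL):
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$
This is extremely important in real-world applications: medical testing, spam filtering, and machine learning all rely on Bayes’ theorem.
Extended Bayes’ Theorem. If $B_1, B_2, \ldots, B_n$ partition the sample space:
$$P(B_j|A) = \frac{P(A|B_j)\,P(B_j)}{\sum_{i=1}^{n} P(A|B_i)\,P(B_i)}$$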
Example (HL): A bag contains 4 red and 6 blue marbles. Two are drawn without replacement. Find the probability both are red.
$$P(\text{both red}) = \frac{4}{10} \times \frac{3}{9} = \frac{12}{90} = \frac{2}{15}$$
Example (HL), Bayes’ theorem: A test for a disease has 95% sensitivity (true positive rate) and a 2% false positive rate. The disease prevalence is 1%. Find the probability that a person who tests positive actually has the disease.
$P(D) = 0.01$, $P(D') = 0.99$, $P(+|D) = 0.95$, $P(+|D') = 0.02$.
$$P(D|+) = \frac{P(+|D)\,P(D)}{P(+|D)\,P(D) + P(+|D')\,P(D')} = \frac{0.0095}{0.0095 + 0.0198} = \frac{0.0095}{0.0293} \approx 0.324$$
Only about 32.4% of positive tests are true positives. This is the “base rate fallacy.”
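The Bayes calculation can be sketched in a few lines of Python. The function name `posterior` and its argument layout are choices made here for illustration; the inputs are the prior probabilities of each part of the partition and the likelihood of the observed event within each part.

```python
def posterior(priors, likelihoods, index):
    """Return P(B_index | A) for a partition B_0, B_1, ... of the sample space.

    priors[i] is P(B_i) and likelihoods[i] is P(A | B_i).
    """
    # Denominator: law of total probability, P(A) = sum of P(A | B_i) P(B_i)
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[index] * likelihoods[index] / evidence


# Disease test: the partition is (disease, no disease); A is "tests positive".
p_disease_given_positive = posterior([0.01, 0.99], [0.95, 0.02], 0)
print(round(p_disease_given_positive, 3))
```

Raising the prior from 0.01 to, say, 0.2 and rerunning shows how strongly the result depends on prevalence, which is the point of the base rate fallacy.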
Example (HL), Extended Bayes’: A factory has three machines producing bolts. Machine A produces 50% of bolts with a 2% defect rate, Machine B produces 30% with a 3% defect rate, and Machine C produces 20% with a 1% defect rate. A randomly selected bolt is defective. What is the probability it came from Machine B?
$$P(B|\text{def}) = \frac{P(\text{def}|B)\,P(B)}{P(\text{def}|A)\,P(A) + P(\text{def}|B)\,P(B) + P(\text{def}|C)\,P(C)}$$
$$P(B|\text{def}) = \frac{0.03 \times 0.3}{0.02 \times 0.5 + 0.03 \times 0.3 + 0.01 \times 0.2} = \frac{0.009}{0.021} = \frac{3}{7} \approx 0.429$$
Probability Trees (OL)
A probability tree diagram is useful for multi-stage experiments.
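Example (OL): A coin is tossed three times. Find the probability of getting exactly two heads.
There are $2^3 = 8$ equally likely outcomes. The favourable outcomes are HHT, HTH, THH.
$$P(\text{exactly 2 heads}) = \frac{3}{8}$$
Alternatively, using combinations:
$$P(\text{exactly 2 heads}) = \binom{3}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^1 = \frac{3}{8}$$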
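For small experiments like this, the sample space can simply be listed by computer, which is what a probability tree does by hand. The sketch below enumerates every sequence of tosses; the function name `prob_exact_heads` is chosen here for illustration.

```python
from fractions import Fraction
from itertools import product


def prob_exact_heads(tosses: int, heads: int) -> Fraction:
    """Probability of exactly `heads` heads in `tosses` fair coin tosses, by enumeration."""
    outcomes = list(product("HT", repeat=tosses))  # every branch of the tree
    favourable = [o for o in outcomes if o.count("H") == heads]
    return Fraction(len(favourable), len(outcomes))


print(prob_exact_heads(3, 2))
```

Enumeration only works when outcomes are equally likely and few in number; for larger problems the combinations formula is the practical route.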
Independence vs. Mutually Exclusive (HL)
These concepts are frequently confused:
| Property | Independent | Mutually Exclusive |
|---|---|---|
| Definition | $P(A \cap B) = P(A)\,P(B)$ | $P(A \cap B) = 0$ |
| Meaning | Occurrence of one does not affect the other | Cannot both occur |
| If $P(A) > 0$ | $P(B \mid A) = P(B)$ | $P(B \mid A) = 0$ |

Important: If two events with positive probability are mutually exclusive, they cannot be independent (since $P(A \cap B) = 0 \neq P(A)\,P(B)$ when both probabilities are positive).
Discrete Probability Distributions
Expected Value (OL/HL)
For a random variable $X$ with values $x_i$ and probabilities $p_i$:
$$E(X) = \sum x_i p_i$$
Variance:
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \sum x_i^2 p_i - [E(X)]^2$$
Standard deviation: $\sigma = \sqrt{\mathrm{Var}(X)}$.
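The two formulas translate directly into code. In this sketch a distribution is stored as a dictionary mapping each value to its probability (a representation chosen here for illustration), and exact fractions are used so that no rounding occurs. A fair die is used as the test case.

```python
from fractions import Fraction


def expected_value(pmf):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expected_value(pmf)
    mean_of_squares = sum(x * x * p for x, p in pmf.items())
    return mean_of_squares - mean ** 2


fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expected_value(fair_die), variance(fair_die))
```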
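Properties:
- $E(aX + b) = aE(X) + b$
- $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
Proof of $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$:
$$\mathrm{Var}(aX + b) = E[(aX + b)^2] - [E(aX + b)]^2$$
$$= a^2E(X^2) + 2abE(X) + b^2 - \left(a^2[E(X)]^2 + 2abE(X) + b^2\right) = a^2\left[E(X^2) - (E(X))^2\right] = a^2\,\mathrm{Var}(X)$$
Example (OL): A fair die is rolled. Find $E(X)$ and $\mathrm{Var}(X)$.
$$E(X) = \frac{1+2+3+4+5+6}{6} = \frac{7}{2}, \qquad E(X^2) = \frac{1+4+9+16+25+36}{6} = \frac{91}{6}$$
$$\mathrm{Var}(X) = \frac{91}{6} - \frac{49}{4} = \frac{182 - 147}{12} = \frac{35}{12} \approx 2.917$$
Binomial Distribution (HL)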
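$X \sim \mathrm{Bin}(n, p)$ where $n$ is the number of trials and $p$ is the probability of success.
$$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}, \quad r = 0, 1, \ldots, n$$
$$E(X) = np, \quad \mathrm{Var}(X) = np(1-p)$$
Conditions: fixed $n$, independent trials, two outcomes, constant $p$.
Proof that $E(X) = np$ for $X \sim \mathrm{Bin}(n, p)$. Let $X_i$ be the indicator variable for success on trial $i$ (so $X_i = 1$ with probability $p$ and $X_i = 0$ with probability $1-p$). Then $X = X_1 + X_2 + \cdots + X_n$ and $E(X_i) = p$, so $E(X) = E(X_1) + \cdots + E(X_n) = np$.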
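The result $E(X) = np$ can also be checked numerically by summing $r \cdot P(X = r)$ over all possible values. The following is a sketch with function names chosen here for illustration; the values $n = 20$, $p = 0.25$ are those of the multiple-choice example in this section.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binomial_mean(n, p):
    """E(X) computed directly from the definition, for comparison with n * p."""
    return sum(r * binomial_pmf(r, n, p) for r in range(n + 1))


print(binomial_mean(20, 0.25), 20 * 0.25)
print(round(binomial_pmf(5, 20, 0.25), 4))
```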
Example (HL): A multiple-choice test has 20 questions, each with 4 options. A student guesses every answer. Find the probability of getting exactly 5 correct.
$$X \sim \mathrm{Bin}(20, 0.25)$$
$$P(X = 5) = \binom{20}{5}(0.25)^5(0.75)^{15} \approx 0.2023$$
Poisson Distribution (HL)
$X \sim \mathrm{Po}(\lambda)$ models the number of events occurring in a fixed interval when events happen independently at a constant average rate $\lambda$.
$$P(X = r) = \frac{e^{-\lambda}\lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$
$$E(X) = \lambda, \quad \mathrm{Var}(X) = \lambda$$
Conditions: events occur independently, at a constant average rate, and singly (not in clusters).
Example (HL): A call centre receives an average of 4.5 calls per minute. Find the probability of receiving exactly 3 calls in a given minute.
$$X \sim \mathrm{Po}(4.5)$$
$$P(X = 3) = \frac{e^{-4.5}(4.5)^3}{3!} \approx 0.1687$$
Example (HL): Using the same call centre, find .
Normal Distribution (HL)
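$X \sim N(\mu, \sigma^2)$. The standard normal $Z = \frac{X - \mu}{\sigma}$ has $Z \sim N(0, 1)$.
Empirical rule (68-95-99.7):
- $P(\mu - \sigma < X < \mu + \sigma) \approx 0.68$
- $P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95$
- $P(\mu - 3\sigma < X < \mu + 3\sigma) \approx 0.997$
Why the normal distribution is special. The Central Limit Theorem states that the mean of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed, regardless of the original distribution. This is one reason the normal distribution appears so often in practice.
Example (HL): Exam marks are normally distributed with $\mu = 60$, $\sigma = 10$. Find the probability a student scores above 75.
$$P(X > 75) = P\left(Z > \frac{75 - 60}{10}\right) = P(Z > 1.5) = 1 - 0.9332 = 0.0668$$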
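Table lookups like this one can be checked with `statistics.NormalDist` from Python's standard library (available from Python 3.8). The sketch below standardises the mark and then computes the upper-tail probability; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

marks = NormalDist(mu=60, sigma=10)

z = (75 - marks.mean) / marks.stdev  # standardise: z = (x - mu) / sigma
p_above_75 = 1 - marks.cdf(75)       # upper tail, P(X > 75)

print(z, round(p_above_75, 4))
```

The same value comes from the standard normal directly, since `1 - NormalDist().cdf(z)` uses $N(0, 1)$ by default.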
Normal Approximation to the Binomial (HL)
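When $np$ and $n(1-p)$ are both large enough (a commonly quoted rule of thumb is that both are at least 5):
$$X \sim \mathrm{Bin}(n, p) \approx N(np, np(1-p))$$
Apply a continuity correction: $P(X \le k) \approx P(Y \le k + 0.5)$, where $Y \sim N(np, np(1-p))$.
Why continuity correction is needed. The binomial is discrete (defined on integers) while the normal is continuous. In the approximation, each integer $k$ is represented by a bar covering the interval from $k - 0.5$ to $k + 0.5$. Without the correction, we would compute $P(X \le k)$ as $P(Y \le k)$, but this stops in the middle of the bar at $k$ and misses the half of it lying between $k$ and $k + 0.5$.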
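The effect of the correction can be seen by comparing the exact binomial probability with the normal approximation computed both ways. This is a sketch only: the function names are chosen here, and the values $n = 40$, $p = 0.5$, $k = 22$ are illustrative and do not come from the text.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))


def normal_approx_cdf(k, n, p, corrected=True):
    """Normal approximation to P(X <= k), with or without continuity correction."""
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5 if corrected else k)


n, p, k = 40, 0.5, 22  # illustrative values only
exact = binomial_cdf(k, n, p)
with_correction = normal_approx_cdf(k, n, p, corrected=True)
without_correction = normal_approx_cdf(k, n, p, corrected=False)
print(round(exact, 4), round(with_correction, 4), round(without_correction, 4))
```

For these values the corrected approximation lands much closer to the exact probability than the uncorrected one.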
Bernoulli Trials (HL)
A Bernoulli trial is a single experiment with two outcomes: success (probability $p$) and failure (probability $1 - p$). The binomial distribution models $n$ independent Bernoulli trials.
Statistical Measures
Measures of Central Tendency (OL/HL)
Mean: $\bar{x} = \frac{\sum x_i}{n}$
Median: The middle value when data is arranged in order.
Mode: The most frequently occurring value.
Measures of Spread (OL/HL)
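Range: $\text{Range} = \text{maximum} - \text{minimum}$
Interquartile range (IQR): $\text{IQR} = Q_3 - Q_1$
Standard deviation:
$$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$$
Computational formula:
$$\sigma = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$$
Grouped Data (OL/HL)
For grouped data with class midpoints $x_i$ and frequencies $f_i$:
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}, \qquad \sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}}$$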
Box Plots (HL)
A box plot displays five statistics: minimum, $Q_1$, median, $Q_3$, maximum.
Outlier: A value below $Q_1 - 1.5 \times \text{IQR}$ or above $Q_3 + 1.5 \times \text{IQR}$.
Example (HL): A data set has $Q_1 = 25$, $Q_3 = 45$, minimum , maximum . Identify any outliers.
$\text{IQR} = 45 - 25 = 20$.
Lower fence: $25 - 1.5 \times 20 = -5$. Since the minimum is not below the lower fence, there are no low outliers.
Upper fence: $45 + 1.5 \times 20 = 75$. Since the maximum is not above the upper fence, there are no high outliers.
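The fence rule is easy to mechanise. In the sketch below the function names and the sample values passed to `find_outliers` are chosen here for illustration; only $Q_1 = 25$ and $Q_3 = 45$ come from the example.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """Values lying strictly outside the fences."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


print(outlier_fences(25, 45))
# Hypothetical data values, used only to show the check in action:
print(find_outliers([-10, 8, 30, 72, 80], 25, 45))
```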
Skewness (HL)
- Positive skew: Mean > Median (right tail is longer). The mode is less than the median.
- Negative skew: Mean < Median (left tail is longer). The mode is greater than the median.
- Symmetric: Mean = Median.
In a box plot, positive skew means the right whisker is longer; negative skew means the left whisker is longer.
Hypothesis Testing (HL)
Steps
- State the null hypothesis $H_0$ and the alternative hypothesis $H_1$.
- Choose the significance level $\alpha$ (usually 5%).
- Calculate the test statistic.
- Determine the critical value(s) or p-value.
- Compare and make a decision.
- State the conclusion in context.
One-Tailed vs. Two-Tailed Tests (HL)
| Feature | One-tailed | Two-tailed |
|---|---|---|
| $H_1$ | $\mu > \mu_0$ or $\mu < \mu_0$ | $\mu \neq \mu_0$ |
| Critical region | One tail only | Both tails |
| Critical value | $z_{\alpha}$ (1.645 at 5%) | $\pm z_{\alpha/2}$ ($\pm 1.96$ at 5%) |
| p-value | One tail area | Two tail areas combined |

:::caution
Choose one-tailed or two-tailed before collecting data. Never decide after seeing the results.
:::
z-test for a Proportion (HL)
Example: A coin is tossed 200 times and lands on heads 115 times. Test at the 5% significance level whether the coin is biased.
$H_0: p = 0.5$ (coin is fair), $H_1: p \neq 0.5$ (coin is biased).
Under $H_0$: $\mu = np = 100$, $\sigma = \sqrt{np(1-p)} = \sqrt{50} \approx 7.071$.
Using the normal approximation with continuity correction: $z = \frac{114.5 - 100}{7.071} \approx 2.05$.
Critical values at $\alpha = 0.05$ (two-tailed): $\pm 1.96$.
Since $2.05 > 1.96$, we reject $H_0$. There is sufficient evidence to suggest the coin is biased.
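The steps of this test can be sketched as a short function. The name `proportion_z_test` and its return layout are choices made here; the continuity correction moves the observed count half a unit towards the mean before standardising.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tailed z-test for a proportion using the normal approximation.

    Returns (z, p_value, reject_h0).
    """
    mu = n * p0
    sigma = sqrt(n * p0 * (1 - p0))
    # Continuity correction: shift the count 0.5 towards the mean.
    if successes > mu:
        corrected = successes - 0.5
    elif successes < mu:
        corrected = successes + 0.5
    else:
        corrected = successes
    z = (corrected - mu) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z, p_value, abs(z) > critical


z, p_value, reject = proportion_z_test(115, 200, 0.5)
print(round(z, 2), round(p_value, 3), reject)
```

Comparing the p-value with $\alpha$ and comparing $|z|$ with the critical value always give the same decision; the function reports both.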
t-test for a Mean (HL)
When the population standard deviation is unknown and the sample size is small ($n < 30$), use the t-distribution with $n - 1$ degrees of freedom. The test statistic is
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
Example (HL): A sample of 8 measurements has and . Test at the 5% level whether the population mean is 25.
$H_0: \mu = 25$, $H_1: \mu \neq 25$.
Degrees of freedom $= n - 1 = 7$. The critical value from t-tables at $\alpha = 0.05$ (two-tailed, 7 df) is approximately $2.365$.
Since $|t| < 2.365$, we do not reject $H_0$. There is insufficient evidence to conclude the population mean differs from 25.
Type I and Type II Errors (HL)
| Error Type | Description | Probability |
|---|---|---|
| Type I | Rejecting $H_0$ when $H_0$ is true | $\alpha$ |
| Type II | Failing to reject $H_0$ when $H_0$ is false | $\beta$ |

The power of a test is $1 - \beta$: the probability of correctly rejecting a false $H_0$.
Confidence Intervals (HL)
A 95% confidence interval for the population mean when $\sigma$ is known:
$$\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$$
When $\sigma$ is unknown (use sample standard deviation $s$ with $n - 1$ degrees of freedom):
$$\bar{x} \pm t_{n-1}\,\frac{s}{\sqrt{n}}$$
Interpretation: If we were to take many samples and construct a 95% confidence interval from each, approximately 95% of those intervals would contain the true population mean. It does NOT mean there is a 95% probability that $\mu$ lies in any particular interval.
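For the known-$\sigma$ case the interval can be computed as below. This is a sketch: the function name and the numbers in the usage line (sample mean 100, $\sigma = 15$, $n = 36$) are illustrative assumptions, not values from the text.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, level=0.95):
    """Confidence interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    margin = z * sigma / sqrt(n)               # margin of error
    return mean - margin, mean + margin


# Hypothetical sample: mean 100, known sigma 15, sample size 36.
low, high = z_confidence_interval(100, 15, 36)
print(round(low, 2), round(high, 2))
```

Because the margin of error is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the width of the interval.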
Correlation and Regression (HL)
Scatter Plots and Correlation
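A scatter plot displays pairs of data points $(x_i, y_i)$. The Pearson correlation coefficient $r$ measures linear association:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
$-1 \le r \le 1$. Values near $\pm 1$ indicate strong linear correlation. $r = 0$ means no linear correlation (but there may be a non-linear relationship).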
Important: Correlation does not imply causation.
Coefficient of Determination (HL)
The coefficient of determination $r^2$ represents the proportion of variance in $y$ explained by the linear relationship with $x$.
If $r = 0.8$, then $r^2 = 0.64$, meaning 64% of the variation in $y$ is accounted for by the linear regression on $x$. The remaining 36% is due to other factors.
Line of Best Fit (Least Squares) (HL)
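The regression line of $y$ on $x$ is:
$$y = a + bx$$
where:
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}, \qquad a = \bar{y} - b\bar{x}$$
The least squares line minimises $\sum (y_i - a - bx_i)^2$, the sum of squared vertical distances from the points to the line.
Example (HL): For the data set $(1, 3), (2, 5), (3, 4), (4, 7), (5, 6)$, find the correlation coefficient and regression line.
$n = 5$, $\sum x = 15$, $\sum y = 25$, $\sum xy = 1(3)+2(5)+3(4)+4(7)+5(6) = 3+10+12+28+30 = 83$, $\sum x^2 = 1+4+9+16+25 = 55$, $\sum y^2 = 9+25+16+49+36 = 135$.
$$r = \frac{5(83) - (15)(25)}{\sqrt{[5(55) - 15^2][5(135) - 25^2]}} = \frac{40}{\sqrt{50 \times 50}} = 0.8$$
$$b = \frac{40}{50} = 0.8, \qquad a = \bar{y} - b\bar{x} = 5 - 0.8(3) = 2.6$$
Regression line: $y = 2.6 + 0.8x$.
$r^2 = 0.64$: 64% of the variation in $y$ is explained by the linear relationship with $x$.
Warning: The regression line of $y$ on $x$ should only be used for prediction within the range of the data (interpolation). Extrapolation beyond the data range is unreliable.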
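The summation formulas for $r$, $a$ and $b$ can be sketched in plain Python. The function name `linear_fit` is chosen here; the data are the five points whose products appear in the $\sum xy$ line of the worked example.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return (r, a, b) for the least squares line y = a + b x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    r = numerator / sqrt((n * sxx - sx**2) * (n * syy - sy**2))
    b = numerator / (n * sxx - sx**2)  # slope
    a = sy / n - b * sx / n            # intercept: line passes through the mean point
    return r, a, b


r, a, b = linear_fit([1, 2, 3, 4, 5], [3, 5, 4, 7, 6])
print(round(r, 3), round(a, 3), round(b, 3), round(r * r, 3))
```

Evaluating `a + b * x` gives a prediction at `x`, which should only be trusted for `x` between 1 and 5 with these data.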
Worked Examples
See the examples integrated throughout the sections above.
Common Pitfalls
- Confusing permutations and combinations: order matters for permutations, not for combinations. Ask: “Is $\{A,B,C\}$ the same as $\{C,B,A\}$?”
- Forgetting independence in the multiplication rule: always check whether events are independent before using $P(A \cap B) = P(A)\,P(B)$.
- Normal distribution: always standardise before using tables, $z = \frac{x - \mu}{\sigma}$.
- Hypothesis testing: state $H_0$ and $H_1$ before computing. The conclusion must address the original question in context.
- Binomial vs normal: use the binomial for discrete counts and the normal for continuous measurements. Use the normal approximation only when the conditions are met.
- Continuity correction: when approximating a binomial with a normal, apply a continuity correction (e.g., $P(X \le k) \approx P(Y \le k + 0.5)$).
- Base rate fallacy: in Bayes’ theorem problems, the prior probability matters enormously. A test with 99% accuracy can still have a low positive predictive value if the condition is rare.
- Confusing independence with mutual exclusivity: mutually exclusive events with positive probability are never independent.
- Forgetting to check conditions for the Poisson distribution: events must occur independently at a constant rate.
- Interpreting confidence intervals incorrectly: a 95% CI does not mean there is a 95% probability that $\mu$ lies in the interval.
Practice Questions
Ordinary Level
- How many ways can 5 students be selected from a class of 20?
- A fair six-sided die is rolled. Find the probability of rolling a number greater than 4.
- A bag contains 3 red and 5 green balls. Two are drawn at random without replacement. Find the probability both are green.
- Find the mean, median, and mode of: 3, 5, 5, 7, 8, 9, 12.
- Given and $\mathrm{Var}(X) = 9$, find .
- In a class of 35 students, 20 study maths, 15 study physics, and 8 study both. How many study neither?
Higher Level
- $X \sim \mathrm{Bin}(15, 0.3)$. Find .
- Heights are normally distributed with $\mu = 170\text{ cm}$ and $\sigma = 8\text{ cm}$. Find the probability a randomly selected person is between 160 cm and 180 cm tall.
- A sample of 8 measurements has and . Test at the 5% level whether the population mean is 25.
- For the data set Find the correlation coefficient and the equation of the regression line of on .
- Prove that .
- A test for a disease has 99% sensitivity and 1% false positive rate. If 0.5% of the population has the disease, find the probability that a positive test result is a true positive.
- $X \sim \mathrm{Bin}(80, 0.6)$. Use the normal approximation with continuity correction to estimate .
- Find the number of ways to arrange the letters of the word “STATISTICS”.
- A helpdesk receives an average of 3.2 emails per hour. Find the probability of receiving more than 5 emails in a given hour.
- A 95% confidence interval for a mean is based on a sample of size . Find and the margin of error.
- Prove that $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$.
- Two events $A$ and $B$ satisfy $P(A) = 0.4$, $P(B) = 0.5$ and . Determine whether $A$ and $B$ are independent.
- Find the coefficient of in the expansion of .
- Explain the difference between Type I and Type II errors in hypothesis testing.
- $X \sim \mathrm{Po}(2.5)$. Find .
- A factory produces items with a defect rate of 5%. In a sample of 200 items, use the normal approximation to estimate the probability of finding more than 12 defective items.
- Prove Pascal’s identity: $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$.
- A random variable $X$ has and $\mathrm{Var}(X) = 4$. Find and $\mathrm{Var}(3X - 2)$.
- Two dice are rolled. Find the probability that the sum is prime given that at least one die shows a 4.
- Explain why a high correlation coefficient does not necessarily mean that one variable causes the other. Give an example.
Extended Practice
- Prove that for any two events and : .
- Find the expected value and variance of $X \sim \mathrm{Po}(4)$.
- A sample of 50 students has a mean study time of 15.2 hours per week with standard deviation 4.3 hours. Construct a 95% confidence interval for the population mean.
- Use the binomial expansion to find the first four terms of .
- A quality control inspector checks items from a production line. The probability of a defect is 0.02. In a batch of 100 items, find the probability of at most 3 defects using the Poisson approximation.
- Given the data below, calculate the Spearman rank correlation coefficient:
| 1 | 2 | 3 | 4 | 5 | |
|---|---|---|---|---|---|
| 3 | 1 | 4 | 2 | 5 |
- Prove that if and are independent, then so are and .
- The regression line of $y$ on $x$ is and the regression line of $x$ on $y$ is . Find $\bar{x}$, $\bar{y}$ and the correlation coefficient $r$.
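Poisson Approximation to the Binomial
When $n$ is large, $p$ is small, and $np$ is moderate, the binomial distribution can be approximated by a Poisson distribution with $\lambda = np$:
$$\mathrm{Bin}(n, p) \approx \mathrm{Po}(np)$$
This avoids calculating large binomial coefficients. As a rule of thumb, the approximation is good when $n$ is large and $p$ is small (commonly quoted thresholds are $n \ge 50$ and $p \le 0.1$).
Example (HL): A machine produces items with a defect rate of 2%. In a batch of 200 items, find the probability of at most 3 defects using the Poisson approximation.
$\lambda = np = 200 \times 0.02 = 4$, so $X \approx \mathrm{Po}(4)$.
$$P(X \le 3) = e^{-4}\left(1 + 4 + \frac{4^2}{2!} + \frac{4^3}{3!}\right) = e^{-4} \times \frac{71}{3} \approx 0.4335$$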
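To see how good the approximation is for this batch, the Poisson value can be compared with the exact binomial probability. This is a sketch with function names chosen here for illustration.

```python
from math import comb, exp, factorial


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**r / factorial(r) for r in range(k + 1))


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))


approx = poisson_cdf(3, 200 * 0.02)  # lambda = np = 4
exact = binomial_cdf(3, 200, 0.02)
print(round(approx, 4), round(exact, 4))
```

With $n = 200$ and $p = 0.02$ the two values agree to about two decimal places.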
Choosing the Right Distribution
| Distribution | When to use | Parameters |
|---|---|---|
| Binomial | Fixed number of trials, two outcomes, constant $p$ | $n$, $p$ |
| Poisson | Events occur at constant rate, independently | $\lambda$ |
| Normal | Continuous data, symmetric bell shape | $\mu$, $\sigma^2$ |

:::tip
If a question mentions “per unit time” or “per unit area” and events are rare, think Poisson. If it mentions “out of $n$ trials,” think Binomial.
:::
Relationship Between Distributions
The three main distributions covered in this topic are connected:
- Binomial to Normal: When $n$ is large and $p$ is not too close to 0 or 1, $\mathrm{Bin}(n,p) \approx N(np, np(1-p))$.
- Binomial to Poisson: When $n$ is large and $p$ is small, $\mathrm{Bin}(n,p) \approx \mathrm{Po}(np)$.
- Poisson to Normal: When $\lambda$ is large, $\mathrm{Po}(\lambda) \approx N(\lambda, \lambda)$.
Law of Large Numbers (HL - awareness)
The law of large numbers states that as the number of trials increases, the sample mean converges to the expected value. Formally, for i.i.d. random variables $X_1, X_2, \ldots$ with mean $\mu$:
$$\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n} \to \mu \text{ as } n \to \infty$$
This is why a coin tossed many times gives a proportion of heads close to 0.5, even if short sequences may deviate significantly.
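A quick simulation illustrates the coin-tossing claim. This is a sketch, not a proof: the function name and the fixed seed are choices made here so that the run is repeatable.

```python
import random


def proportion_of_heads(tosses, seed=0):
    """Simulate fair coin tosses and return the observed proportion of heads."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


for tosses in (10, 100, 10_000, 100_000):
    print(tosses, proportion_of_heads(tosses))
```

The short runs can sit well away from 0.5, while the long runs settle close to it, which is the behaviour the law of large numbers describes.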
Summary
This topic covers the mathematical techniques and concepts related to probability and statistics, including key theorems, methods, and problem-solving approaches.
Key concepts include:
- measures of central tendency and spread
- probability distributions (binomial, normal)
- hypothesis testing
- correlation and regression
- sampling methods
Regular practice with a variety of question types is essential to build fluency and confidence in applying these mathematical techniques.