Statistical Inference
Introduction to Inference
Statistical inference uses sample data to draw conclusions about a population. There are two main types: confidence intervals (estimating a parameter) and hypothesis tests (testing a claim about a parameter).
Confidence Intervals
A confidence interval provides a range of plausible values for an unknown population parameter, along with a level of confidence.
General Form
$$\text{statistic} \pm (\text{critical value}) \times (\text{standard error of the statistic})$$
Confidence Level
The confidence level (typically 90%, 95%, or 99%) is the long-run proportion of intervals that would capture the true parameter if the sampling process were repeated many times. It does not mean there is a 95% probability that the specific interval contains the parameter.
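The long-run reading of the confidence level can be checked by simulation. The sketch below is illustrative only: the true proportion, sample size, number of trials, and the helper name `coverage_rate` are all assumptions chosen for the example, and it uses the standard large-sample interval for a proportion.

```python
import random
from statistics import NormalDist


def coverage_rate(p_true=0.5, n=200, level=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals that contain p_true (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count successes.
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - margin <= p_true <= p_hat + margin:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"share of intervals capturing the true proportion: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true proportion or it does not; the 95% describes the procedure, not one interval.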
Confidence Interval for a Proportion
Conditions:
- Random sample (or random assignment)
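- $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$ (large enough sample)
- Population is at least $10n$ (independence / 10% condition)
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Where $\hat{p}$ is the sample proportion and $z^*$ is the critical value (1.645 for 90%, 1.960 for 95%, 2.576 for 99%).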
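As a minimal sketch of the one-proportion interval, the function below computes the sample proportion, the critical value, and the margin of error. The function names and the example counts (120 successes in 200 trials) are made up for illustration, and the large-counts threshold of 10 is an assumed convention.

```python
from math import sqrt
from statistics import NormalDist


def large_counts_ok(successes, n, threshold=10):
    """Check the large-sample condition (threshold of 10 is an assumed convention)."""
    return successes >= threshold and (n - successes) >= threshold


def proportion_ci(successes, n, level=0.95):
    """Large-sample z-interval for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Illustrative data: 120 successes out of 200.
assert large_counts_ok(120, 200)
low, high = proportion_ci(successes=120, n=200, level=0.95)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

With these numbers the interval is roughly 0.532 to 0.668.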
Confidence Interval for a Mean
Conditions:
- Random sample
- Population distribution is approximately normal (check with normal probability plot, or $n \ge 30$ for CLT)
- $\sigma$ known: $z$-interval; $\sigma$ unknown: $t$-interval
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
Where $t^*$ is the critical value from the $t$-distribution with $n - 1$ degrees of freedom.
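The t-interval for a mean can be sketched with the standard library alone. Python's standard library has no t-distribution, so this example takes the critical value as an input: 2.262 is the tabled 95% value for 9 degrees of freedom. The data and the name `mean_t_interval` are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def mean_t_interval(data, t_star):
    """t-interval for a mean; t_star must match df = len(data) - 1."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Illustrative sample of 10 measurements, so df = 9.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]
T_STAR_DF9_95 = 2.262  # from a t-table: 95% confidence, df = 9
low, high = mean_t_interval(sample, T_STAR_DF9_95)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

In practice the critical value usually comes from a table or from statistical software rather than being typed in by hand.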
Interpreting Confidence Intervals
- “We are 95% confident that the true population proportion is between 0.42 and 0.58” means that the method produces an interval that captures the true parameter 95% of the time
- The margin of error decreases with larger sample size and lower confidence level
- Wider intervals give more confidence but less precision
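The trade-offs in the list above can be seen numerically. This is a small sketch for a proportion interval; the value 0.5 for the sample proportion and the sample sizes are arbitrary choices for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, level):
    """Margin of error of a large-sample interval for a proportion."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


# Larger n shrinks the margin; higher confidence widens it.
for n in (100, 400, 1600):
    row = ", ".join(
        f"{int(level * 100)}%: {margin_of_error(0.5, n, level):.4f}"
        for level in (0.90, 0.95, 0.99)
    )
    print(f"n = {n:4d} -> {row}")
```

Because the margin is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the margin of error.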
Hypothesis Testing
A hypothesis test evaluates whether the observed sample data provides evidence for or against a claim about a population parameter.
Structure of a Hypothesis Test
- State hypotheses: Null ($H_0$) and alternative ($H_a$)
- Check conditions: Determine the appropriate test
- Calculate the test statistic: Measure how far the observed statistic is from the null value
- Find the p-value: Probability of obtaining a result at least as extreme as the one observed, assuming $H_0$ is true
- Make a decision: Compare the p-value to the significance level $\alpha$
- State conclusion in context
Null and Alternative Hypotheses
- $H_0$ (null hypothesis): The “status quo”: the parameter equals a specific value (e.g., $p = p_0$, $\mu = \mu_0$)
- $H_a$ (alternative hypothesis): What we are trying to find evidence for (e.g., $p \ne p_0$, $p > p_0$, $p < p_0$)
Significance Level ($\alpha$)
The significance level $\alpha$ is the threshold for deciding whether the evidence against $H_0$ is strong enough. Common values: $\alpha = 0.01, 0.05, 0.10$.
p-Value
The p-value is the probability of obtaining a test statistic as extreme as or more extreme than the observed value, assuming $H_0$ is true.
- Small p-value ($p < \alpha$): Strong evidence against $H_0$; reject $H_0$
- Large p-value ($p \ge \alpha$): Insufficient evidence against $H_0$; fail to reject $H_0$
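For a test statistic that is approximately standard normal under the null hypothesis, the p-value is a tail area. The sketch below shows one way to compute it for each form of the alternative. The function names and the strings used for the `alternative` argument are illustrative choices, and the decision rule assumes the convention "reject when the p-value is below the significance level".

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="two-sided"):
    """Tail probability of a standard normal test statistic."""
    cdf = NormalDist().cdf
    if alternative == "greater":    # Ha: parameter > null value
        return 1 - cdf(z)
    if alternative == "less":       # Ha: parameter < null value
        return cdf(z)
    if alternative == "two-sided":  # Ha: parameter != null value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


def decide(p_value, alpha=0.05):
    """Compare the p-value to the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


p = p_value_from_z(2.10, alternative="two-sided")
print(f"p-value = {p:.4f}: {decide(p)}")
```

A two-sided p-value is twice the one-sided tail area, so the same z-statistic gives weaker evidence against the null hypothesis when the alternative is two-sided.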
Type I and Type II Errors
| Decision | $H_0$ True | $H_0$ False |
|---|---|---|
| Reject $H_0$ | Type I Error ($\alpha$) | Correct (Power = $1 - \beta$) |
| Fail to Reject $H_0$ | Correct | Type II Error ($\beta$) |
- Type I Error: Rejecting $H_0$ when it is actually true (false positive). Probability = $\alpha$
- Type II Error: Failing to reject $H_0$ when it is actually false (false negative). Probability = $\beta$
- Power ($1 - \beta$): The probability of correctly rejecting a false $H_0$ (detecting a real effect)
Power increases with: larger sample size, larger effect size, higher $\alpha$, and lower $\sigma$ (less variability in the population).
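Power can be computed exactly in one simple setting, which makes these relationships easy to check. The sketch below assumes a one-sided z-test for a mean with known population standard deviation (null mean `mu0`, true mean `mu_true`, upper-tail alternative). That setting, the function name, and all numbers are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test for a mean with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)         # rejection cutoff under H0
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # true effect in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)      # P(reject H0 | mu = mu_true)


base = power_one_sided_z(mu0=100, mu_true=103, sigma=10, n=25)
print(f"baseline power:       {base:.3f}")
print(f"larger sample (n=100): {power_one_sided_z(100, 103, 10, 100):.3f}")
print(f"larger effect (106):   {power_one_sided_z(100, 106, 10, 25):.3f}")
print(f"higher alpha (0.10):   {power_one_sided_z(100, 103, 10, 25, alpha=0.10):.3f}")
print(f"less variability (5):  {power_one_sided_z(100, 103, 5, 25):.3f}")
```

Each change raises the power above the baseline. When the true mean equals the null mean, the function returns the significance level, which is the Type I error rate.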
Tests for Proportions
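One-Sample z-Test for a Proportion
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
Where $p_0$ is the proportion claimed by $H_0$. The standard error uses $p_0$, not $\hat{p}$, because the test statistic is computed assuming $H_0$ is true.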
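A minimal sketch of the one-sample z-test for a proportion follows. The standard error is computed from the null value, since the calculation assumes the null hypothesis is true. The function name and the example data (58 successes in 100 trials, null value 0.5) are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0."""
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se_null
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    else:  # two-sided
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value


# Illustrative test of H0: p = 0.5 against Ha: p > 0.5.
z, p_value = one_proportion_z_test(successes=58, n=100, p0=0.5, alternative="greater")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Here z is 1.60 and the p-value is about 0.055, so at a 0.05 significance level the conclusion would be to fail to reject the null hypothesis.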
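Two-Sample z-Test for Difference of Proportions
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Where $\hat{p}_c = \dfrac{x_1 + x_2}{n_1 + n_2}$ is the pooled proportion, and $x_1$, $x_2$ are the success counts in the two samples.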
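The pooled two-proportion test can be sketched as follows. Pooling is appropriate because the null hypothesis says the two population proportions are equal, so both samples estimate the same value. The function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test of H0: p1 = p2. Returns (z, p_value)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # combined success rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts: 45 of 100 in group 1, 30 of 100 in group 2.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```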
Tests for Means
One-Sample t-Test
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Degrees of freedom: $df = n - 1$
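The one-sample t statistic needs only the sample mean, the sample standard deviation, and the sample size. The standard library has no t-distribution, so this sketch compares the statistic with a tabled critical value (2.306 for a two-sided test at the 0.05 level with 8 degrees of freedom) instead of computing a p-value. The data and function name are invented.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """Return (t, df) for H0: mu = mu0."""
    n = len(data)
    standard_error = stdev(data) / sqrt(n)  # stdev uses the n - 1 denominator
    t = (mean(data) - mu0) / standard_error
    return t, n - 1


# Illustrative sample of 9 values, testing H0: mu = 5.0 (two-sided).
sample = [5.2, 4.9, 5.6, 5.1, 5.4, 5.0, 5.3, 5.5, 5.2]
t_stat, df = one_sample_t(sample, mu0=5.0)
T_CRIT_DF8 = 2.306  # from a t-table: two-sided, alpha = 0.05, df = 8
reject = abs(t_stat) > T_CRIT_DF8
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject}")
```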
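Two-Sample t-Test
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Use conservative degrees of freedom: $df = \min(n_1 - 1, n_2 - 1)$, or technology.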
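The sketch below computes the unpooled two-sample t statistic with both degrees-of-freedom choices. For the "technology" option it assumes the Welch-Satterthwaite approximation, which is what most software reports for an unpooled test. The data and function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(sample1, sample2):
    """Return (t, conservative_df, welch_df) for H0: mu1 = mu2."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1) / n1, variance(sample2) / n2  # s^2 / n terms
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    conservative_df = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation (assumed to be what "technology" uses).
    welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, conservative_df, welch_df


group_a = [23, 25, 28, 30, 26, 27]
group_b = [20, 22, 24, 21, 23]
t_stat, df_cons, df_welch = two_sample_t(group_a, group_b)
print(f"t = {t_stat:.3f}, conservative df = {df_cons}, Welch df = {df_welch:.2f}")
```

The conservative choice never exceeds the Welch value, so it gives a larger critical value and a test that rejects slightly less often.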
Paired t-Test
For matched pairs or before/after data: compute the differences and run a one-sample t-test on the differences.
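A paired test reduces to a one-sample test on the differences, as this short sketch shows. The before/after values and the function name are invented, and the null hypothesis is taken to be a mean difference of zero.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df, diffs) for H0: mean difference = 0."""
    if len(before) != len(after):
        raise ValueError("paired data need equal lengths")
    diffs = [b - a for b, a in zip(before, after)]  # one difference per pair
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1, diffs


before = [140, 152, 138, 145, 150, 147]
after = [135, 150, 136, 140, 147, 146]
t_stat, df, diffs = paired_t(before, after)
print(f"differences = {diffs}, t = {t_stat:.2f}, df = {df}")
```

The degrees of freedom come from the number of pairs, not the total number of measurements.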
Chi-Square Tests
Goodness of Fit
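Tests whether observed frequencies match expected frequencies based on a specified distribution.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where $O$ is an observed count and $E$ is the corresponding expected count.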
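A goodness-of-fit statistic sums the squared gaps between observed and expected counts, each scaled by the expected count. The sketch below uses made-up counts from 100 rolls of a die and tests them against a fair-die model; the function name is hypothetical. The degrees of freedom are the number of categories minus one.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi_square, df, expected_counts) for a goodness-of-fit test."""
    total = sum(observed)
    expected = [total * p for p in expected_props]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, len(observed) - 1, expected


# Illustrative counts for faces 1..6 over 100 rolls; H0: the die is fair.
observed = [18, 22, 16, 14, 12, 18]
chi_sq, df, expected = chi_square_gof(observed, [1 / 6] * 6)
print(f"chi-square = {chi_sq:.2f}, df = {df}")
```

For these counts the statistic is about 3.68 with 5 degrees of freedom, well below the tabled 0.05 critical value of 11.07, so the data are consistent with a fair die.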
Test for Independence
Tests whether two categorical variables are independent using a two-way table.
Expected count for each cell:
$$E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$$
Conditions: All expected counts $\ge 5$.
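The expected counts and the chi-square statistic for a two-way table can be sketched as below. The table values and function name are invented, the minimum expected count of 5 is the assumed convention for the condition check, and the degrees of freedom use the standard (rows - 1) x (columns - 1) rule.

```python
def chi_square_two_way(table):
    """Return (chi_square, df, expected) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df, expected


# Illustrative 2 x 3 table of counts.
table = [[20, 30, 50],
         [30, 30, 40]]
chi_sq, df, expected = chi_square_two_way(table)
conditions_met = all(e >= 5 for row in expected for e in row)
print(f"chi-square = {chi_sq:.3f}, df = {df}, conditions met: {conditions_met}")
```

The same calculation serves a test for homogeneity; only the sampling design and the wording of the hypotheses differ.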
Test for Homogeneity
Tests whether the distribution of one categorical variable is the same across several populations. Same formula and conditions as the test for independence, but sampling is from separate populations rather than one population classified two ways.
Inference for Regression Slope
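Hypothesis Test for Slope
$$t = \frac{b - \beta_0}{SE_b}$$
Where $b$ is the sample slope, $\beta_0$ is the hypothesized slope (usually 0, meaning no linear relationship), and $SE_b$ is the standard error of the slope. Degrees of freedom: $df = n - 2$.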
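A t statistic for a regression slope can be computed from raw data with ordinary least squares. The sketch below assumes the usual simple linear regression setup: the slope's standard error is the residual standard deviation divided by the square root of the sum of squared x-deviations, with n - 2 degrees of freedom. The data and the function name are invented.

```python
from math import sqrt
from statistics import mean


def slope_t_statistic(x, y, beta0=0.0):
    """Return (slope, se_slope, t, df) for H0: true slope = beta0."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard deviation uses n - 2 because two parameters are estimated.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_resid = sqrt(sse / (n - 2))
    se_slope = s_resid / sqrt(sxx)
    return slope, se_slope, (slope - beta0) / se_slope, n - 2


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, se_slope, t_stat, df = slope_t_statistic(x, y)
print(f"slope = {slope:.2f}, SE = {se_slope:.4f}, t = {t_stat:.1f}, df = {df}")
```

A large absolute t value relative to the t-distribution with n - 2 degrees of freedom is evidence of a linear relationship between the two variables.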
Common Pitfalls
- Confusing the p-value with the probability that $H_0$ is true
- Saying “accept $H_0$” instead of “fail to reject $H_0$”
- Forgetting to check conditions before performing inference
- Interpreting a confidence interval as “there is a 95% probability the parameter is in this interval”
- Using a z-test for a mean when $\sigma$ is unknown (should use a t-test)
- Confusing Type I and Type II errors