Day 11
Hypothesis Testing with the t-Distribution



EPSY 5261 : Introductory Statistical Methods

Learning Goals

At the end of this lesson, you should be able to …

  • Describe the purpose of a hypothesis test.
  • List the steps of a hypothesis test.
  • Describe a parametric approach to hypothesis testing for a single mean.
  • List the assumptions for using the t-distribution to test a single mean.

Purpose of Hypothesis Testing

To test a claim about a population parameter using data from a sample

Steps of Hypothesis Testing

  1. Formulate a research question
  2. Write your hypotheses
  3. Find the sampling distribution assuming the null hypothesis is true
  4. Compare sample summary to the distribution under the null hypothesis
  5. Get a p-value
  6. Make a decision based on the p-value
  7. Communicate your conclusion in context
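
For a test of a single mean, the hypotheses in step 2 are often written as shown here. The symbol \(\mu_0\) for the null hypothesized value and the two-sided alternative are assumptions made for illustration; the slides do not fix either one.

\[ H_0: \mu = \mu_0 \qquad H_A: \mu \neq \mu_0 \]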

Hypothesis Testing Example

Estimating Sampling Variability

  • Goal: Get an estimate for the sampling variability expected given this sample
    • Simulation (resampling methods)
    • Traditional Parametric Methods (a mathematical function)

Theoretical Distribution

  • Up until now, we have used simulation to approximate the sampling distribution.
  • Now, we will look at theoretical distributions of sample statistics.

Sampling Distribution

  • Recall: when we resampled in the Day 5 activity, the resulting distribution was centered at the sample statistic.
  • In hypothesis testing, we want a distribution centered at the null hypothesized value.
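
The difference between the two centers can be sketched in a few lines of Python (standard library only). The data, the null value, and the shift-then-resample approach are all assumptions made for illustration; the Day 5 activity may have handled this differently.

```python
import random
import statistics

random.seed(5261)  # fixed seed so the sketch is reproducible

# Hypothetical sample and null hypothesized value (made up for illustration)
sample = [98.2, 101.5, 99.1, 103.4, 97.8, 100.9, 102.3, 99.6]
null_value = 98.0

sample_mean = statistics.mean(sample)


def resampled_means(data, reps=2000):
    """Return the means of `reps` resamples drawn with replacement."""
    n = len(data)
    return [statistics.mean(random.choices(data, k=n)) for _ in range(reps)]


# Resampling the raw data: centered near the sample statistic
boot_means = resampled_means(sample)

# Shifting the data so its mean equals the null value, then resampling:
# centered near the null hypothesized value
shifted = [x + (null_value - sample_mean) for x in sample]
null_means = resampled_means(shifted)

print("center of resampled means:", round(statistics.mean(boot_means), 2))
print("center under the null:    ", round(statistics.mean(null_means), 2))
```

Only the center moves; the spread of the two distributions is essentially the same, which is why the sample can still inform the variability expected under the null hypothesis.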

Normal Distributions

  • Normal distributions are bell-shaped, symmetric distributions characterized by:
    • Mean (center)
    • Standard deviation (variability)
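
As a small illustration that the mean and standard deviation are all that is needed to describe a normal distribution, the sketch below uses Python's built-in `statistics.NormalDist`. The two distributions are hypothetical; the point is that the share of values within one standard deviation of the mean is the same for both.

```python
from statistics import NormalDist

# Two hypothetical normal distributions with different centers and spreads
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=100, sigma=15)


def share_within_one_sd(dist):
    """Proportion of the distribution within one SD of its mean."""
    return dist.cdf(dist.mean + dist.stdev) - dist.cdf(dist.mean - dist.stdev)


print(round(share_within_one_sd(narrow), 3))
print(round(share_within_one_sd(wide), 3))
```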

Central Limit Theorem (CLT)

  • Mathematical theorem that informs us about the shape, center, and variation in the sampling distribution (i.e., the distribution of sample statistics)
  • It says that if we know the population parameter (the mean \(\mu\) or the proportion \(\pi\)) and the population standard deviation (\(\sigma\)), then…
    • For random samples with a sufficiently large sample size, the distribution of sample statistics for a mean or a proportion is approximately normally distributed and centered at the value of the population parameter.
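
A quick simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the course materials: it draws samples from a right-skewed (exponential) population with mean 1 and standard deviation 1, and the sample size and number of repetitions are chosen arbitrarily.

```python
import math
import random
import statistics

random.seed(11)  # fixed seed so the sketch is reproducible

n = 50       # size of each random sample (assumed "sufficiently large")
reps = 3000  # number of samples drawn

# random.expovariate(1.0) draws from a skewed population with mean 1, SD 1
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = statistics.mean(sample_means)   # should be close to mu = 1
spread = statistics.stdev(sample_means)  # should be close to sigma / sqrt(n)

print("center of sample means:", round(center, 3))
print("spread of sample means:", round(spread, 3))
print("sigma / sqrt(n):       ", round(1 / math.sqrt(n), 3))
```

Even though the population is skewed, the sample means pile up around the population mean, with a spread close to \(\sigma / \sqrt{n}\).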

t-Distribution

  • The mathematical approximation for our sampling distribution when we don’t know the population standard deviation (\(\sigma\))
  • The t-distribution is very similar in shape to the normal distribution, but with slightly thicker tails
  • The t-distribution is always centered at 0
  • We estimate the variability of the sample mean (the standard error, SE) from the sample standard deviation \(s\) and the sample size \(n\):

\[ SE = \frac{s}{\sqrt{n}} \]
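
The standard error is what lets us place the sample mean on the t-distribution: the test statistic \(t = (\bar{x} - \mu_0) / SE\) measures how many standard errors the sample mean lies from the null hypothesized value. The Python sketch below works through the arithmetic. The data, the null value, and the symbol \(\mu_0\) are assumptions made for illustration.

```python
import math
import statistics

# Hypothetical sample and null hypothesized value (made up for illustration)
sample = [10, 12, 14, 16, 18, 20, 22, 24, 26]
null_value = 15

n = len(sample)
x_bar = statistics.mean(sample)  # sample mean
s = statistics.stdev(sample)     # sample SD (divides by n - 1)

se = s / math.sqrt(n)                # SE = s / sqrt(n)
t_stat = (x_bar - null_value) / se   # distance from the null value in SE units

print("sample mean:", x_bar)
print("SE:", round(se, 3))
print("t:", round(t_stat, 3))
```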

Degrees-of-Freedom

  • The t-distribution is characterized by degrees-of-freedom (df)
  • df is calculated from the sample size; for a test of a single mean, df = n − 1
  • The higher the df, the closer the t-distribution gets to the normal distribution
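
The effect of df on the tails can be checked numerically. The sketch below evaluates the standard t density formula with Python's `math` module and compares it with the standard normal density at a point far from the center. The df values and the evaluation point are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist


def t_pdf(t, df):
    """Density of the t-distribution with `df` degrees of freedom at `t`."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))


point = 3.0  # a value out in the tail
normal_height = NormalDist(mu=0, sigma=1).pdf(point)

for df in (5, 30, 1000):
    print(f"df = {df:>4}: t density at {point} = {t_pdf(point, df):.5f}")
print(f"normal density at {point} = {normal_height:.5f}")
```

The tail density shrinks toward the normal value as df grows, which is the "thicker tails" difference fading with larger samples.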

t-Distributions

Assumptions

  • The values in the population are normally distributed
    • We will check a histogram or density plot of our sample to see whether this is plausible
    • Note: If the sample size is greater than 30, we can generally use the t-distribution even if our sample does not look normally distributed (because of the Central Limit Theorem)
  • The values in the population are independent of each other
    • We will think about this one based on the context

Use RStudio

  • Instead of doing a simulation, we can use the t-distribution to help us get our estimate for the variability in the sampling distribution
  • Functions in RStudio can also give us our p-value
  • We will explore the entire hypothesis test process in today’s activity!
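
To show what such a function computes behind the scenes, here is a minimal Python sketch (standard library only) of a two-sided one-sample t-test. It is a stand-in for illustration, not the R code used in the activity: the data, the null value, the two-sided alternative, and the numerical integration used for the p-value are all assumptions. In R, a built-in such as `t.test()` reports the same quantities directly.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of the t-distribution with `df` degrees of freedom at `t`."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """Area in both tails of the t-distribution beyond |t_stat|.

    The area between 0 and |t_stat| is found with Simpson's rule
    (steps must be even), doubled by symmetry, and subtracted from 1.
    """
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return 1 - central_area


# Hypothetical sample and null hypothesized value (made up for illustration)
sample = [10, 12, 14, 16, 18, 20, 22, 24, 26]
null_value = 15

n = len(sample)
se = statistics.stdev(sample) / math.sqrt(n)          # SE = s / sqrt(n)
t_stat = (statistics.mean(sample) - null_value) / se  # test statistic
df = n - 1                                            # df for a single mean
p_value = two_sided_p_value(t_stat, df)

print("t =", round(t_stat, 3), " df =", df, " p-value =", round(p_value, 3))
```

The p-value is then compared with whatever significance level the analysis has chosen; the slides do not state one, so the sketch stops at reporting the p-value.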

Hypothesis Testing with the t-Distribution Activity

Summary

  • There are many steps to the hypothesis test (see the overview under Steps of Hypothesis Testing).
  • Hypothesis tests help us test a claim while taking into account sampling variability.
  • They provide one form of evidence to help answer a research question.
  • We can use a t-distribution to help us conduct our test as an alternative method to simulation.