Comparing Data to a Standard

A common task in research is to compare the data you have to a specified standard or value. For example, is the average income for a community higher than the poverty level? Or, is the mean admission rate for institutions of higher learning in the United States higher than 0.50?

In previous chapters, you learned how to compute characteristics of a distribution (e.g., the mean) that allow us to answer these questions about the sample. For example, in Chapter 6, Summarizing and Visualizing Quantitative Attributes, we found that the average admission rate for our sample of 230 institutions of higher learning was 0.68. Based on this, we could say that the average admission rate for our sample of 230 schools was higher than 0.50. But does this conclusion still hold when we move beyond our sample to ALL institutions of higher learning? Is the mean admission rate for ALL institutions of higher learning in the United States higher than 0.50?

Drawing conclusions beyond the data we have is called inference, and the associated methods that allow us to learn from incomplete or imperfect data are referred to as statistical inference (Gelman & Hill, 2007). In this part of the textbook, you will learn about a set of statistical inferential methods that allow you to compare a sample of data to some standard in order to draw inferences about how the population compares to that standard (e.g., is the average income for a community higher than the poverty level?). To answer this type of inferential question, you will learn how we quantify the amount of uncertainty associated with a numerical estimate from our sample when we have incomplete data (i.e., only a sample of data from the population we want to draw inferences about). You will also learn how we then use that quantification in a one-sample hypothesis test to draw an inference about how a population parameter compares to that standard. Finally, you will learn about the potential errors that can be made when conducting hypothesis tests, as well as the assumptions underlying the methods we use to carry out these tests.

Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.