The goal of this assignment is to give you more experience evaluating the assumptions underlying regression models. Submit your responses to each of the questions below in a printed document. All graphics should be resized so that they take up no more room than necessary, and each should have an appropriate caption. This assignment is worth 15 points. (Each question is worth 1 point unless otherwise noted.)
Research. Teaching. Service. This is the trifecta upon which almost every university instructor is evaluated and, ultimately, compensated. One way in which academic administrators judge teaching quality is through teachers’ course evaluations. While we know evaluation scores are not perfect measures of teaching quality, they nonetheless play a role in the tenure and promotion process. Unfortunately, many factors unrelated to teaching are also associated with evaluation scores (e.g., professor’s ethnicity, professor’s sex).
For this part of the assignment, you will examine whether instructor attractiveness explains differences in course evaluation scores, and thus differences in earnings. To do so, you will use the data in the evaluations.csv file to fit a regression model that uses professors’ beauty ratings to predict the variation in course evaluation ratings.
Fit the regression model to predict the variation in course evaluation ratings using professors’ beauty ratings. You will use the output from the fitted model to answer the questions in Part I.
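As a reminder of what fitting this one-predictor model involves, the following is a minimal sketch of the least-squares computation in plain Python. The function name `fit_simple_ols` and the five data points are hypothetical and made up for illustration; they are not taken from evaluations.csv, and the statistical software used in your course will do this work for you.

```python
# Minimal sketch of simple least-squares regression (one predictor).
# The data are made up for illustration; they are NOT from evaluations.csv.

def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and sum of cross-products of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical beauty ratings (predictor) and evaluation scores (outcome)
beauty = [-1.0, -0.5, 0.0, 0.5, 1.0]
evaluation = [3.6, 3.9, 4.0, 4.1, 4.4]

intercept, slope = fit_simple_ols(beauty, evaluation)

# Fitted values and raw residuals, which the diagnostic plots are built from
fitted = [intercept + slope * b for b in beauty]
residuals = [e - f for e, f in zip(evaluation, fitted)]
print(intercept, slope)
```

The fitted values and residuals computed at the end are the quantities the assumption checks in Part I rely on.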
Create and include the density plot for the outcome. Does the distribution foreshadow problems for the normality assumption? Explain.
Create and include the scatterplot of the outcome vs. the predictor. Include the loess smoother in the plots. Does this relationship foreshadow problems for the linearity assumption? Explain.
Create and include the density plot of the marginal distribution of the standardized residuals from the fitted model. Add the confidence envelope for the normal distribution. Does this plot suggest problems about meeting the normality assumption? Explain.
Create and include the scatterplot of the standardized residuals versus the fitted values from the fitted model. In the plot, identify any observations with extreme residuals (\(\leq-3\) or \(\geq3\)) by labeling each with its row number.
Does this plot suggest problems about meeting the linearity assumption? Explain.
Does this plot suggest problems about meeting the homogeneity of variance assumption? Explain.
Is the independence assumption tenable? Explain.
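The rule for flagging extreme observations in Part I can be sketched as follows. This sketch assumes that "standardized residual" means the raw residual divided by its estimated standard error, using the leverage of each observation; if your course materials define it differently, use that definition instead. The function name and the data are hypothetical: 20 made-up points near a line, with one point deliberately shifted upward.

```python
import math


def standardized_residuals(x, y):
    """Standardized residuals for a one-predictor least-squares fit.

    Assumed definition: e_i / (sigma * sqrt(1 - h_i)), where h_i is the leverage.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom
    sigma = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))
    # Leverage of each observation in simple regression
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [e / (sigma * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]


# Hypothetical data: 20 observations near a line
x = list(range(20))
y = [3.0 + 0.05 * i + (0.05 if i % 2 == 0 else -0.05) for i in x]
y[10] += 1.5  # artificial outlier (row 11 when counting rows from 1)

std_resid = standardized_residuals(x, y)

# Row numbers (counting from 1) whose standardized residual is <= -3 or >= 3
extreme_rows = [row for row, r in enumerate(std_resid, start=1) if abs(r) >= 3]
print(extreme_rows)
```

Only the rows collected in `extreme_rows` would be labeled in the residual plot; all other points are left unlabeled.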
Human overpopulation is a growing concern and has been associated with depletion of Earth’s natural resources (water is a big one) and degradation of the environment. This, in turn, has social and economic consequences, such as global tension over resources like water and food, a higher cost of living, and higher unemployment rates. For this part of the assignment, you will use the file fertility.csv to fit a model in order to explore the effects of contraceptive use on fertility rates.
Fit the regression model to predict the variation in fertility rates using contraception use, female education, and infant mortality rate (three predictors). You will use the output from the fitted model to answer the questions in Part II.
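With three predictors, the least-squares coefficients solve the normal equations \(\mathbf{X}^\top\mathbf{X}\,\mathbf{b} = \mathbf{X}^\top\mathbf{y}\). The sketch below illustrates this in plain Python under stated assumptions: the function names are hypothetical, the eight rows are made up rather than read from fertility.csv, and the outcome is generated from known coefficients so that the fit should recover them. In practice you would use your statistical software rather than code like this.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    X = [[1.0] + list(row) for row in predictors]  # add intercept column
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical rows: (contraception use, female education, infant mortality)
rows = [(10, 2, 90), (30, 8, 30), (50, 4, 70), (70, 12, 50),
        (90, 6, 10), (20, 10, 60), (60, 3, 20), (80, 9, 80)]

# Made-up outcome generated from known coefficients (no noise)
fertility = [6.0 - 0.03 * c - 0.1 * e + 0.01 * m for c, e, m in rows]

coefs = fit_ols(rows, fertility)
print(coefs)
```

Because the made-up outcome contains no noise, the estimated coefficients should match the generating values up to rounding error; real data will not behave this way.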
Create and include the density plot for the outcome. Does the distribution foreshadow problems for the normality assumption? Explain.
Create and include the scatterplot of the outcome vs. each predictor (three total). Include the loess smoother in each of the plots. Do any of these relationships foreshadow problems for the linearity assumption? Explain. (2 pts.)
Create and include the density plot of the marginal distribution of the standardized residuals from the fitted model. Add the confidence envelope for the normal distribution. Does this plot suggest problems about meeting the normality assumption? Explain.
Create and include the scatterplot of the standardized residuals versus the fitted values from the fitted model. In the plot, identify any observations with extreme residuals (\(\leq-3\) or \(\geq3\)) by labeling each with the country associated with that observation.
Does this plot suggest problems about meeting the linearity assumption? Explain.
Does this plot suggest problems about meeting the homogeneity of variance assumption? Explain.
Is the independence assumption tenable? Explain.
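For reference, one common definition of the standardized residual in a model with several predictors is given below. Treat it as an assumption about the definition in use and check it against your course materials or software documentation. Here \(e_i\) is the raw residual for observation \(i\), \(h_{ii}\) is its leverage, \(n\) is the number of observations, \(p\) is the number of predictors (\(p = 3\) in Part II), and \(\mathbf{X}\) is the design matrix including the intercept column.

\[
r_i = \frac{e_i}{\hat{\sigma}\sqrt{1 - h_{ii}}}, \qquad
\hat{\sigma}^2 = \frac{\sum_{j=1}^{n} e_j^2}{n - p - 1}, \qquad
h_{ii} = \left[\mathbf{X}\left(\mathbf{X}^\top\mathbf{X}\right)^{-1}\mathbf{X}^\top\right]_{ii}
\]

Under this definition, an observation is flagged as extreme when \(r_i \leq -3\) or \(r_i \geq 3\).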