Assignment 02

POLYNOMIAL EFFECTS

Published

January 13, 2023

The goal of this assignment is to give you experience fitting, interpreting, and evaluating models with polynomial effects. In this assignment, you will use the data from the file fertility.csv to explain variation in infant mortality rates.

Instructions

This assignment needs to be completed using Quarto. Submit a zipped version of your entire assignment-02 project directory. Your QMD document should include a response to each of the questions below. Please adhere to the following guidelines for further formatting your assignment:

  • All graphics should be resized so that they do not take up more room than necessary and should have an appropriate caption.
  • Any typed mathematics (equations, matrices, vectors, etc.) should be appropriately typeset within the document using Markdown’s equation typesetting.
  • All syntax should be hidden (i.e., not displayed) unless specifically asked for.
  • Any messages or warnings produced (e.g., from loading packages) should also be hidden.
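
As a rough illustration of the typesetting guideline above, a fitted model with a quadratic effect might be written in display math along the following lines. The variable labels and the symbolic coefficients are placeholders chosen for this sketch; in your own document you would substitute the numerical estimates from your fitted model.

```latex
$$
\widehat{\text{Infant Mortality}}_i = \hat{\beta}_0 + \hat{\beta}_1(\text{Education}_i) + \hat{\beta}_2(\text{Education}_i^2)
$$
```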

For each question, specify the question number (e.g., Question 2) using a level-2 (or smaller) header. This assignment is worth 15 points.


Model 1: Linear Effect of Female Education Level

  1. Create a scatterplot showing the relationship between female education level and infant mortality rates.

  2. Describe the relationship between female education level and infant mortality rates. Be sure to comment on the structural form, direction and strength of the relationship. Also comment on any potential observations that deviate from following this relationship (unusual observations or clusters of observations).

  3. Compute and report the Pearson correlation coefficient between female education level and infant mortality rate. Based on your response to Question 2, explain whether the Pearson correlation coefficient is an appropriate summary measure of the relationship.

  4. Regress infant mortality rates on female education level. For this model, posit a linear effect of female education level on infant mortality rate (Model 1). Create the scatterplot of the standardized residuals versus the fitted values from Model 1.

  5. Does this plot suggest a problem with meeting the assumption that the average residual is zero at each fitted value? Explain.


Model 2: Quadratic Effect of Female Education Level

  6. Regress infant mortality rates on female education level. For this model, posit a quadratic effect of female education level on infant mortality rate (Model 2). Write the fitted equation using Equation Editor (or some other program that correctly types mathematical expressions).

  7. Compute, report, and interpret the likelihood ratio between Model 2 and Model 1.

  8. Carry out a likelihood ratio test to compare Model 1 and Model 2. Report the results from this test in a nicely formatted table.
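
The likelihood ratio and the likelihood ratio test for the linear and quadratic models can be sketched as below. This is a minimal standard-library Python illustration on synthetic values, not the course's prescribed workflow. The helper names `ols_rss` and `log_lik` are hypothetical, and the log-likelihood assumes normally distributed errors with the maximum-likelihood variance estimate RSS/n.

```python
import math

def ols_rss(columns, y):
    """Least-squares fit of y on the given predictor columns plus an intercept.
    Returns (coefficients, residual sum of squares)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
         + [sum(X[i][j] * y[i] for i in range(n))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    rss = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
              for i in range(n))
    return beta, rss

def log_lik(rss, n):
    """Maximized Gaussian log-likelihood, using the ML variance estimate rss / n."""
    return -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)

# Synthetic stand-in values (NOT from fertility.csv); names are hypothetical.
educ = [1.0, 2.5, 3.0, 4.5, 5.0, 6.5, 7.0, 8.5, 9.0, 10.5]
infant = [95.0, 70.0, 62.0, 45.0, 40.0, 28.0, 25.0, 18.0, 16.0, 14.0]
n = len(infant)

beta1, rss1 = ols_rss([educ], infant)                           # Model 1
beta2, rss2 = ols_rss([educ, [x ** 2 for x in educ]], infant)   # Model 2
ll1, ll2 = log_lik(rss1, n), log_lik(rss2, n)

lr = math.exp(ll2 - ll1)   # likelihood ratio, Model 2 relative to Model 1
chi2 = 2 * (ll2 - ll1)     # likelihood ratio test statistic
# Upper-tail chi-squared probability; this closed form is valid for df = 1 only,
# which matches the single extra (quadratic) term in Model 2.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"LR = {lr:.3g}, chi-squared(1) = {chi2:.2f}, p = {p_value:.3g}")
```

A likelihood ratio above 1 indicates that the data are more likely under Model 2 than under Model 1; the test statistic is twice the difference in log-likelihoods.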


Model 3: Control for Differences in Gross National Income (GNI)

  9. Regress infant mortality rates on female education level. For this model, posit a quadratic effect of female education level on infant mortality rate, and also control for differences in Gross National Income (Model 3). (Use all four levels of GNI.) Write the fitted equation using Equation Editor (or some other program that correctly types mathematical expressions).

  10. Carry out a likelihood ratio test to compare Model 2 and Model 3. Add the results from this test to the table you created in Question #8.
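
One way to think about controlling for a four-level categorical predictor is through indicator (dummy) coding, sketched below in standard-library Python. The GNI level labels, the choice of reference level, and the helper names `dummy_code` and `chi2_sf_df3` are assumptions made for this illustration; the labels in fertility.csv may differ.

```python
import math

# Hypothetical GNI labels; the actual level names in fertility.csv may differ.
LEVELS = ["Low", "Lower-middle", "Upper-middle", "High"]

def dummy_code(values, levels, reference):
    """Return one 0/1 indicator column for each non-reference level."""
    return {lev: [1.0 if v == lev else 0.0 for v in values]
            for lev in levels if lev != reference}

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-squared variable with 3 df.
    This closed form holds only for df = 3."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

# A few made-up observations, for illustration only.
gni = ["Low", "High", "Upper-middle", "Low", "Lower-middle", "High"]
dummies = dummy_code(gni, LEVELS, reference="Low")

# Model 3 adds one coefficient per indicator to Model 2, so the likelihood
# ratio test comparing the two models has this many degrees of freedom.
df = len(dummies)
print(df)  # 3
```

Given the test statistic from your own fitted models, `chi2_sf_df3` would return the corresponding p-value under these assumptions.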


Adopting a Model

  11. Based on the results of the two likelihood ratio tests, which model will you adopt? Explain.

  12. Create the density plot of the marginal distribution of the standardized residuals for your adopted model, as well as the scatterplot of the standardized residuals versus the fitted values. Place these plots side-by-side in your printed document and, for the purposes of captioning, etc., treat them as two subfigures within a single figure.

  13. Based on the plots you created in Question 12, evaluate and comment on the tenability of each of the model assumptions.

Presenting the Results

  14. Mimic the format and structure of either of the first two tables in the Presenting Results from Many Fitted Regression Models section of the document Creating Tables to Present Statistical Results to create a table to present the numerical information from the three models you fitted in this assignment. Make sure the table you create also has an appropriate caption. If the table is too wide, change the page orientation in your word processing program to “Landscape”, rather than changing the size of the font. (Note: Only this table should be presented in landscape orientation, not your entire assignment!)

  15. Create a publication-quality plot that displays the fitted curves from Model 3. Display four separate lines to show the effect of Gross National Income. The four lines should be displayed using different linetypes or colors (or both) so that they can be easily differentiated in the plot.
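
The values behind the four fitted curves can be sketched as follows. The coefficients, GNI labels, and education range here are placeholders invented for illustration; they are not estimates from fertility.csv, and you would substitute the values from your own Model 3 fit.

```python
# Placeholder coefficients for illustration only; they are NOT estimates
# from fertility.csv. Substitute the values from your own Model 3 fit.
b0, b_educ, b_educ2 = 90.0, -12.0, 0.5

# Hypothetical GNI labels and shifts relative to a reference level ("Low").
gni_shift = {"Low": 0.0, "Lower-middle": -5.0, "Upper-middle": -9.0, "High": -12.0}

# Assumed grid of education values from 0.0 to 12.0 in steps of 0.5.
grid = [i * 0.5 for i in range(25)]

# One fitted curve per GNI level: the same quadratic in education,
# shifted vertically by that level's coefficient.
curves = {
    level: [b0 + b_educ * x + b_educ2 * x ** 2 + shift for x in grid]
    for level, shift in gni_shift.items()
}

for level, values in curves.items():
    print(f"{level:13s} first = {values[0]:6.1f}   last = {values[-1]:6.1f}")
```

Because Model 3 includes GNI as a main effect only, the four curves share the same shape and differ by a constant vertical offset. A plotting library would then draw one line per entry of `curves`, each with its own linetype or color.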