Simple Regression Worksheet

Author

Small Group

Published

September 26, 2024

Directions

Work with one or more other students to complete each of the tasks in this document. As part of this, include the syntax you use to complete each task in a script file. As you write your script file, adhere to good coding practices:

  • Include comments
  • Include spaces
  • Include a line break after every pipe operator you use

You will also need to answer some questions in a Word or Google document.


Task 1: Import Data

Import the riverview.csv data into an object named city. Also, examine the data codebook so you are familiar with the different attributes.


Task 2: Marginal Distribution of Seniority Level

Create a density plot of the years of seniority attribute (seniority). You may also want to produce summary statistics for this attribute. Describe the shape, center (i.e., typical value), and variability. Be sure to use the data context in this description.


Task 3: Relationship between Seniority Level and Income

Create a scatterplot of the relationship between seniority level and income. In this plot, treat income as the outcome and seniority level as the predictor. Describe this relationship by indicating the functional form, direction, magnitude, strength, and any potential outliers. Be sure to use the data context in this description.


Task 4: Compute the Correlation Coefficient

Compute and report the correlation coefficient between seniority level and income.


Task 5: Fit the Regression Model

Fit the regression model that uses seniority level to predict variation in income. Write the fitted equation. Be sure you can write the fitted equation using Equation Editor in Microsoft Word/Google Docs. (This includes adding any hats or subscripts!)
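
As a reminder of the notation, the general form of a fitted simple regression equation can be sketched as below. The symbols are assumptions chosen for illustration (they are not prescribed by this worksheet), and the coefficient estimates are left as symbols; replace them with the values from your own fitted model.

```latex
\widehat{\mathrm{Income}}_i = \hat{\beta}_0 + \hat{\beta}_1\left(\mathrm{Seniority}_i\right)
```

The hat over the outcome marks it as a predicted value, the hats over the coefficients mark them as estimates, and the subscript i indexes the individual cases. The fitted equation has no error term.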


Task 6: Coefficient Interpretations

Interpret the intercept and slope from the fitted equation.


Task 7: Compute the Sum of Squared Errors (SSE) for the Fitted Model

Compute the SSE for the model. Include the syntax you used to compute this.
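
For reference, the SSE sums the squared residuals across all n cases. The notation below is an assumption for illustration: y_i is the observed income for case i and \hat{y}_i is the income predicted for that case by the fitted model.

```latex
\mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

In syntax, this typically means computing each residual, squaring it, and then summing the squared values.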


Task 8: Fit an Intercept-Only Model and Compute the SSE for It

Fit an intercept-only model predicting variation in income. Use that model to compute the SSE. Include the syntax you used to compute this.
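
As a check on your work, recall that the least-squares estimate of the intercept in an intercept-only model is the sample mean of the outcome, so every case receives the same predicted value. The sketch below uses assumed notation: y_i is the observed income for case i, \bar{y} is the mean income, and the subscript 0 is a label introduced here for the intercept-only model.

```latex
\hat{y}_i = \hat{\beta}_0 = \bar{y}
\qquad\Longrightarrow\qquad
\mathrm{SSE}_0 = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2
```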


Task 9: Compute the Proportion Reduction in Error (PRE)

Use the two SSE measures to compute the PRE. Show your work. Also interpret the value using the data’s context.
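
The PRE compares the two SSE values computed in Tasks 7 and 8. The subscripts below are labels introduced here for illustration: "intercept-only" refers to the model from Task 8 and "seniority" refers to the model from Task 5.

```latex
\mathrm{PRE} = \frac{\mathrm{SSE}_{\text{intercept-only}} - \mathrm{SSE}_{\text{seniority}}}{\mathrm{SSE}_{\text{intercept-only}}}
```

For a simple regression model fitted by least squares, this value equals R-squared, which is also the square of the correlation coefficient from Task 4, so you can use that relationship to check your result.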