Day 27
Assumptions for Linear Regression
EPSY 5261 : Introductory Statistical Methods
Learning Goals
At the end of this lesson, you should be able to …
- Check assumptions for a linear model.
Assumptions for a Linear Model
- L — Pattern is linear (plot your data)
- I — Observations are independent of each other (i.e., at a particular X value, one observation’s Y value does not affect other observations’ Y values)
- N — At each X value, the Y values are normally distributed around the line (equivalently, the residuals are normally distributed)
- Some also add “no outliers”
- E — Equal variance
- Variability in Y is constant at each X value (no fanning out of residuals on residual plot)
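Most of these checks are made on the residuals (observed Y minus predicted Y), so it can help to see how residuals are computed. The following is a minimal sketch in plain Python, not the course's own tooling; the temperature and coffee sales numbers are made up for illustration, and the function names `fit_line` and `residuals` are hypothetical.

```python
# Hypothetical data (invented for illustration): daily temperature (X)
# and coffee sales (Y).
temperature = [50, 55, 60, 65, 70, 75, 80]
coffee_sales = [410, 395, 372, 350, 341, 318, 300]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Residual = observed Y minus the Y predicted by the line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


intercept, slope = fit_line(temperature, coffee_sales)
resid = residuals(temperature, coffee_sales, intercept, slope)

# Print each residual next to its X value; a residual plot shows the same pairs.
for t, r in zip(temperature, resid):
    print(f"temperature={t}  residual={r:+.1f}")
```

Plotting these residuals against X gives the residual plot used to judge linearity and equal variance.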
Linearity
Independence
- Consider all the coffee sales at a particular temperature
- Reasonable to assume that one day’s coffee sales do not affect another day’s sales
Normality
Equal Variance
- Should see that the range of residuals is constant (about the same) at each X value
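As a rough numeric companion to the visual check, the range of the residuals can be compared across low and high X values. This is a sketch under stated assumptions: the residuals below are invented to show fanning out, and `spread_by_half` is a hypothetical helper, not a standard test for equal variance.

```python
# Hypothetical X values and residuals (invented for illustration);
# the residuals fan out as X increases.
x_values = [1, 2, 3, 4, 5, 6, 7, 8]
resid = [0.5, -0.4, 0.6, -0.7, 2.0, -2.5, 3.1, -3.4]


def residual_range(values):
    """Range = largest residual minus smallest residual."""
    return max(values) - min(values)


def spread_by_half(x, resid):
    """Sort by X, split into a lower and an upper half, and return the residual range in each."""
    pairs = sorted(zip(x, resid))
    middle = len(pairs) // 2
    low = [r for _, r in pairs[:middle]]
    high = [r for _, r in pairs[middle:]]
    return residual_range(low), residual_range(high)


low_range, high_range = spread_by_half(x_values, resid)
print(f"range of residuals, lower X half: {low_range:.1f}")
print(f"range of residuals, upper X half: {high_range:.1f}")
```

A much larger range in one half than in the other suggests the equal variance assumption may not hold; similar ranges are consistent with it.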
A Residual Plot Showing Unequal Variance
Residual Plot to Evaluate Linearity
- Random pattern around the line residual = 0
- Means your model is not systematically over- or under-predicting
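Systematic over- or under-prediction can also be seen numerically by averaging the residuals within low, middle, and high X values. The sketch below assumes curved data, y = x² for x = 0 to 6, fitted with a straight line; `mean_residual_by_third` is a hypothetical helper for illustration.

```python
# Residuals from fitting a straight line to the curved data y = x**2
# for x = 0..6 (the least-squares line for these points is y = -5 + 6x).
x_values = [0, 1, 2, 3, 4, 5, 6]
resid = [5, 0, -3, -4, -3, 0, 5]


def mean_residual_by_third(x, resid):
    """Sort by X, split into low / middle / high groups, and return each group's mean residual."""
    pairs = sorted(zip(x, resid))
    n = len(pairs)
    cut1, cut2 = n // 3, n - n // 3
    groups = [pairs[:cut1], pairs[cut1:cut2], pairs[cut2:]]
    return [sum(r for _, r in group) / len(group) for group in groups]


low_mean, mid_mean, high_mean = mean_residual_by_third(x_values, resid)
print(f"mean residual, low X:    {low_mean:+.2f}")
print(f"mean residual, middle X: {mid_mean:+.2f}")
print(f"mean residual, high X:   {high_mean:+.2f}")
```

Here the line under-predicts at low and high X (positive mean residuals) and over-predicts in the middle (negative mean residual). With a random pattern around residual = 0, all three means would be close to 0.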
A Residual Plot Showing Non-linearity
Assumptions for Linear Regression Activity
Summary
- For hypothesis test results for regression to be valid, we need to meet the following assumptions:
- Linearity
- Independence
- Normality
- Equal variances
- The mnemonic LINE can be used to help remember the assumptions.