Day 23
Introduction to Correlation



EPSY 5261 : Introductory Statistical Methods

Learning Goals

At the end of this lesson, you should be able to …

  • Interpret scatterplots.
  • Explain when to use correlation to describe a relationship between variables.
  • Interpret a correlation.

Scatterplots

Example

Say you want to use the temperature to predict coffee sales at an outdoor stadium.

Scatterplot

  • A plot of the relationship between two quantitative variables.
    • Explanatory variable: The variable you want to use as a predictor (goes on x-axis)
    • Response variable: The variable you want to predict (goes on y-axis)

What pattern(s) do you see?

  • Form/Trend?
  • Direction?
  • Strength?

Correlation

Correlation Coefficient (r)

  • Quantifies the strength and direction of the linear relationship between two quantitative variables.
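The slides do not give a formula for r. One common textbook form is \(r = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}\). The sketch below computes it directly; the function name `pearson_r` and the `hours`/`score` values are made up for illustration and are not course data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator pieces: sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values (hypothetical, not from the course).
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 75]
r = pearson_r(hours, score)
print(round(r, 3))
```

Because the numerator and denominator are built from the same deviations, the result always lands between -1 and +1.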

Car Correlation Example

Interpreting a Correlation

  • Direction: Positive or negative?
    • If positive: As X increases, Y tends to increase.
    • If negative: As X increases, Y tends to decrease.
  • How strong is the linear relationship?
    • Weak (closer to 0)
    • Moderate (somewhere in the middle, roughly \(\pm 0.4\) to \(\pm 0.7\))
    • Strong (closer to \(\pm1\))
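The two questions above (direction, then strength) can be sketched as a small helper. The cutoffs 0.4 and 0.7 come from the rough "maybe" range on this slide and are a rule of thumb, not a standard; the function name `describe_correlation` and its parameters are hypothetical.

```python
def describe_correlation(r, weak_below=0.4, strong_above=0.7):
    """Turn a correlation coefficient into a short verbal description."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    if r == 0:
        return "no linear relationship"
    # Direction comes from the sign of r.
    direction = "positive" if r > 0 else "negative"
    # Strength comes from how far r is from 0, ignoring the sign.
    size = abs(r)
    if size < weak_below:
        strength = "weak"
    elif size <= strong_above:
        strength = "moderate"
    else:
        strength = "strong"
    return f"{strength} {direction} linear relationship"

print(describe_correlation(0.85))
print(describe_correlation(-0.2))
```

A full interpretation should still name the variables in context, for example "as X increases, Y tends to decrease."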

Form of the Relationship?

The correlation tells us nothing about the FORM of the relationship; it measures only how closely the points follow a straight line.
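As a quick illustration of this point, the sketch below uses a made-up data set in which Y is determined exactly by X through a curve (y = x²), yet the correlation comes out as zero. The helper `pearson_r` is an illustrative implementation, not something from the course materials.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped (quadratic) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curve = pearson_r(x, y)
print(r_curve)
```

The relationship here is perfectly predictable but not linear, so r alone would suggest there is nothing to see. A scatterplot reveals the curve immediately.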

Always Plot Your Data

  • You should always plot your data.
  • Summaries don’t tell the whole story…
  • All of these plots have the same mean and SD (for both X and Y) and the same correlation coefficient.

Back to Our Example

  • Correlation between temperature and coffee sales is \(r=-0.741\).
  • How would you interpret this correlation?

Correlation \(\neq\) CAUSATION!

There may be a confounding variable (a variable that affects both the explanatory and response variables) that explains the relationship.
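A small simulation can make the idea concrete. In the sketch below, everything is hypothetical: a simulated confounder `z` drives both `x` and `y`, and neither `x` nor `y` has any effect on the other. The two still end up strongly correlated, and the correlation largely disappears once the confounder's contribution is subtracted out.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the simulation is reproducible

# Simulated (not real) data: z is the confounder.
z = [rng.gauss(0, 1) for _ in range(1000)]
# x and y each depend on z plus their own independent noise.
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)

# Remove the confounder's contribution; only independent noise remains.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_after = pearson_r(x_without_z, y_without_z)

print(round(r_xy, 2), round(r_after, 2))
```

In real data the confounder is usually not known this cleanly, which is why a correlation by itself cannot establish causation.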

Other Fun Facts about Correlation

  • Correlation between X and Y = Correlation between Y and X (symmetric).
  • Correlation has no units.
  • If units change (e.g., kg to lbs), correlation stays the same!
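The symmetry and unit-invariance facts above can be checked numerically. The sketch below uses made-up weights and heights (hypothetical values, not course data) and an illustrative `pearson_r` helper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative measurements.
weight_kg = [55.0, 62.5, 70.0, 81.0, 90.5]
height_cm = [160.0, 165.0, 172.0, 180.0, 178.0]

# Symmetry: swapping the roles of X and Y does not change r.
r_xy = pearson_r(weight_kg, height_cm)
r_yx = pearson_r(height_cm, weight_kg)

# Unit change: convert kilograms to pounds (1 kg is about 2.20462 lb).
weight_lb = [kg * 2.20462 for kg in weight_kg]
r_lb = pearson_r(weight_lb, height_cm)

print(math.isclose(r_xy, r_yx), math.isclose(r_xy, r_lb))
```

Both comparisons hold because r is built from standardized deviations, so rescaling a variable by a positive constant cancels out.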

Introduction to Correlation Activity

Summary

  • We can use a correlation to describe a linear relationship between two quantitative variables.
  • A negative correlation indicates a negative (inverse) relationship.
  • A positive correlation indicates a positive (direct) relationship.
  • Correlation can only be between -1 and +1.
  • Correlations close to -1 or +1 indicate a strong linear relationship.
  • Correlations close to 0 indicate a weak or non-existent linear relationship.