Day 03
Introduction to R and R Studio



EPSY 5261 : Introductory Statistical Methods

Reminder

Get started on Lab Assignment #1!

Learning Goals

At the end of this lesson, you should be able to …

  • Explain what R Studio is.
  • Explain why we use it.
  • Carry out a basic workflow in R Studio for data analysis.

Computing is fundamental to the practice of statistics and science more broadly.

What is R?

  • R is a programming language used for statistical computing.
  • It is used to write code to communicate to the computer what you want to do with your data.

What is R Studio?

R Studio is an interface for programming in R.

  • Allows integration of text and code to produce reports.

Why R Studio?

  • It’s free!
  • Open source
  • Point-and-click functionality makes it more user-friendly than R alone.
  • Integration of code and text makes it easy to create reports.
  • Supports reproducibility and open science goals

Reproducibility

In general:

  • Experiments/studies repeated with the same methods and analysis should yield the same results.

In computing:

  • Code and data should be provided alongside detailed documentation of the analysis, so that others can repeat the analysis and obtain the same results.

R Studio & Programming Introduction

R Studio Pane Layout: Pane 1

R Studio Pane Layout: Pane 2

R Studio Pane Layout: Pane 3

R Studio Pane Layout: Pane 4

Protip

You can change the location of the four panes by going to “Tools > Global Options > Pane Layout”.

Get the Course Data

ONE TIME ONLY: Download materials from the schedule page on the website.

  • There is a zipped folder containing the datasets (.csv).
  • Unzip the folder.

Put that folder somewhere easily accessible!

Some suggestions:

  • Desktop
  • “Grad School” folder
  • Cloud storage folder

Workflow (a.k.a. steps to using R Studio)

  1. Load libraries (Be sure to install any libraries you do not have.)
  2. Import data
  3. Ready to perform analysis!

Open R Script File

Now we start computing…

Recall: Functions

  • To perform analyses on your data in R, you will need to use functions.
  • Functions tell R what to do with your data.
  • Remember, functions have this structure:
function_name()
  • They contain one or more arguments, which specify options for the function.
function_name(argument1, argument2, argument3)
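As a small illustration of this structure, the sketch below calls round(), a function that comes with base R. The number being rounded is made up for the example; digits is the name of the argument that controls how many decimal places are kept.

```r
# round() is part of base R, so no package needs to be loaded first.
# First argument: the number to round.
# Second argument (digits): how many decimal places to keep.
rounded_value <- round(3.14159, digits = 2)
rounded_value
# [1] 3.14

# Arguments can also be matched by position instead of by name.
same_value <- round(3.14159, 2)
```

Naming the argument (digits = 2) is usually easier to read than relying on position, especially when a function has several arguments.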

Libraries/Packages

  • Functions live in libraries (a.k.a., packages)
  • Some packages come with your R installation; others need to be installed.
  • Install the {ggformula} and {tidyverse} packages.

Reminder: Use the Install button in the Packages tab in RStudio. You will only need to install the package once!
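If you prefer code to the Install button, the same thing can be done with install.packages(). The sketch below is one possible approach, not a required step: it first checks which of the two course packages are missing and installs only those. The repos value is an assumption (a commonly used CRAN mirror); inside R Studio a mirror is normally already set, so that argument can usually be left out.

```r
# Packages named on the slide above.
needed_pkgs <- c("ggformula", "tidyverse")

# Compare against the packages already installed on this computer.
missing_pkgs <- setdiff(needed_pkgs, rownames(installed.packages()))

# Install only what is missing (requires an internet connection).
# The repos URL is an assumed CRAN mirror, not something the course specifies.
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```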

Load a Package

  • To use the functions from a particular package, we will need to load the package (loading and installing are two different things!).
  • To load the package and its functions, we use the library() function.
  • You will need to load packages every time you open R Studio.

Load Packages (cont.)

General Example: library(libraryName)

  • library() is our function
  • libraryName will specify the library we want to load
  • To load the {ggformula} library we would use:
library(ggformula)

Install vs. Loading

  • You only need to install a package once
  • But, you need to load that package every time you open a new session in RStudio. (After you load it, all the functions in the library can be used.)
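  • Phone app analogy: installing a package is like downloading an app (you do it once); loading a package is like opening the app (you do it every time you want to use it).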
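The difference between installed and loaded can be checked directly. The sketch below uses {splines} purely as a stand-in: it is a package that ships with R, so the example runs without installing anything, but it is not loaded automatically. search() lists the packages that are currently loaded and available for use.

```r
# {splines} is installed with R but is not loaded by default,
# so in a fresh session this is expected to be FALSE.
before_loading <- "package:splines" %in% search()

# Load the package for this session.
library(splines)

# Now the package appears in the list of loaded packages.
after_loading <- "package:splines" %in% search()
after_loading
# [1] TRUE
```

Restarting R and running the first line again would show the package is no longer loaded, even though it is still installed.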

Importing Data

  • Click Import Dataset > From Text (readr)
  • Navigate to your dataset by clicking Browse
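Behind the scenes, the Import Dataset menu runs the read_csv() function from the {readr} package. A minimal sketch of the equivalent code is shown below, assuming {readr} is installed (it is installed along with {tidyverse}). The file and its values are made up and written to a temporary folder only so the example runs on any computer; in practice the path would point to one of the unzipped course .csv files.

```r
library(readr)

# Made-up example data, written to a temporary file for illustration only.
example_path <- file.path(tempdir(), "example_data.csv")
writeLines(c("id,score", "1,10", "2,12", "3,9"), example_path)

# Import the CSV file and store it under a name of your choosing.
example_data <- read_csv(example_path)

# Look at the first rows and count the rows to confirm the import worked.
head(example_data)
nrow(example_data)
```

Saving this code in your script file means the import can be repeated exactly, without clicking through the menu again.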

Writing Code

Commenting Code

Executing Code In R Studio

Introduction to R Studio Activity

Summary

  • Load libraries (Be sure to install any libraries you do not have.)
  • Import data
  • Ready to perform analysis!