Intermediate R
Date: 25 May 2023
Time: 9.30am-4.30pm
Instructor: Diego Perez Ruiz
Level: Intermediate
Fee: £60
Please note that this course is open to University of Manchester staff and students only.
To book on to this course, please email cmi@manchester.ac.uk for further details.
Outline
Cleaning data is one of the most important and time-consuming parts of a data analyst's or researcher's work. Most courses teach statistical models or the basic use of statistical software, but few teach students how to clean real-world data efficiently.
This course tackles this important topic by introducing the tidyverse in R, a collection of packages that brings together some of the best tools for data cleaning and visualization. Inspired by the concept of “tidy data”, the tidyverse enables users to import, merge, recode, restructure and plot data very efficiently. Half of the course will focus on data cleaning and the other half on data visualization. The course combines lectures with hands-on practical sessions. In the practical sessions, we will use real-world data so that students get used to the typical challenges they can expect to encounter when working with such data. This will also help prepare them to work independently on their own data.
Course objectives
- To understand the concept of tidy data
- To learn how to efficiently connect multiple commands in R using the pipe operator
- To learn how to efficiently transform variables and prepare them for analysis
- To learn how to work with factor variables
- To learn how to visualize data using R
Skills covered
- Filtering cases and selecting variables
- Working with factors
- Transforming variables
- Merging data
- Using the pipe operator
- Visualizing data in R
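As a rough illustration of how these skills fit together, the sketch below chains filtering, selecting, merging, transforming and factor handling with the pipe operator, then builds a simple plot. It is not taken from the course materials: the data frames `respondents` and `regions`, and all of their variables, are invented for this example, and it assumes the dplyr and ggplot2 packages (both part of the tidyverse) are installed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data, invented purely for illustration
respondents <- data.frame(
  id     = 1:6,
  age    = c(23, 35, 41, 29, 52, 38),
  region = c("north", "south", "north", "east", "south", "east"),
  income = c(21000, 34000, 40000, 27000, 51000, NA),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table used to demonstrate merging
regions <- data.frame(
  region     = c("north", "south", "east"),
  region_lab = c("North", "South", "East"),
  stringsAsFactors = FALSE
)

cleaned <- respondents %>%
  filter(!is.na(income)) %>%              # filter cases: drop missing income
  select(id, age, region, income) %>%     # select the variables of interest
  left_join(regions, by = "region") %>%   # merge in the region labels
  mutate(
    income_k = income / 1000,             # transform: income in thousands
    region   = factor(region,             # convert to a factor with a set level order
                      levels = c("north", "south", "east"))
  )

# Visualize: age against income, coloured by region label
income_plot <- ggplot(cleaned, aes(x = age, y = income_k, colour = region_lab)) +
  geom_point() +
  labs(x = "Age", y = "Income (thousands)", colour = "Region")
```

Printing `income_plot` at the console draws the chart. The merge is done before the factor conversion so that both join keys are plain character vectors at the point of joining.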
Prerequisites
Basic knowledge of the R programming language
Recommended reading
- R for Data Science - Hadley Wickham, Garrett Grolemund - https://r4ds.had.co.nz/