Date: 25 May 2023
Instructor: Diego Perez Ruiz
Please note that this course is open to University of Manchester staff and students only.
To book on to this course, please email email@example.com for further details.
Cleaning data is one of the most important and time-consuming parts of working as a data analyst or researcher. Most courses teach statistical models or the basic use of statistical software, but few teach students how to clean real-world data efficiently.
This course tackles this important topic by introducing the tidyverse in R. The tidyverse is a large collection of packages that brings together some of the best tools for data cleaning and visualization in R. Inspired by the concept of “tidy data”, it enables users to import, merge, recode, restructure and plot data very efficiently. Half of the course will focus on data cleaning and the other half on data visualization. The course combines lectures with hands-on practical sessions. In the practical sessions, we will use real-world data so that students get used to the typical challenges they can expect to encounter when working with such data. This will also help prepare them for working independently on their own data.
- To understand the concept of tidy data
- To learn how to efficiently connect multiple commands in R using the pipe operator
- To learn how to efficiently transform variables and prepare for analysis
- To learn how to work with factor variables
- To learn how to visualize data using R
- Filtering cases and selecting variables
- Working with factors
- Transforming variables
- Merging data
- Using the pipe operator
- Visualizing data in R
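
As a rough illustration of how the topics above fit together, here is a minimal sketch in R. The data frames `people` and `regions`, their columns, and the age cut-offs are all hypothetical and invented for this example; they are not course materials. The sketch assumes the dplyr and ggplot2 packages (both part of the tidyverse) are installed.

```r
library(dplyr)    # filter(), select(), mutate(), left_join(), %>%
library(ggplot2)  # ggplot(), aes(), geom_bar()

# Hypothetical toy data, invented for illustration only
people <- data.frame(
  id     = 1:6,
  age    = c(23, 35, 47, 51, 62, 19),
  region = c("north", "south", "north", "east", "south", "east")
)
regions <- data.frame(
  region = c("north", "south", "east"),
  label  = c("North", "South", "East")
)

# The pipe operator passes the result of each step to the next one
cleaned <- people %>%
  filter(age >= 21) %>%                  # filtering cases
  select(id, age, region) %>%            # selecting variables
  mutate(                                # transforming variables
    age_group = if_else(age < 40, "under 40", "40 and over"),
    # working with factors: set an explicit level order
    age_group = factor(age_group, levels = c("under 40", "40 and over"))
  ) %>%
  left_join(regions, by = "region")      # merging data

# Visualizing data: a bar chart of cases per region, split by age group
age_plot <- ggplot(cleaned, aes(x = label, fill = age_group)) +
  geom_bar()
```

Printing `age_plot` draws the chart. Each step returns a new data frame, so intermediate results can be inspected by running the pipeline only up to a given line.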
Basic knowledge of the R programming language
- R for Data Science - Garrett Grolemund, Hadley Wickham - https://r4ds.had.co.nz/