Cathie Marsh Institute for Social Research (CMI)

Introduction to Cluster Analysis

Date: 7 February 2020
Time: 10am – 4.30pm
Instructor: Kitty Lymperopoulou
Level: Introductory
Fee: £195 (£140 for those from educational, government and charitable institutions). 

CMI offers up to five subsidised places at a reduced rate of £60 per course day to research staff and students within Humanities at The University of Manchester. These places are awarded in order of application. 

Humanities PGR students at The University of Manchester can apply for a methods@manchester bursary to help cover their costs. All applications will be considered on a case-by-case basis and applicants will be required to provide a supporting statement from their supervisor. Applications for bursaries must be submitted at least two weeks in advance of the course date; applications submitted after this time will not be accepted. Retrospective applications cannot be made if courses have already taken place or payment has already been made.

Please click here to make a booking. If you are applying for a subsidised place, select the £60 University of Manchester option on the booking form. For queries about methods@manchester bursaries, contact methods@manchester.ac.uk (please note, you must have a confirmed place on the course before requesting a bursary application form). For any other queries about short courses, please contact cmi-shortcourses@manchester.ac.uk.

Please note: this is not guaranteed and is considered on a case-by-case basis. Please contact us for more information.

Outline

The course covers the basic ideas and concepts of cluster analysis, including hierarchical and non-hierarchical clustering methods and their application. Throughout the day, guided practicals in R, using ecological data from the UK Census of Population, will show participants how to choose and transform variables for cluster analysis, how to perform cluster analysis using different clustering techniques, and how to examine cluster results and cluster validity.
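As a rough illustration of the workflow described above, the following minimal sketch standardises a few variables and then applies one hierarchical and one non-hierarchical method in base R. It is not taken from the course materials: the data are simulated, the variable names (pct_unemployed, pct_owner_occ, pct_aged_65up) are hypothetical stand-ins for census indicators, and Ward's method and k-means are only two of several techniques that could be chosen.

```r
# Simulated stand-in for census area data (40 areas, two underlying groups).
# The variable names are hypothetical and not taken from the course data.
set.seed(1)
areas <- data.frame(
  pct_unemployed = c(rnorm(20, 4, 1),  rnorm(20, 10, 1.5)),
  pct_owner_occ  = c(rnorm(20, 75, 5), rnorm(20, 40, 6)),
  pct_aged_65up  = c(rnorm(20, 22, 3), rnorm(20, 12, 3))
)

# Transform: standardise each variable to mean 0 and standard deviation 1
# so that no single variable dominates the distance calculation.
z <- scale(areas)

# Hierarchical clustering: Ward's method on Euclidean distances,
# then cut the tree into two groups.
hc <- hclust(dist(z), method = "ward.D2")
hc_groups <- cutree(hc, k = 2)

# Non-hierarchical clustering: k-means with several random starts.
km <- kmeans(z, centers = 2, nstart = 25)

# Examine results: cross-tabulate the two solutions and summarise
# each k-means cluster by its mean on the original scale.
print(table(hierarchical = hc_groups, kmeans = km$cluster))
print(aggregate(areas, by = list(cluster = km$cluster), FUN = mean))
```

Calling plot(hc) draws the dendrogram, which is one common way to inspect a hierarchical solution before deciding where to cut the tree.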

Course objectives

  • Basic ideas and concepts of hierarchical and non-hierarchical cluster analysis
  • Selection of variables for cluster analysis
  • Choice and application of an appropriate clustering technique in R
  • Determination of an optimal cluster solution
  • Cluster interpretation
  • Cluster validation

Participants will develop an understanding of clustering methods and procedures, carrying out analysis in R. By the end of the course, they will be able to carry out preliminary analysis to select and transform variables for cluster analysis, choose a clustering method, evaluate and choose cluster solutions, and interpret cluster results.
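One way to evaluate and choose between cluster solutions can be sketched as follows, comparing k-means solutions for several values of k. This is an illustrative assumption rather than the method taught on the course: the data are simulated, the total within-cluster sum of squares and the average silhouette width are just two common criteria, and the silhouette calculation assumes the cluster package (normally shipped with R) is installed.

```r
# Simulated data with two underlying groups; a stand-in for real area data.
set.seed(1)
x <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 4), ncol = 2)
)
z <- scale(x)   # standardise before computing distances
d <- dist(z)    # Euclidean distances, reused for the silhouette widths

ks <- 2:6       # candidate numbers of clusters

# Fit one k-means solution per candidate k.
fits <- lapply(ks, function(k) kmeans(z, centers = k, nstart = 25))

# Criterion 1: total within-cluster sum of squares (look for an "elbow").
wss <- sapply(fits, function(f) f$tot.withinss)

# Criterion 2: average silhouette width (higher suggests better separation).
avg_sil <- sapply(fits, function(f) {
  mean(cluster::silhouette(f$cluster, d)[, "sil_width"])
})

print(data.frame(k = ks, wss = wss, avg_sil = avg_sil))
```

Neither criterion settles the choice on its own; in practice the candidate solutions are also judged on whether the resulting clusters can be interpreted.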

The course will give participants practical skills in implementing clustering techniques to identify structures or groupings within data, to better understand the social, economic and spatial dimensions of populations, and to categorise data for further analysis, for example statistical modelling or in-depth case study research.

Prerequisites

Participants should have an understanding of basic data analysis techniques, including correlation and regression analysis. Familiarity with R is desirable but not essential.

Recommended reading

  • B. S. Everitt, S. Landau, M. Leese (2011) Cluster Analysis (5th Edition), Arnold, London.
  • D. Vickers, P. Rees (2007) Creating the National Statistics 2001 output area classification, Journal of the Royal Statistical Society, Series A, 170 (2), pp. 379–403.
  • A. Kassambara (2017) Practical Guide to Cluster Analysis in R: Unsupervised Machine Learning, Edition 1, STHDA.

About the instructors

Apply

Participants will develop an understanding of clustering methods and procedures, carrying out analysis in R. By the end of the course, they will be able to carry out preliminary analysis to select and transform variables for cluster analysis, choose a clustering method, evaluate and choose cluster solutions, interpret clusters and present cluster analysis results. Hierarchical and non-hierarchical cluster analysis will be applied to 2011 Census local area data to produce an area classification that groups areas with similar overall population characteristics into clusters.