Introductory Data Science using R
The aims of this short course are the following:
- To become familiar with the data science framework.
- To become familiar with the R programming language and the Integrated Development Environment (IDE) RStudio.
- To be able to import, summarise and visualise data in RStudio.
- To complete and report on an exploratory data analysis of the Kiva.org global loans dataset.
The course is delivered as lecture-style presentations with hands-on, interactive R sessions, so please bring your laptop with you (with RStudio pre-installed).
Four 2-hour lectures are planned (see below for details of each lecture).
There is one assignment for this course, and there are mini exercises at the end of lectures.
This page will be updated as and when the material becomes ready.
Please download R and RStudio Desktop before attending the lectures.
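One way to confirm that the installation works is to run a short script in the RStudio console. The following is a minimal sketch, not course material: it uses only base R, and it assumes the built-in `mtcars` dataset as a stand-in for real data (the names `csv_path`, `cars` and `mpg_by_cyl` are illustrative). It touches the three steps named in the course aims: import, summarise and visualise.

```r
# Sanity check for a fresh R / RStudio installation (base R only).
# Assumption: `mtcars` (shipped with R) stands in for course data.

# Import: write the built-in data to a temporary CSV, then read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)
cars <- read.csv(csv_path)

# Summarise: overall summary of mpg, then mean mpg by cylinder count.
print(summary(cars$mpg))
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)
print(mpg_by_cyl)

# Visualise: histogram of miles per gallon (shown in the Plots pane).
hist(cars$mpg, main = "Miles per gallon", xlab = "mpg")
```

If the script prints two summaries and a histogram appears in the Plots pane, R and RStudio are ready for the hands-on sessions.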
Lectures
Lecture 1: Introduction to the Data Science Framework
Resources:
- Machine Learning and Data Science talk by Neil Lawrence
- R for Data Science
- Data Science Wiki page
- Inference vs Prediction @ Data Science Blog
- Inference vs Prediction @ Stack Exchange
- Bias-Variance tradeoff @ Towards Data Science
- Bias-Variance tradeoff @ Scott Fortmann-Roe
Lecture 2: Introduction to R
Resources:
Lecture 3: Data Science with R Pt. 1
Walkthrough of Chapters 2–3 of the R for Data Science book.
Lecture 4: Data Science with R Pt. 2
Walkthrough of Chapters 4–8 of the R for Data Science book.
Assignment
CSTRAD July 2019 class leaders:
- Team A: Vincent
- Team B: Wen Jei
Due on 2nd August 2019 at 12.00pm.