
Getting the data

For this tutorial we'll use data on life expectancy and economic factors for 193 countries over the period 2000-2015. The underlying data comes from the WHO Global Health Observatory and the United Nations, but the version we're using here was downloaded from kaggle.com.

To get started, open an R session and load the data:

data = readr::read_csv(
    "https://www.chg.ox.ac.uk/bioinformatics/training/gms/data/life_expectancy_data.csv"
)

Also don't forget to load our favourite libraries:

library( dplyr )
library( ggplot2 )

(Or just load them all at once with library( tidyverse ).)

Before going further, take a brief look at the data frame using your R skills to make sure you know what's in it.

Question

How many columns are there? How many rows? What are the column names? What do the first few rows of data look like?
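
If you need a reminder of which functions help here, the sketch below shows some base R functions that answer this kind of question. It uses a tiny stand-in data frame called toy, whose column names and values are made up for illustration and are not taken from the life expectancy data. The same calls should work on the data object loaded above.

```r
# A tiny stand-in data frame. The column names and values are invented
# for illustration only; they are NOT the real life expectancy data.
toy = data.frame(
    country = c( "A", "B", "C" ),
    year = c( 2000, 2001, 2002 ),
    value = c( 1.5, 2.5, 3.5 )
)

dim( toy )          # number of rows, then number of columns
nrow( toy )         # number of rows only
ncol( toy )         # number of columns only
colnames( toy )     # the column names
head( toy, n = 2 )  # the first two rows
```

Replace toy with data to try these on the real dataset. Since dplyr is loaded, glimpse( data ) is another option that gives a compact overview of every column.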