---
title: "HomeworkCommentary"
author: "Murray Chapman"
date: "29/10/2021"
output:
  html_document: default
  pdf_document: default
---
# Week 5 Homework
By Murray Chapman
## Introduction
The following code produces a world map colour-coded by the change in
each country's level of gender inequality between 2010 and 2019,
as measured by the UN Gender Inequality Index (GII).
## The code
```{r message=FALSE, warning=FALSE}
library(here) #Load all the necessary packages
library(dplyr)
library(janitor)
library(stringr)
library(sf)
library(tidyverse)
library(tmap)
library(tmaptools)
library(countrycode)
```
Read in the shapefile containing country names and geographical data.
"here" removes the need for me to specify the full path, as it builds the path starting from the project folder.
```{r, message=FALSE, warning=FALSE}
CountryData <- st_read(here("World_Countries_(Generalized)")) %>%
  clean_names() #Neatens the column names, removing the capitalization
```
Repeat this process for the data with country names and inequality indices.
'skip = 5' skips the first five lines, which are all header material.
'na = ".."' reads the ".." entries in this data set as missing values (NA).
This is necessary as we need the values in the index columns to be recognized as numeric.
'remove_empty()' removes the blank columns in this data set, making it neater.
'quiet = TRUE' stops the function reporting which columns it has removed.
'clean_names()' also adds an "x" to columns with numeric titles, so they can be referred to by name in later calculations.
```{r, message=FALSE, warning=FALSE}
InequalityData <- read_csv(here("Gender Inequality Index (GII).csv"),
                           skip = 5, na = "..",
                           locale = locale(encoding = "latin1")) %>%
  remove_empty(which = "cols", quiet = TRUE) %>%
  clean_names() %>%
  #Keeps only the first 189 rows
  slice(1:189) %>%
  #Creates a new column containing an iso code to match columns in CountryData
  mutate(iso_code = countrycode(country, origin = 'country.name', destination = 'iso2c'))
```
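To see what these arguments do in isolation, here is a minimal sketch on a made-up table. The text in `toy_csv` and the names `toy_csv` and `toy` are invented for illustration and are not taken from the GII file. It assumes readr 2.0 or later, where `I()` marks a string as literal data rather than a file path.

```r
library(readr)
library(janitor)

# Invented CSV text: two lines of header material, then the actual table
toy_csv <- I("Some title\nSome subtitle\nCountry,2010,2019\nNorway,0.07,0.05\nChad,..,0.71\n")

# skip = 2 drops the two header lines; na = ".." turns ".." into NA,
# so the 2010 column is read as numeric instead of character
toy <- read_csv(toy_csv, skip = 2, na = "..", show_col_types = FALSE)

# clean_names() lower-cases "Country" and prefixes the numeric titles with "x"
toy <- clean_names(toy)

names(toy)
toy$x2010
```

Without `na = ".."`, the ".." entry would force the whole 2010 column to be read as text, and the later subtraction would fail.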
As they stand, the two dataframes cannot be merged on their country names.
This is because InequalityData has a blank space " " before each country name.
This block of code fixes this problem.
Because the merge now uses the iso codes from countrycode, this section isn't strictly necessary.
```{r, message=FALSE, warning=FALSE}
CountryList <- dplyr::select(InequalityData, country) #Extracts the country column
CountryListTrimmed <- trimws(CountryList$country, which = "left") #Removes the leading blank spaces
#Adds this fixed vector back into the dataframe as a new column
CleanInequalityData <- mutate(InequalityData, CountryListTrimmed = CountryListTrimmed)
```
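As a small illustration of the trimming step, the sketch below applies base R's `trimws()` to an invented character vector (`toy_names` is a made-up name, not part of the homework data). Only leading whitespace is removed when `which = "left"` is used.

```r
# Invented country names with assorted surrounding spaces
toy_names <- c(" Norway", "  Chad ", "Peru")

# which = "left" strips leading whitespace only; trailing spaces are kept
trimmed <- trimws(toy_names, which = "left")
trimmed
```

Using `which = "both"` (the default) would also strip the trailing space after "Chad".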
This block merges the two dataframes together using their common iso code columns.
The code also produces a new column with the change in index, calculated as "x2010" minus "x2019", so a positive value means the index (and therefore inequality) fell over the decade.
```{r, message=FALSE, warning=FALSE}
JoinedDataFrame <- merge(CountryData, CleanInequalityData,
                         #The titles of the columns with the common value names
                         by.x = "iso", by.y = "iso_code") %>%
  #Creates the new comparison column
  mutate(., InequalityDifference2010s = x2010 - x2019) %>%
  #Reduces the number of columns to just the important ones we want
  select(country.x, iso, geometry, InequalityDifference2010s)
```
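The join logic can be sketched on two small invented tables. Here `toy_shapes` and `toy_index` are hypothetical stand-ins for CountryData and the inequality data, and plain data frames are used instead of sf objects to keep the example short.

```r
library(dplyr)
library(countrycode)

# Stand-in for the shapefile attributes: iso codes plus country names
toy_shapes <- data.frame(iso = c("NO", "TD", "PE"),
                         country = c("Norway", "Chad", "Peru"))

# Stand-in for the index table: names and two invented yearly values
toy_index <- data.frame(country = c("Norway", "Chad"),
                        x2010 = c(0.07, 0.74),
                        x2019 = c(0.05, 0.71)) %>%
  # Derive the iso2c code from the country name, as in the main code
  mutate(iso_code = countrycode(country, origin = "country.name",
                                destination = "iso2c"))

# Join on the differently named key columns, then compute the change
toy_joined <- merge(toy_shapes, toy_index,
                    by.x = "iso", by.y = "iso_code") %>%
  mutate(InequalityDifference2010s = x2010 - x2019)

toy_joined
```

Two things are worth noticing in the result. Both inputs have a "country" column, so `merge()` renames them "country.x" and "country.y", which is why the main code selects `country.x`. Also, `merge()` keeps only matching rows by default, so Peru, which has no index values here, is dropped; base `merge()` accepts `all.x = TRUE` if unmatched rows from the first table should be kept.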
This calculates and prints a mean value for the change in inequality index across the world.
"na.rm = TRUE" removes the NA values, which prevents NA from being returned as our output.
```{r, message=FALSE, warning=FALSE}
MeanChange <- mean(JoinedDataFrame$InequalityDifference2010s, na.rm = TRUE)
print(MeanChange)
```
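The effect of "na.rm = TRUE" is easy to check on a short invented vector (`toy_changes` is a made-up name and the values are not real index changes):

```r
# Invented changes, with one country missing a value
toy_changes <- c(0.02, NA, 0.03)

mean_with_na <- mean(toy_changes)                  # NA, because one input is NA
mean_without_na <- mean(toy_changes, na.rm = TRUE) # mean of 0.02 and 0.03

mean_with_na
mean_without_na
```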
Plot the map, colouring each country by its change in index
```{r, message=FALSE, warning=FALSE}
tm_shape(JoinedDataFrame) +
  tm_polygons(
    col = "InequalityDifference2010s",
    palette = "RdYlGn", #Red, Yellow, Green palette
    style = "pretty", #Pretty is one of the colouring styles
    n = 8, #Aims for eight colour categories
    midpoint = 0.1) #The value mapped to the neutral middle colour of the palette
```
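To get a feel for where the class boundaries on the map come from, the sketch below calls base R's `pretty()` on an invented range of values. This rests on the assumption that tmap's "pretty" style ultimately relies on `pretty()` (via the classInt package); the exact breaks tmap draws may differ, and `toy_range` and `toy_breaks` are made-up names.

```r
# Invented minimum and maximum change in index
toy_range <- c(-0.12, 0.31)

# pretty() picks round, evenly spaced breakpoints covering the range;
# n is a target for the number of intervals, not a guarantee
toy_breaks <- pretty(toy_range, n = 8)

toy_breaks
length(toy_breaks) - 1 # number of intervals actually produced
```

Because `n` is only a target, the legend can end up with slightly more or fewer than eight categories.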