---
title: "HomeworkCommentary"
author: "Murray Chapman"
date: "29/10/2021"
output:
  html_document: default
  pdf_document: default
---

# Week 5 Homework

By Murray Chapman

## Introduction

The following code produces a world map colour-coded by the change in each
country's UN Gender Inequality Index (GII) between 2010 and 2019.

## The code

```{r message=FALSE, warning=FALSE}
library(here) #Load all the necessary packages
library(dplyr)
library(janitor)
library(stringr)
library(sf)
library(tidyverse)
library(tmap)
library(tmaptools)
library(countrycode)
```

Read in the shapefile containing country names and geographical data.
here() removes the need for me to specify the full path, as it builds the path starting from the project folder.

```{r, message=FALSE, warning=FALSE}
CountryData <- st_read(here("World_Countries_(Generalized)")) %>%
  clean_names() #Neatens the column names, removing the capitalization
```
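As a small illustration of the path handling described above, the sketch below builds a path with here(). The "data" folder and "example.csv" file are hypothetical names used only for the demonstration; they are not part of this project.

```r
library(here)

# Hypothetical folder and file names, used only to show how the path is built
example_path <- here("data", "example.csv")

# The result is an absolute path that starts at the project root
print(example_path)
```

Because the path is built from the project root, the same call should work regardless of which folder the script is run from.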

Repeat this process for the data with country names and inequality indices.
'skip = 5' skips the first five rows, which are all header material.
'na = ".."' reads the ".." entries, which this data set uses for missing values, as NA.
This is necessary as we need the values in the index columns to be recognized as numeric.
remove_empty() removes the blank columns in this data set, making it neater.
'quiet = TRUE' stops the function reporting the columns it has removed.
clean_names() also adds an "x" to columns with numeric titles, so they can be referred to by name in later calculations.

```{r, message=FALSE, warning=FALSE}
InequalityData <- read_csv(here("Gender Inequality Index (GII).csv"),
                           skip = 5, na = "..",
                           locale = locale(encoding = "latin1")) %>%
  remove_empty(which = "cols", quiet = TRUE) %>%
  clean_names() %>%
  slice(1:189) %>%
  #Creates a new column containing an iso code to match columns in CountryData
  mutate(iso_code=countrycode(country, origin = 'country.name', destination = 'iso2c'))
```
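To see what the read options described above do, here is a minimal sketch on a made-up miniature of the file. The text, country names and values are all hypothetical and are not taken from the real GII data.

```r
library(readr)
library(janitor)

# Hypothetical miniature file: two lines of header material, then the table
raw_text <- paste(
  "Toy inequality table",
  "Not real data",
  "Country,2010,2019",
  "Aland,0.5,0.3",
  "Bland,..,0.4",
  sep = "\n"
)

# skip = 2 drops the two header lines; na = ".." turns ".." into NA
toy <- read_csv(I(raw_text), skip = 2, na = "..", show_col_types = FALSE)

# clean_names() lower-cases the titles and prefixes numeric ones with "x"
toy <- clean_names(toy)

print(names(toy))
print(toy$x2010)
```

Without 'na = ".."', the 2010 column would contain the text ".." and would be read as character rather than numeric.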

The two dataframes cannot be merged on their country names as they stand.
This is because InequalityData has a blank space " " before each country name.
This block of code fixes this problem.
Because the merge now uses the ISO codes produced by countrycode, this section isn't strictly necessary.

```{r, message=FALSE, warning=FALSE}
CountryList <- dplyr::select(InequalityData, country) #Extracts the country column
CountryListTrimmed <- trimws(CountryList$country, "left") #Removes the leading blank spaces
#Adds the trimmed names back into the dataframe as a new column
CleanInequalityData <- mutate(InequalityData, CountryListTrimmed = CountryListTrimmed)
```
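The sketch below illustrates both points from this section on a hypothetical two-element vector: trimws() strips the leading space, and countrycode appears to resolve the name even when the space is left in, since it matches country names by regular expression.

```r
library(countrycode)

# Hypothetical input: the same country with and without a leading space
padded_names <- c(" Norway", "Norway")

# which = "left" strips leading whitespace only
trimmed_names <- trimws(padded_names, which = "left")

# The padded name should still resolve to the same ISO code
codes <- countrycode(padded_names,
                     origin = "country.name",
                     destination = "iso2c")

print(trimmed_names)
print(codes)
```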

This block merges the two dataframes together using their common ISO country code columns.
The code also produces a new column with the change of index, calculated as "x2010" minus "x2019", so positive values mean inequality fell over the decade.

```{r, message=FALSE, warning=FALSE}
JoinedDataFrame <- merge(CountryData, CleanInequalityData,
                         #The titles of the columns with the common value names
                         by.x = "iso", by.y = "iso_code") %>%
  #Creates the new comparison column
  mutate(InequalityDifference2010s = x2010 - x2019) %>%
  #Reduces the number of columns to just the important ones we want
  select(country.x, iso, geometry, InequalityDifference2010s)
```
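A toy version of the merge, using plain data frames instead of the spatial data, may make the mechanics clearer. All names, codes and values below are hypothetical. It shows two things: rows without a matching code are dropped, and a column present in both inputs (here "country") comes back with ".x" and ".y" suffixes, which is why the code above selects "country.x".

```r
# Hypothetical stand-in for CountryData (no geometry)
toy_countries <- data.frame(
  country = c("Aland", "Bland", "Cland"),
  iso = c("AA", "BB", "CC")
)

# Hypothetical stand-in for the inequality data; "CC" is deliberately missing
toy_index <- data.frame(
  country = c("Aland", "Bland"),
  iso_code = c("AA", "BB"),
  x2010 = c(0.5, 0.4),
  x2019 = c(0.3, 0.3)
)

# merge() keeps only rows whose codes appear in both data frames
toy_joined <- merge(toy_countries, toy_index, by.x = "iso", by.y = "iso_code")

# Positive values mean the index fell between 2010 and 2019
toy_joined$change <- toy_joined$x2010 - toy_joined$x2019

print(toy_joined)
```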

This calculates and prints the mean change in the inequality index across the world.
"na.rm = TRUE" drops the NA values before averaging; without it, mean() would return NA as our output.

```{r, message=FALSE, warning=FALSE}
MeanChange <- mean(JoinedDataFrame$InequalityDifference2010s, na.rm = TRUE)
print(MeanChange)
```
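The effect of "na.rm = TRUE" can be checked on a short, made-up vector (the values are hypothetical):

```r
# Hypothetical changes in index, with one missing value
toy_changes <- c(0.1, NA, 0.3)

# A single NA makes the plain mean NA
mean_with_na <- mean(toy_changes)

# Dropping the NA averages the remaining two values
mean_without_na <- mean(toy_changes, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
```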

Plot the map, colouring each country by its change in index.

```{r, message=FALSE, warning=FALSE}
tm_shape(JoinedDataFrame) +
  tm_polygons(
    col = "InequalityDifference2010s",
    palette="RdYlGn", #Red, Yellow, Green palette
    style="pretty", #"pretty" chooses rounded break points between the colour categories
    n=8, #Requests roughly eight colour categories
    midpoint = 0.1) #The value mapped to the neutral middle colour of the palette
```
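To get a feel for the class breaks, the sketch below calls base R's pretty() on a hypothetical range of values. It assumes that the "pretty" style in tmap behaves like this function, which is why "n" is only a target and the map may end up with more or fewer than eight categories.

```r
# Hypothetical range of index changes, chosen only for illustration
toy_range <- c(-0.05, 0.25)

# Ask for about eight classes; pretty() adjusts the count to keep the breaks round
toy_breaks <- pretty(toy_range, n = 8)

print(toy_breaks)
```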