Libraries for current session
library(XML)
library(rvest)
library(tidyverse)
library(sf)
library(gplots)
library(lubridate)
library(ggspatial)
library(ggmap)
library(gstat)
library(knitr)
library(ggthemes)
library(tmap)
library(spatstat)
library(rgeos)
In this session we look at how to interpolate data spatially. Data are very often collected at discrete locations, while the phenomenon itself is continuous (air temperature, air pressure, etc.). Visualizing the data only at the observation sites may therefore not be enough to get a full picture of the spatial patterns.
Download the latest weather observation data from the Estonian
Weather Service.
In previous sessions you have seen how to download data in
various formats (csv, Excel, shp,
sdmx). Another very common format is XML. You can read
about the reasons here: Why should I use
XML?
XML is not an easily human-readable
format: it is a compromise between human-readability and
machine-readability. Learning to work with this format may be the first step
from “ordinary person” to data scientist.
In R, importing XML is simple, and you need to know very little about the XML structure. The weather data can be downloaded and converted to a data frame with four lines of code:
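# record when the data were downloaded (used later in the map subtitle):
weatherDownTime <- Sys.time()
# download.file() returns only a status code, so its result is not stored:
download.file("http://www.ilmateenistus.ee/ilma_andmed/xml/observations.php", "weather.xml")
xmlfile <- xmlParse("weather.xml")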
# convert XML to data frame:
weather <- xmlToDataFrame(xmlfile)
# check the result:
glimpse(weather)
## Rows: 155
## Columns: 17
## $ name <chr> "Kuressaare linn", "Tallinn-Harku", "Pakri", "Kunda"…
## $ wmocode <chr> "", "26038", "26029", "26045", "26046", "26058", "26…
## $ longitude <chr> "22.48944444411111", "24.602891666624284", "24.04008…
## $ latitude <chr> "58.26416666666667", "59.398122222355134", "59.38950…
## $ phenomenon <chr> "", "Overcast", "", "", "", "", "", "", "", "Overcas…
## $ visibility <chr> "", "35.0", "31.0", "45.0", "20.0", "20.0", "20.0", …
## $ precipitations <chr> "", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ airpressure <chr> "", "1007.3", "1006.5", "1008.8", "1009.7", "1010.1"…
## $ relativehumidity <chr> "81", "81", "84", "81", "87", "84", "89", "81", "83"…
## $ airtemperature <chr> "7.6", "7.7", "7.7", "7.1", "5.9", "6.3", "5.3", "5.…
## $ winddirection <chr> "", "187", "169", "199", "197", "188", "209", "", "1…
## $ windspeed <chr> "", "4.9", "4.5", "4.9", "4.7", "5.2", "5.6", "", "3…
## $ windspeedmax <chr> "", "7.7", "8.6", "7.1", "6.9", "7.8", "7.9", "", "6…
## $ waterlevel <chr> "", "", "", "", "", "", "", "", "64", "", "", "", ""…
## $ waterlevel_eh2000 <chr> "", "", "", "37", "", "", "", "", "", "", "", "", ""…
## $ watertemperature <chr> "", "", "", "7.7", "", "", "", "", "", "", "", "", "…
## $ uvindex <chr> "", "0.0", "", "", "", "", "", "", "", "", "", "", "…
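To see why every column arrives as <chr>, it can help to run the same conversion on a tiny document. The sketch below assumes the feed has one root element holding one child element per station. The tag names `observations` and `station` are assumptions made for illustration, and only three of the fields shown above are included.

```r
library(XML)

# A small stand-in for the downloaded file; the layout (root element with one
# child per station) and the tag names are assumed, not taken from the feed.
xml_text <- '<observations>
  <station><name>Tallinn-Harku</name><wmocode>26038</wmocode><airtemperature>7.7</airtemperature></station>
  <station><name>Pakri</name><wmocode>26029</wmocode><airtemperature>7.7</airtemperature></station>
</observations>'

# asText = TRUE tells xmlParse() that the input is XML content, not a file name
doc <- xmlParse(xml_text, asText = TRUE)

# each child of the root becomes a row, each grandchild element a column
toy <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
str(toy)
```

XML element content is plain text, so numbers come out as character strings and need an explicit conversion before they can be used in calculations.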
Tidy up the table:
# convert columns from character to numeric:
weather <- weather %>%
  mutate(longitude = as.numeric(as.character(longitude)),
         latitude = as.numeric(as.character(latitude)),
         airtemperature = as.numeric(as.character(airtemperature)))
glimpse(weather)
## Rows: 155
## Columns: 17
## $ name <chr> "Kuressaare linn", "Tallinn-Harku", "Pakri", "Kunda"…
## $ wmocode <chr> "", "26038", "26029", "26045", "26046", "26058", "26…
## $ longitude <dbl> 22.48944, 24.60289, 24.04008, 26.54140, 27.39827, 28…
## $ latitude <dbl> 58.26417, 59.39812, 59.38950, 59.52141, 59.32902, 59…
## $ phenomenon <chr> "", "Overcast", "", "", "", "", "", "", "", "Overcas…
## $ visibility <chr> "", "35.0", "31.0", "45.0", "20.0", "20.0", "20.0", …
## $ precipitations <chr> "", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ airpressure <chr> "", "1007.3", "1006.5", "1008.8", "1009.7", "1010.1"…
## $ relativehumidity <chr> "81", "81", "84", "81", "87", "84", "89", "81", "83"…
## $ airtemperature <dbl> 7.6, 7.7, 7.7, 7.1, 5.9, 6.3, 5.3, 5.9, 5.7, 5.7, 4.…
## $ winddirection <chr> "", "187", "169", "199", "197", "188", "209", "", "1…
## $ windspeed <chr> "", "4.9", "4.5", "4.9", "4.7", "5.2", "5.6", "", "3…
## $ windspeedmax <chr> "", "7.7", "8.6", "7.1", "6.9", "7.8", "7.9", "", "6…
## $ waterlevel <chr> "", "", "", "", "", "", "", "", "64", "", "", "", ""…
## $ waterlevel_eh2000 <chr> "", "", "", "37", "", "", "", "", "", "", "", "", ""…
## $ watertemperature <chr> "", "", "", "7.7", "", "", "", "", "", "", "", "", "…
## $ uvindex <chr> "", "0.0", "", "", "", "", "", "", "", "", "", "", "…
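Only three columns are converted above; the remaining measurements are still text. If more of them are needed later, one possible approach is to convert a whole set of columns in a single step with `across()`. This is a minimal sketch on a two-row stand-in table (the `toy` tibble and the `num_cols` vector are hypothetical names; the values are the first two stations from the output above).

```r
library(dplyr)

# stand-in for the weather table: measurements stored as text, "" where missing
toy <- tibble(
  name           = c("Kuressaare linn", "Tallinn-Harku"),
  airtemperature = c("7.6", "7.7"),
  windspeed      = c("", "4.9")
)

# columns that should hold numbers
num_cols <- c("airtemperature", "windspeed")

# apply as.numeric() to every listed column; empty strings become NA
toy_num <- toy %>%
  mutate(across(all_of(num_cols), as.numeric))

glimpse(toy_num)
```

Missing measurements end up as `NA`, which most summary and plotting functions can skip or flag explicitly.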
Map it:
ggplot() +
  geom_point(data = weather, aes(x = longitude, y = latitude, colour = airtemperature)) +
  scale_color_gradientn(colours = topo.colors(20)) +
  labs(title = "Air temperature in Estonia", subtitle = weatherDownTime) +
  coord_fixed()
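Note that `coord_fixed()` defaults to a 1:1 ratio between longitude and latitude units. At Estonia's latitude a degree of longitude covers roughly half the ground distance of a degree of latitude, so the map appears stretched in the east-west direction. One possible correction, sketched here on the first three stations from the table above (the `stations` data frame and the `lat_ratio` variable are hypothetical names), is to set the aspect ratio from the mean latitude.

```r
library(ggplot2)

# first three stations from the output above, coordinates rounded
stations <- data.frame(
  longitude      = c(22.489, 24.603, 24.040),
  latitude       = c(58.264, 59.398, 59.390),
  airtemperature = c(7.6, 7.7, 7.7)
)

# one degree of longitude shrinks by cos(latitude) relative to one degree of
# latitude, so the y/x ratio should be about 1 / cos(mean latitude)
lat_ratio <- 1 / cos(mean(stations$latitude) * pi / 180)

p <- ggplot(stations, aes(x = longitude, y = latitude, colour = airtemperature)) +
  geom_point() +
  coord_fixed(ratio = lat_ratio)
p
```

`coord_quickmap()` from ggplot2 applies a similar approximation automatically and could be used in place of `coord_fixed()`.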