Last updated: 2018-09-05
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 5b425e1 | annakrystalli | 2018-09-05 | workflowr::wflow_publish(c(“analysis/raster.Rmd”)) |
html | c91966e | annakrystalli | 2018-09-05 | Build site. |
Rmd | 9dd6ff7 | annakrystalli | 2018-09-05 | workflowr::wflow_publish(c(“analysis/raster.Rmd”)) |
html | 0eb5925 | annakrystalli | 2018-09-05 | Build site. |
Rmd | a7080b8 | annakrystalli | 2018-09-05 | workflowr::wflow_publish(c(“analysis/raster.Rmd”)) |
html | 88e2cb7 | annakrystalli | 2018-09-05 | Build site. |
html | 42bd442 | annakrystalli | 2018-09-04 | Build site. |
Rmd | 8f5c11c | annakrystalli | 2018-09-04 | workflowr::wflow_publish(“analysis/raster.Rmd”) |
html | f94df5c | annakrystalli | 2018-09-04 | Build site. |
Rmd | ae1fd63 | annakrystalli | 2018-09-04 | workflowr::wflow_publish(c(“analysis/raster.Rmd”)) |
Open a new R Notebook to work in.
File > New File > R Notebook
Name it (e.g. Rasters) and save it
Load the libraries we’ll be using for this section of the workshop
library(raster)
library(rasterVis)
library(sf)
library(dplyr)
library(ggplot2)
Raster data are gridded data. Each grid cell is represented by a pixel in the raster, and each pixel represents an area of space on the Earth’s surface.
3 core metadata elements:
- Coordinate Reference System (CRS)
- extent
- resolution
See “Raster resolution and extent”
The spatial resolution of a raster refers to the size of each cell, expressed in the units of the CRS (e.g. meters, or degrees for longitude/latitude data). This size in turn relates to the area on the ground that the pixel represents.
The higher the resolution for the same extent, the crisper the image (and the larger the file size).
\(x_{max} = x_{min} + (resolution_{x} \times n_{pixels,x})\)
Unlike vector data, the raster data model stores the coordinate of the grid cell only indirectly: There is a less clear distinction between attribute and spatial information in raster data. Say, we are in the 3rd row and the 4th column of a raster matrix. To derive the corresponding coordinate, we have to move from the origin four cells in x-direction (one per column) and three cells in y-direction (one per row), with the cell resolution defining the distance for each x- and y-step.
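As a rough sketch of that arithmetic, the centre coordinate of a cell can be derived from the raster origin, the resolution, and the cell’s row and column. The helper cell_centre() is hypothetical and written only for illustration. It assumes the origin is the top-left corner, that rows are counted from the top and columns from the left, and it uses an assumed global extent split into a 4 x 4 grid.

```r
# Hypothetical helper (not part of the raster package): centre coordinate of a
# cell, given the raster origin (top-left corner) and the cell resolution.
cell_centre <- function(row, col, xmin, ymax, res_x, res_y) {
  c(x = xmin + (col - 0.5) * res_x,   # columns step along x, left to right
    y = ymax - (row - 0.5) * res_y)   # rows step along y, top to bottom
}

# Assumed example: global extent split into a 4 x 4 grid (resolution 90 x 45)
centre <- cell_centre(row = 3, col = 4, xmin = -180, ymax = 90,
                      res_x = 90, res_y = 45)
centre
```

For a real RasterLayer, raster::xFromCol() and raster::yFromRow() return cell-centre coordinates in the same way.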
Rasters can be thought of as matrices with additional spatial metadata (CRS, extent and resolution) attached.
myRaster1 <- raster(nrow=4, ncol=4)
myRaster1
class : RasterLayer
dimensions : 4, 4, 16 (nrow, ncol, ncell)
resolution : 90, 45 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
Let’s have a look at the printed summary. Note that when creating a raster, anything not specified falls back to defaults:
- CRS: +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
- extent: -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
- resolution: 90, 45 (x, y), which follows from dividing the default extent into 4 columns and 4 rows
Q: What’s been defined?
Let’s give it some values
myRaster1[] <- 1:16
plot(myRaster1)
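For comparison, here is a minimal sketch of defining the extent explicitly rather than relying on the defaults. The object name myRaster2 and the extent values are assumptions chosen for illustration; the resolution is then derived from the extent and the dimensions.

```r
library(raster)

# Assumed toy values: a 4 x 4 raster covering a 4 x 4 unit square
myRaster2 <- raster(nrows = 4, ncols = 4,
                    xmn = 0, xmx = 4, ymn = 0, ymx = 4)

res(myRaster2)     # resolution is derived from extent / dimensions
extent(myRaster2)  # the extent we supplied, not the global default
```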
Bioclimatic variables are derived from the monthly temperature and rainfall values in order to generate more biologically meaningful variables. The bioclimatic variables represent annual trends, seasonality, and extreme or limiting environmental factors
I’ve selected a few of the variables used in the original paper to fit a Species Distribution Model.
The data is in the data/raster/mx-worldclim_30s folder.
wc_files <- list.files(here::here("data", "raster", "mx-worldclim_30s"),
                       full.names = TRUE)
wc_files
[1] "/Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_15.tif"
[2] "/Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_4.tif"
[3] "/Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_5.tif"
[4] "/Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_6.tif"
These files are in GeoTIFF format, a public domain metadata standard which allows georeferencing information to be embedded within a TIFF file.
Let’s start with a single raster file, mx-bio_5.tif, which corresponds to bioclimatic variable 5: Max Temperature of Warmest Month.
bio5 <- raster(wc_files[3])
bio5
class : RasterLayer
dimensions : 2181, 3638, 7934478 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.125, -86.80833, 14.54167, 32.71667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_5.tif
names : mx.bio_5
values : 54, 427 (min, max)
This creates a RasterLayer object.
By having a look at the summary of the raster file when we simply print the object, straight away it looks like something funny is going on. It’s showing a range of values between 54 and 427. Now, Mexico can get hot…but not that hot! By checking the documentation for the WorldClim data, we can see that the data is stored as degrees C x 10. This is for storage efficiency (files are much smaller if numbers can be stored as integers) but it means we need to transform the data back to degrees C.
Luckily we can easily manipulate rasters, just like any other matrix in R.
bio5 <- bio5/10
bio5
class : RasterLayer
dimensions : 2181, 3638, 7934478 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.125, -86.80833, 14.54167, 32.71667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : mx.bio_5
values : 5.4, 42.7 (min, max)
That’s better!
The raster package has native plotting functions which, again, are fine for a quick check of the data.
plot(bio5)
Package rasterVis offers much nicer options for plotting raster data, including default colour palettes that are pretty, represent the data better, are easier to read for those with colour blindness, and print well in grey scale.
levelplot(bio5)
For numeric data it plots the distribution of the data along each axis in the plot margins. We can suppress that default behaviour by using argument margin=FALSE.
levelplot(bio5, margin=FALSE)
Now this is great for individual layers, but if we have multiple layers to work with, it can be much more efficient to stack them into a RasterStack.
A RasterStack is a collection of RasterLayer objects with the same spatial extent and resolution. A RasterStack can be created from RasterLayer objects, or from raster files, or both.
We can read and stack raster files in one go using function raster::stack()! And this is where the list of file names comes in handy.
st <- stack(wc_files)
st
class : RasterStack
dimensions : 2181, 3638, 7934478, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.125, -86.80833, 14.54167, 32.71667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
names : mx.bio_15, mx.bio_4, mx.bio_5, mx.bio_6
min values : 10, 199, 54, -85
max values : 140, 8136, 427, 218
We can still extract individual layers using function raster::subset().
subset(st, "mx.bio_5")
class : RasterLayer
dimensions : 2181, 3638, 7934478 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.125, -86.80833, 14.54167, 32.71667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_5.tif
names : mx.bio_5
values : 54, 427 (min, max)
Because a RasterStack behaves much like a list, we can also subset it as we would any other list in R, either by name or by position.
st[["mx.bio_5"]]
class : RasterLayer
dimensions : 2181, 3638, 7934478 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.125, -86.80833, 14.54167, 32.71667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_5.tif
names : mx.bio_5
values : 54, 427 (min, max)
st[[3]]
class : RasterLayer
dimensions : 2181, 3638, 7934478 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.125, -86.80833, 14.54167, 32.71667 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/mx-worldclim_30s/mx-bio_5.tif
names : mx.bio_5
values : 54, 427 (min, max)
Note that we are back to having incorrect temperature values. We will deal with the layers that need correcting a bit later so just ignore that for now.
Both the native plot method and rasterVis::levelplot can handle RasterStacks.
plot(st)
levelplot(st)
For a quick scan of a RasterStack, plot() is more useful because levelplot() plots all panels on the same scale. There are ways of plotting with separate scales, which we will link to later.
The land cover data are the original data resampled onto a 30 arc-second grid, sourced from DIVA-GIS. DIVA-GIS is a free computer program for mapping and geographic data analysis (a geographic information system, GIS), which also provides free global spatial data.
lc_files <- list.files(here::here("data", "raster", "MEX_msk_cov"),
                       full.names = TRUE)
lc <- raster(lc_files[1])
lc
class : RasterLayer
dimensions : 2208, 3696, 8160768 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.4, -86.6, 14.4, 32.8 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/MEX_msk_cov/MEX_msk_cov.grd
names : MEX_msk_cov
values : 1, 22 (min, max)
Let’s plot this again to have a look at it.
levelplot(lc)
This raster contains categorical data, so the scales used, as well as the inclusion of distributions along the margins, do not seem appropriate. Such data can be better defined using the raster::ratify function.
lc <- ratify(lc)
lc
class : RasterLayer
dimensions : 2208, 3696, 8160768 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.4, -86.6, 14.4, 32.8 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/MEX_msk_cov/MEX_msk_cov.grd
names : MEX_msk_cov
values : 1, 22 (min, max)
attributes :
ID
from: 1
to : 22
Now we see that, alongside values: 1, 22 (min, max), we have an attributes: field containing a table summarising the levels, with from: for the first and to: for the last entry. The actual levels are stored in what is known as a “Raster Attribute Table” (RAT). This can be accessed through the levels() function.
levels(lc)
[[1]]
ID
1 1
2 2
3 4
4 6
5 9
6 11
7 12
8 13
9 14
10 15
11 16
12 20
13 22
Let’s try and plot again.
levelplot(lc)
Error in `[.data.frame`(rat, , att): undefined columns selected
This time, plotting fails. This is because there are no descriptions associated with the levels.
We can define the levels with more informative descriptions. As I forgot to save them as .csv as part of the workshop materials, here is a snippet of code that can be copied and pasted to create a data.frame of factor levels and their associated descriptions
(or go to http://bit.ly/lc_levels, click on raw and copy the code snippet from there).
lc_levels <- structure(list(level = 1:22, descr = c("Tree Cover, broadleaved, evergreen",
"Tree Cover, broadleaved, deciduous, closed", "Tree Cover, broadleaved, deciduous, open",
"Tree Cover, needle-leaved, evergreen", "Tree Cover, needle-leaved, deciduous",
"Tree Cover, mixed leaf type", "Tree Cover, regularly flooded, fresh water",
"Tree Cover, regularly flooded, saline water", "Mosaic: Tree cover / Other natural vegetation",
"Tree Cover, burnt", "Shrub Cover, closed-open, evergreen", "Shrub Cover, closed-open, deciduous",
"Herbaceous Cover, closed-open", "Sparse Herbaceous or sparse Shrub Cover",
"Regularly flooded Shrub and/or Herbaceous Cover", "Cultivated and managed areas",
"Mosaic: Cropland / Tree Cover / Other natural vegetation", "Mosaic: Cropland / Shrub or Grass Cover",
"Bare Areas", "Water Bodies", "Snow and Ice", "Artificial surfaces and associated areas"
)), class = "data.frame", .Names = c("level", "descr"), row.names = c(NA,
-22L))
rat <- levels(lc)[[1]]
rat <- rat %>% left_join(lc_levels, by = c("ID" = "level"))
levels(lc) <- rat
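The same pattern can be tried on a small toy raster, which makes it easier to see what ends up in the RAT. This is only a sketch: the values, the toy_levels lookup table and its descriptions are made up for illustration, and base merge() is used here in place of dplyr::left_join() to keep the example dependent on raster alone.

```r
library(raster)

# Toy categorical raster with three distinct class values (assumed values)
toy <- raster(nrows = 2, ncols = 2)
toy[] <- c(1, 2, 2, 4)
toy <- ratify(toy)

# Hypothetical lookup table; class 3 is described but absent from the raster
toy_levels <- data.frame(level = 1:4,
                         descr = c("forest", "shrub", "grass", "water"),
                         stringsAsFactors = FALSE)

# The RAT holds one row per value present in the raster (column ID)
toy_rat <- levels(toy)[[1]]
toy_rat <- merge(toy_rat, toy_levels, by.x = "ID", by.y = "level", all.x = TRUE)
levels(toy) <- toy_rat

levels(toy)[[1]]
```

Only the classes that actually occur in the raster appear in the RAT, which is why the join starts from the RAT rather than from the full lookup table.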
Let’s have a look at our land cover raster
lc
class : RasterLayer
dimensions : 2208, 3696, 8160768 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -117.4, -86.6, 14.4, 32.8 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84
data source : /Users/Anna/Documents/workflows/workshops/intro-r-gis/data/raster/MEX_msk_cov/MEX_msk_cov.grd
names : MEX_msk_cov
values : 1, 22 (min, max)
attributes :
ID descr
from: 1 Tree Cover, broadleaved, evergreen
to : 22 Artificial surfaces and associated areas
Let’s plot again
levelplot(lc)
Success!
And seeing as we’re dealing with primarily vegetation, let’s create a new map theme (colour palette) using function rasterVis::rasterTheme and colour brewer palette Yellow & Greens (“YlGn”).
mapTheme <- rasterTheme(region = rev(brewer.pal(9,"YlGn")))
levelplot(lc, par.settings = mapTheme)
The function takes a vector of colours to produce a colour gradient that is then mapped to raster values. It has a number of in-built colour vectors to choose from and you can even provide your own custom vectors of functions (which is what we would probably want to do in our case to make the colour more reflective of the vegetation type).
rasterLayers
stack(st, lc)
Error in compareRaster(x): different extent
This doesn’t work, notifying us that there is a problem with mismatching extents. We don’t really need the whole extent of the data anyway, so let’s try cropping everything to the same extent, that of the study area bounding box we defined in our vector workflow.
So let’s load the molecular data that we have converted to an sf using function sf::read_sf and recreate a bounding box.
mol_sf <- read_sf(here::here("data", "sf", "salamander.geojson"))
study_bbox <- mol_sf %>% st_bbox() %>% st_as_sfc()
This bounding box is really tight around our data points. To ensure our raster data contain the locations of all our data points, let’s give this extraction bounding box some space around our points using function sf::st_buffer.
Looking at the help file for this function using ?st_buffer gives us information on a whole suite of useful functions to perform geometric operations on simple feature geometry sets.
extract_bbox <- study_bbox %>% st_buffer(dist = 1)
Warning in st_buffer.sfc(., dist = 1): st_buffer does not correctly buffer
longitude/latitude data
extract_bbox
Geometry set for 1 feature
geometry type: POLYGON
dimension: XY
bbox: xmin: -100.8481 ymin: 17.94194 xmax: -96.09056 ymax: 20.63083
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
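The warning above arises because, in this session, dist = 1 is interpreted in the units of the CRS, i.e. degrees, which do not correspond to a constant distance on the ground. One way to avoid it, sketched below under stated assumptions, is to project to a metric CRS, buffer in metres, and transform back. The toy bounding box, the choice of UTM zone 14N (EPSG:32614) and the 100 km distance are all assumptions made for illustration; newer versions of sf may also treat dist differently for longitude/latitude data, so it is worth checking ?st_buffer for the version you have installed.

```r
library(sf)

# Assumed toy bounding box in longitude/latitude (EPSG:4326)
toy_bbox <- st_as_sfc(st_bbox(c(xmin = -99.8, ymin = 18.9,
                                xmax = -97.1, ymax = 19.6),
                              crs = st_crs(4326)))

# Project to a metric CRS (assumption: UTM zone 14N), buffer by 100 km,
# then return to longitude/latitude
toy_utm      <- st_transform(toy_bbox, 32614)
toy_buffered <- st_transform(st_buffer(toy_utm, dist = 100000), 4326)

st_bbox(toy_buffered)
```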
To crop a raster we use function raster::crop, which returns a geographic subset of the raster as specified either by an Extent object or an object from which an extent object can be extracted/created.
In our case, we’ll use the extract_bbox sf we just created. So let’s try and crop lc first.
crop(lc, extract_bbox)
Error in .local(x, y, ...): Cannot get an Extent object from argument y
Ooops! That throws an error! That’s because of current sf and raster compatibility issues. All we need to do though is convert our sf to an sp spatial class object, which raster is designed to work with. We can do this with function sf::as_Spatial.
sp_extract_bbox <- as_Spatial(extract_bbox)
sp_extract_bbox
class : SpatialPolygons
features : 1
extent : -100.8481, -96.09056, 17.94194, 20.63083 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
Let’s try now.
crop(lc, sp_extract_bbox)
class : RasterLayer
dimensions : 323, 571, 184433 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -100.85, -96.09167, 17.94167, 20.63333 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84
data source : in memory
names : MEX_msk_cov
values : 1, 22 (min, max)
attributes :
ID descr
from: 1 Tree Cover, broadleaved, evergreen
to : 22 Artificial surfaces and associated areas
Success! This works.
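Cropping with an Extent object directly works the same way. The sketch below uses a toy raster and assumed extent values, purely to show how the cropped dimensions follow from the extent.

```r
library(raster)

# Toy 10 x 10 raster with 1 x 1 cells (assumed values)
toy <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
toy[] <- 1:100

# Extent arguments are given in the order xmin, xmax, ymin, ymax
crop_extent <- extent(2, 6, 3, 7)
toy_crop <- crop(toy, crop_extent)

dim(toy_crop)
extent(toy_crop)
```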
So let’s stack and crop all in one go:
full_stack <- stack(
crop(lc, sp_extract_bbox),
crop(st, sp_extract_bbox)
)
full_stack
class : RasterStack
dimensions : 323, 571, 184433, 5 (nrow, ncol, ncell, nlayers)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -100.85, -96.09167, 17.94167, 20.63333 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84
names : MEX_msk_cov, mx.bio_15, mx.bio_4, mx.bio_5, mx.bio_6
min values : 1, 41, 655, 54, -83
max values : 22, 117, 3387, 411, 180
Awesome! We’ve now got all our initial raster files in a single RasterStack 🎉. We’re not done though. There are two things we need to address for our final RasterStack.
… (mx.bio_5 and mx.bio_6) for our SDM rather than the extremes. So let’s try and address this by creating a new RasterStack from the layers in our full_stack. We use function raster::stack() again.
env_stack <- stack(
(full_stack[["mx.bio_5"]] - full_stack[["mx.bio_6"]])/10,
full_stack[["mx.bio_4"]],
full_stack[["mx.bio_15"]],
full_stack[["MEX_msk_cov"]])
env_stack
class : RasterStack
dimensions : 323, 571, 184433, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -100.85, -96.09167, 17.94167, 20.63333 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
names : layer, mx.bio_4, mx.bio_15, MEX_msk_cov
min values : 13.7, 655.0, 41.0, 1.0
max values : 27.2, 3387.0, 117.0, 22.0
Let’s give our layers better names. This is easily achieved with function names()
names(env_stack) <- c("temp_range","temp_seasonality",
"prec_seasonality", "land_cover")
env_stack
class : RasterStack
dimensions : 323, 571, 184433, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -100.85, -96.09167, 17.94167, 20.63333 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
names : temp_range, temp_seasonality, prec_seasonality, land_cover
min values : 13.7, 655.0, 41.0, 1.0
max values : 27.2, 3387.0, 117.0, 22.0
… RasterLayer of rough mean temperature (rough because it would be much better to use more data at higher temporal resolution, e.g. at least monthly, rather than extremes).
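One possible reading of “rough mean temperature” is the midpoint between the maximum and minimum temperature layers, converted back to degrees C. That interpretation is an assumption, and the sketch below uses two made-up toy layers (tmax and tmin, stored as degrees C x 10) rather than the workshop data; with the real data the inputs would presumably be full_stack[["mx.bio_5"]] and full_stack[["mx.bio_6"]].

```r
library(raster)

# Toy layers standing in for max and min temperature, stored as degrees C x 10
tmax <- raster(nrows = 2, ncols = 2)
tmax[] <- c(300, 320, 280, 310)
tmin <- raster(nrows = 2, ncols = 2)
tmin[] <- c(100, 120, 80, 90)

# Midpoint of the two extremes, rescaled to degrees C
rough_mean_temp <- ((tmax + tmin) / 2) / 10
values(rough_mean_temp)
```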
rasterStack
Let’s try and plot our new environmental stack.
levelplot(env_stack)
Error in .checkLevels(levs[[j]], value[[j]]): new raster attributes (factor values) should be in a data.frame (inside a list)
This doesn’t work now because we are trying to mix displaying factor and numeric data.
We can still extract and plot individual layers though.
levelplot(env_stack, layers = "temp_seasonality")
For more details on how to plot several RasterLayers with different legends (including different data types), see the rasterVis package FAQs. This would also solve the problem we had earlier with plotting multiple layers using the same scale.
A number of drivers are available to write raster data to a number of gridded geospatial file types:
raster::writeFormats() %>% knitr::kable()
name | long_name |
---|---|
raster | R-raster |
SAGA | SAGA GIS |
IDRISI | IDRISI |
IDRISIold | IDRISI (img/doc) |
BIL | Band by Line |
BSQ | Band Sequential |
BIP | Band by Pixel |
ascii | Arc ASCII |
CDF | NetCDF |
big | big.matrix |
ADRG | ARC Digitized Raster Graphics |
BMP | MS Windows Device Independent Bitmap |
BT | VTP .bt (Binary Terrain) 1.3 Format |
CTable2 | CTable2 Datum Grid Shift |
EHdr | ESRI .hdr Labelled |
ELAS | ELAS |
ENVI | ENVI .hdr Labelled |
ERS | ERMapper .ers Labelled |
GPKG | GeoPackage |
GS7BG | Golden Software 7 Binary Grid (.grd) |
GSBG | Golden Software Binary Grid (.grd) |
GTiff | GeoTIFF |
GTX | NOAA Vertical Datum .GTX |
HFA | Erdas Imagine Images (.img) |
IDA | Image Data and Analysis |
ILWIS | ILWIS Raster Map |
INGR | Intergraph Raster |
ISCE | ISCE raster |
ISIS2 | USGS Astrogeology ISIS cube (Version 2) |
KRO | KOLOR Raw |
LAN | Erdas .LAN/.GIS |
Leveller | Leveller heightfield |
MBTiles | MBTiles |
MRF | Meta Raster Format |
netCDF | Network Common Data Format |
NITF | National Imagery Transmission Format |
NTv2 | NTv2 Datum Grid Shift |
PAux | PCI .aux Labelled |
PCIDSK | PCIDSK Database File |
PCRaster | PCRaster Raster File |
PDF | Geospatial PDF |
PNM | Portable Pixmap Format (netpbm) |
RMF | Raster Matrix Format |
ROI_PAC | ROI_PAC raster |
RST | Idrisi Raster A.1 |
SAGA | SAGA GIS Binary Grid (.sdat) |
SGI | SGI Image File Format 1.0 |
Terragen | Terragen heightfield |
rasterStack
So let’s finally save our raster stack in the binary ‘native’ raster package .grd file format using function raster::writeRaster(). We’ll do that to preserve the layer names in the RasterStack. It also allows us to combine categorical and numeric layers in one file.
writeRaster(env_stack, filename = here::here("data", "raster", "env_stack.grd"))
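To see that layer names survive a round trip through the native format, here is a small sketch with a toy stack written to a temporary directory. The layer values and the file name toy_stack.grd are assumptions for illustration. Note that writeRaster() refuses to overwrite an existing file unless overwrite = TRUE is set, which matters when re-running a notebook.

```r
library(raster)

# Toy two-layer stack (assumed values)
a <- raster(nrows = 2, ncols = 2)
a[] <- 1:4
b <- raster(nrows = 2, ncols = 2)
b[] <- c(1, 1, 2, 2)
toy_stack <- stack(a, b)
names(toy_stack) <- c("temp_range", "land_cover")

# Write to the native .grd format in a temporary location, then read back
grd_file <- file.path(tempdir(), "toy_stack.grd")
writeRaster(toy_stack, filename = grd_file, overwrite = TRUE)

toy_reloaded <- stack(grd_file)
names(toy_reloaded)
```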
However, these files are not compressed. If the size of the files is an issue, we can save each layer as an individual GeoTIFF file and reimport them all together into a stack later on.
dir.create(here::here("data", "raster", "processed"))
writeRaster(env_stack,
            filename = here::here("data", "raster", "processed", "env_stack.tif"),
            bylayer = TRUE, suffix = names(env_stack))
The following code lists all the files in the processed folder, matching only those files that end with .tif (ignoring the env_stack_land_cover.tif.aux.xml file, which contains the RAT and would throw an error), then reads and stacks them!
stack(list.files(here::here("data", "raster", "processed"),
                 pattern = "\\.tif$",
                 full.names = TRUE))
class : RasterStack
dimensions : 323, 571, 184433, 4 (nrow, ncol, ncell, nlayers)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -100.85, -96.09167, 17.94167, 20.63333 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +ellps=WGS84 +no_defs
names : env_stack_land_cover, env_stack_prec_seasonality, env_stack_temp_range, env_stack_temp_seasonality
min values : 1.0, 41.0, 13.7, 655.0
max values : 22.0, 117.0, 27.2, 3387.0
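The file matching relies on a regular expression anchored to the end of the file name. The sketch below, using made-up file names, shows why escaping the dot is the safer choice: an unescaped dot matches any character, not just a literal full stop.

```r
# Illustrative file names (assumed, not read from disk)
files <- c("env_stack_land_cover.tif",
           "env_stack_land_cover.tif.aux.xml",
           "env_stack_temp_range.tif",
           "notes_tif")

# Escaped dot: only names that literally end in ".tif"
tif_only <- grep("\\.tif$", files, value = TRUE)
tif_only

# Unescaped dot: also matches "notes_tif", because "." means any character
tif_loose <- grep(".tif$", files, value = TRUE)
tif_loose
```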
We can extract data underlying an sf from a raster using function raster::extract(). The output in the case of points is a single value for each point. This is returned as a vector for a single layer or a matrix for multiple layers, as is our case.
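A toy example makes the shape of the output easier to see. The stack, layer names and point coordinates below are assumptions for illustration; points are supplied as a two-column matrix of x and y coordinates, and raster::extract() is called with its namespace to avoid clashes with functions of the same name in other packages.

```r
library(raster)

# Toy 2 x 2 stack with two layers (assumed values)
a <- raster(nrows = 2, ncols = 2, xmn = 0, xmx = 2, ymn = 0, ymx = 2)
a[] <- 1:4          # cells are filled row by row, starting top-left
b <- a * 10
toy_stack <- stack(a, b)
names(toy_stack) <- c("a", "b")

# Two points: one in the top-left cell, one in the bottom-right cell
pts <- cbind(x = c(0.5, 1.5), y = c(1.5, 0.5))

vals <- raster::extract(toy_stack, pts)
vals
```

The result has one row per point and one column per layer, which is why it can be converted to a data.frame and bound column-wise to the point data.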
Let’s have a look at our data again
mol_sf
Simple feature collection with 15 features and 9 fields
geometry type: POINT
dimension: XY
bbox: xmin: -99.84806 ymin: 18.94194 xmax: -97.09056 ymax: 19.63083
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
# A tibble: 15 x 10
id locality n mountain_chain region na he ar par
<int> <chr> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 Nevado de… 12 Nevado de Toluca Centr… 5.44 0.620 4.56 0.350
2 2 Texcalyac… 29 Sierra de las C… Centr… 8.22 0.660 5.14 0.500
3 3 Desierto … 7 Sierra de las C… Centr… 4.44 0.590 4.44 0.180
4 4 Ajusco 8 Sierra de las C… Centr… 4.22 0.490 4.05 0.0200
5 8 Calpan 34 Sierra Nevada Centr… 11.9 0.730 6.48 0.290
6 9 Atzompa 43 Sierra Nevada Centr… 10.3 0.690 5.79 0.0800
7 10 Llano Gra… 15 Sierra Nevada (… Centr… 7.78 0.650 5.80 0.250
8 11 Rio Frio 27 Sierra Nevada (… Centr… 7.56 0.570 4.77 0.130
9 12 Nanacamil… 14 Sierra Nevada (… Centr… 6.22 0.590 4.91 0.100
10 13 MalincheS 8 Malinche Centr… 5.00 0.580 4.69 0.0900
11 14 MalincheW 17 Malinche Centr… 6.67 0.600 4.76 0.0600
12 16 MalincheE 13 Malinche Centr… 6.11 0.560 4.73 0.210
13 17 Texmalaqu… 8 Pico de Orizaba South… 6.00 0.710 5.64 0.910
14 18 Xometla 16 Pico de Orizaba South… 9.11 0.830 6.86 0.490
15 19 Vigas 48 Cofre de Perote North… 11.8 0.660 5.75 1.31
# ... with 1 more variable: geometry <POINT [°]>
Because we want to combine it with our previous data in mol_sf, we’ll pipe the resulting matrix into as.data.frame so we can easily bind our extracted environmental data to our molecular data.
env_points <- extract(env_stack, as_Spatial(mol_sf)) %>% as.data.frame()
mol_env_sf <- bind_cols(mol_sf, env_points)
mol_env_sf
Simple feature collection with 15 features and 13 fields
geometry type: POINT
dimension: XY
bbox: xmin: -99.84806 ymin: 18.94194 xmax: -97.09056 ymax: 19.63083
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
# A tibble: 15 x 14
id locality n mountain_chain region na he ar par
<int> <chr> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 Nevado de… 12 Nevado de Toluca Centr… 5.44 0.620 4.56 0.350
2 2 Texcalyac… 29 Sierra de las C… Centr… 8.22 0.660 5.14 0.500
3 3 Desierto … 7 Sierra de las C… Centr… 4.44 0.590 4.44 0.180
4 4 Ajusco 8 Sierra de las C… Centr… 4.22 0.490 4.05 0.0200
5 8 Calpan 34 Sierra Nevada Centr… 11.9 0.730 6.48 0.290
6 9 Atzompa 43 Sierra Nevada Centr… 10.3 0.690 5.79 0.0800
7 10 Llano Gra… 15 Sierra Nevada (… Centr… 7.78 0.650 5.80 0.250
8 11 Rio Frio 27 Sierra Nevada (… Centr… 7.56 0.570 4.77 0.130
9 12 Nanacamil… 14 Sierra Nevada (… Centr… 6.22 0.590 4.91 0.100
10 13 MalincheS 8 Malinche Centr… 5.00 0.580 4.69 0.0900
11 14 MalincheW 17 Malinche Centr… 6.67 0.600 4.76 0.0600
12 16 MalincheE 13 Malinche Centr… 6.11 0.560 4.73 0.210
13 17 Texmalaqu… 8 Pico de Orizaba South… 6.00 0.710 5.64 0.910
14 18 Xometla 16 Pico de Orizaba South… 9.11 0.830 6.86 0.490
15 19 Vigas 48 Cofre de Perote North… 11.8 0.660 5.75 1.31
# ... with 5 more variables: geometry <POINT [°]>, temp_range <dbl>,
# temp_seasonality <dbl>, prec_seasonality <dbl>, land_cover <dbl>
Our new sf is now ready to use for species distribution modelling. But we can also start visualising the relationships between our molecular and environmental variables.
ggplot(mol_env_sf, aes(x = temp_range, y = na, colour = region)) +
geom_point()
We can also extract and summarise data over an area represented by a polygon using the raster::extract() function. If we want a summarising function to be applied to the pixel values returned by the extraction, we can supply it to argument fun. Let’s calculate the mean temp_range across the study_bbox area.
mean_temp_range <- extract(env_stack[["temp_range"]],
                           as_Spatial(study_bbox),
                           fun = mean,
                           na.rm = TRUE)
mean_temp_range
[,1]
[1,] 22.21228
Let’s use this to calculate the deviation of each of our data points from the study box mean we just calculated. We can use dplyr::mutate to manipulate attribute data in our sf just like any other data.frame.
mol_env_sf <- mol_env_sf %>%
mutate(dev_temp_range = temp_range - as.vector(mean_temp_range))
mol_env_sf %>% select(locality, dev_temp_range)
Simple feature collection with 15 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: -99.84806 ymin: 18.94194 xmax: -97.09056 ymax: 19.63083
epsg (SRID): 4326
proj4string: +proj=longlat +datum=WGS84 +no_defs
# A tibble: 15 x 3
locality dev_temp_range geometry
<chr> <dbl> <POINT [°]>
1 Nevado de Toluca -3.01 (-99.84806 19.19361)
2 Texcalyacac 0.188 (-99.5 19.12056)
3 Desierto de los Leones -2.11 (-99.30056 19.26667)
4 Ajusco -3.11 (-99.3 19.18278)
5 Calpan -2.81 (-98.59167 19.13139)
6 Atzompa -1.11 (-98.55972 19.18056)
7 Llano Grande -1.41 (-98.72056 19.33889)
8 Rio Frio -2.01 (-98.69472 19.36611)
9 Nanacamilpa 0.288 (-98.59611 19.48028)
10 MalincheS -1.71 (-98.02194 19.18722)
11 MalincheW -0.112 (-98.095 19.25778)
12 MalincheE -0.912 (-97.975 19.23)
13 Texmalaquilla -2.21 (-97.29 18.94194)
14 Xometla -1.31 (-97.19056 18.975)
15 Vigas -3.51 (-97.09056 19.63083)
Let’s save our final sf, now containing the environmental data we extracted.
write_sf(mol_env_sf, here::here("data", "sf", "env_salamander.geojson"))
… (TRUE) or less than (FALSE) extraction area mean precipitation seasonality.
raster functions you should know about: http://rspatial.org/spatial/rst/8-rastermanip.html
- merge: merge RasterLayers
- trim: remove outer NA rows and columns
- extend: expand margins with NA
- projectRaster: Project the values of a Raster* object to a new Raster* object with another projection (coordinate reference system, CRS).
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bindrcpp_0.2.2 ggplot2_3.0.0 dplyr_0.7.6
[4] sf_0.6-3 rasterVis_0.45 latticeExtra_0.6-28
[7] RColorBrewer_1.1-2 lattice_0.20-35 raster_2.6-7
[10] sp_1.2-5
loaded via a namespace (and not attached):
[1] zoo_1.8-3 tidyselect_0.2.4 purrr_0.2.5
[4] colorspace_1.3-2 htmltools_0.3.6 viridisLite_0.3.0
[7] emo_0.0.0.9000 yaml_2.1.19 utf8_1.1.3
[10] rlang_0.2.1 R.oo_1.21.0 e1071_1.6-8
[13] hexbin_1.27.1 pillar_1.2.1 glue_1.2.0.9000
[16] withr_2.1.2 DBI_1.0.0 R.utils_2.6.0
[19] bindr_0.1.1 plyr_1.8.4 stringr_1.3.1
[22] munsell_0.5.0 gtable_0.2.0 workflowr_1.0.1
[25] R.methodsS3_1.7.1 evaluate_0.11 labeling_0.3
[28] knitr_1.20 parallel_3.4.4 class_7.3-14
[31] highr_0.6 Rcpp_0.12.18 backports_1.1.2
[34] scales_1.0.0 classInt_0.1-24 digest_0.6.15
[37] stringi_1.2.4 grid_3.4.4 rprojroot_1.3-2
[40] cli_1.0.0 here_0.1 rgdal_1.3-4
[43] tools_3.4.4 magrittr_1.5 lazyeval_0.2.1
[46] tibble_1.4.2 crayon_1.3.4 whisker_0.3-2
[49] pkgconfig_2.0.2 lubridate_1.7.4 rstudioapi_0.7
[52] assertthat_0.2.0 rmarkdown_1.10 R6_2.2.2
[55] units_0.6-0 git2r_0.21.0 compiler_3.4.4
This reproducible R Markdown analysis was created with workflowr 1.0.1