9 Environmental data
9.1 Bowerbird/blueant
Very commonly, we want to know about the environmental conditions at our points of interest. For the remote and vast Southern Ocean these data typically come from satellite or model sources. Some data centres provide extraction tools that will pull out a subset of data to suit your requirements, but often it makes more sense to cache entire data collections locally first and then work with them from there.
bowerbird provides a framework for downloading data files to a local collection, and keeping it up to date. The companion blueant package provides a suite of definitions for Southern Ocean and Antarctic data sources that can be used with bowerbird. It encompasses data such as sea ice, bathymetry and land topography, oceanography, and atmospheric reanalysis and weather predictions, from providers such as NASA, NOAA, Copernicus, NSIDC, and Ifremer.
Why might you want to maintain local copies of entire data sets, instead of just fetching subsets of data from providers as needed?
- many analyses make use of data from a variety of providers (in which case there may not be dynamic extraction tools for all of them),
- analyses might need to crunch through a whole collection of data in order to calculate appropriate statistics (temperature anomalies with respect to a long-term mean, for example),
- different parts of the same data set are used in different analyses, in which case making one copy of the whole thing may be easier to manage than having different subsets for different projects,
- a common suite of data are routinely used by a local research community, in which case it makes more sense to keep a local copy for everyone to use, rather than multiple copies being downloaded by different individuals.
In these cases, maintaining local copies of a range of data from third-party providers can be extremely beneficial, especially if that collection is hosted with a fast connection to local compute resources (virtual machines or high-performance computational facilities).
Install from GitHub:
remotes::install_github("AustralianAntarcticDivision/blueant")
And load the package before use.
library(blueant)
9.1.1 Available data sets
First, we can see the available data sets via the sources function.
srcs <- blueant::sources()
## the names of the first few
head(srcs$name)
## [1] "NSIDC SMMR-SSM/I Nasateam sea ice concentration"
## [2] "NSIDC SMMR-SSM/I Nasateam near-real-time sea ice concentration"
## [3] "NSIDC passive microwave supporting files"
## [4] "Nimbus Ice Edge Points from Nimbus Visible Imagery"
## [5] "Artist AMSR-E sea ice concentration"
## [6] "Artist AMSR-E supporting files"
## the full details of the first one
srcs[1, ]
## # A tibble: 1 x 16
## id name
## <chr> <chr>
## 1 10.5067/8GQ8LZQVL0VL NSIDC SMMR-SSM/I Nasateam sea ice concentration
## description
## <chr>
## 1 "Passive microwave estimates of sea ice concentration at 25km spatial re~
## doc_url source_url
## <chr> <list>
## 1 http://nsidc.org/data/nsidc-0051.html <chr [1]>
## citation
## <chr>
## 1 Cavalieri, D. J., C. L. Parkinson, P. Gloersen, and H. Zwally. 1996, upd~
## license
## <chr>
## 1 Please cite, see http://nsidc.org/about/use_copyright.html
## comment
## <chr>
## 1 This data source may migrate to https access in the future, requiring an~
## method postprocess authentication_note user password
## <list> <list> <chr> <chr> <chr>
## 1 <named list [5]> <list [0]> <NA> <NA> <NA>
## access_function data_group collection_size
## <chr> <chr> <dbl>
## 1 raadtools::readice Sea ice 10
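Because the sources table is an ordinary tibble, it can be summarised with standard tools before deciding what to download. The following is a minimal sketch that totals collection_size (in GB) per data_group. The helper size_by_group is hypothetical (not part of blueant), and the toy table holds made-up values that only mimic two of the columns shown above.

```r
## Hypothetical helper (not part of blueant): total collection size (GB) per data group.
## Rows with a missing size are dropped by aggregate()'s default na.action.
size_by_group <- function(srcs) {
  out <- aggregate(collection_size ~ data_group, data = srcs, FUN = sum)
  out[order(-out$collection_size), ]
}

## Stand-in table with made-up values, mimicking two columns of the sources table
toy <- data.frame(
  data_group = c("Sea ice", "Sea ice", "Topography"),
  collection_size = c(10, 0.5, 0.3),
  stringsAsFactors = FALSE
)
res <- size_by_group(toy)
res
```

With blueant loaded, the same helper could be applied to the real table, e.g. size_by_group(blueant::sources()).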
9.1.2 Usage
Choose a directory into which to download the data. Usually this would be a persistent directory on your machine so that data sets downloaded in one session would remain available for use in later sessions, and not need re-downloading. A persistent directory could be something like c:\data\ (on Windows), or you could use the rappdirs package (the user_cache_dir function) to suggest a suitable directory (cross-platform).
Here we’ll use the c:/data/cache directory:
my_data_dir <- "c:/data/cache"
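As a cross-platform alternative, the rappdirs suggestion mentioned above might look something like the following sketch. The application name "blueant_data" is an arbitrary choice for this example, and the fallback to a folder in the home directory is an assumption added so that the snippet still runs if rappdirs is not installed.

```r
## Ask rappdirs for a per-user cache directory appropriate to the operating system.
## "blueant_data" is an arbitrary name chosen for this example.
cache_dir <- if (requireNamespace("rappdirs", quietly = TRUE)) {
  rappdirs::user_cache_dir("blueant_data")
} else {
  ## simple fallback (an assumption of this sketch) if rappdirs is unavailable
  file.path(path.expand("~"), "blueant_data")
}

## Create the directory if it does not exist yet
if (!dir.exists(cache_dir)) dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
cache_dir
```

The resulting cache_dir could then be used in place of a hard-coded path wherever a download directory is needed.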
Select the data source that we want:
data_source <- sources("Southern Ocean marine environmental data")
Note that it’s a good idea to check the data set size before downloading it, as some are quite large (though if you are running the download interactively, it will ask you before downloading a large data set).
data_source$collection_size ## size in GB
## [1] 0.1
And fetch the data:
result <- bb_get(data_source, local_file_root = my_data_dir, verbose = TRUE)
##
## Tue Oct 22 13:15:53 2019
## Synchronizing dataset: Southern Ocean marine environmental data
## --------------------------------------------------------------------------------------------
##
## this dataset path is: c:\data\cache/services.aad.gov.au/public/datasets/science/environmental_layers
## Locating credentials
## Checking for credentials in user-supplied values
## Checking for credentials in Environment Variables
## Searching for credentials file(s)
## No user-supplied credentials, environment variables, instance metadata, or credentials file found!
## Using user-supplied value for AWS Region ('services')
## Non-AWS base URL requested.
## S3 Request URL: http://services.aad.gov.au/public/
## Executing request without AWS credentials
## Parsing AWS API response
## Success: (200) OK
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/POC_2005_2012_ampli.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/POC_2005_2012_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/POC_2005_2012_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/POC_2005_2012_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/POC_2005_2012_sd.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/chla_ampli_alltime_2005_2012.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/chla_max_alltime_2005_2012.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/chla_mean_alltime_2005_2012.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/chla_min_alltime_2005_2012.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/chla_sd_alltime_2005_2012.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/depth.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/distance_antarctica.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/distance_canyon.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/distance_max_ice_edge.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/distance_shelf.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_max_chl_2005_2012_ampli.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_max_chl_2005_2012_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_max_chl_2005_2012_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_max_chl_2005_2012_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_max_sali_2005_2012_nb.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_max_temp_2005_2012_nb.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_chl_2005_2012_ampli.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_chl_2005_2012_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_chl_2005_2012_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_chl_2005_2012_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_oxy_1955_2012_nb.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_sali_2005_2012_nb.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/extreme_event_min_temp_2005_2012_nb.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/geomorphology.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_cover_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_cover_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_cover_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_cover_range.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_thickness_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_thickness_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_thickness_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/ice_thickness_range.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/metadata_details2.csv ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/mixed_layer_depth.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/roughness.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_current_speed.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_oxy_19552012_ampli.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_oxy_19552012_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_oxy_19552012_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_oxy_19552012_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_oxy_19552012_sd.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_sali_2005_2012_ampli.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_sali_2005_2012_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_sali_2005_2012_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_sali_2005_2012_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_sali_2005_2012_sd.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_temp_2005_2012_ampli.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_temp_2005_2012_max.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_temp_2005_2012_mean.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_temp_2005_2012_min.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seafloor_temp_2005_2012_sd.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/seasurface_current_speed.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/sediments.nc ... done.
## downloading file 1 of 1: http://services.aad.gov.au/public/datasets/science/environmental_layers/slope.nc ... done.
##
## Tue Oct 22 13:16:29 2019 dataset synchronization complete: Southern Ocean marine environmental data
Now we have a local copy of our data. The sync can be run daily so that the local collection is always up to date: it will only download new files, or files that have changed since the last download. For more information on bowerbird, see the package vignette.
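For a regularly scheduled sync, it may be more convenient to build a bowerbird configuration explicitly and synchronise that, rather than calling bb_get each time. The sketch below uses bowerbird's bb_config, bb_add, and bb_sync functions. The wrapper name sync_collection is hypothetical, and the scheduling itself (for example cron or the Windows Task Scheduler running the script with Rscript) is assumed to be set up separately.

```r
## Hypothetical wrapper (not part of bowerbird), intended to be called from a scheduled script.
## data_sources: one or more source definitions, e.g. from blueant::sources()
## root: the local directory that holds the data collection
sync_collection <- function(data_sources, root) {
  cf <- bowerbird::bb_config(local_file_root = root)  # configuration tied to the local directory
  cf <- bowerbird::bb_add(cf, data_sources)           # register the sources to keep in sync
  bowerbird::bb_sync(cf, verbose = TRUE)              # fetch new or changed files
}

## Example call (not run here, because it needs network access):
## status <- sync_collection(blueant::sources("Southern Ocean marine environmental data"),
##                           my_data_dir)
```

Keeping the configuration in one script means that the same set of sources is synchronised on every run.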
The result object holds information about the data that we downloaded:
result
## # A tibble: 1 x 5
## name id
## <chr> <chr>
## 1 Southern Ocean marine environmental data 10.26179/5b8f30e30d4f3
## source_url status files
## <chr> <lgl> <list>
## 1 <NA> TRUE <tibble [59 x 3]>
The result$files element tells us about the files:
head(result$files[[1]])
## # A tibble: 6 x 3
## url
## <chr>
## 1 http://services.aad.gov.au/public/datasets/science/environmental_layers/~
## 2 http://services.aad.gov.au/public/datasets/science/environmental_layers/~
## 3 http://services.aad.gov.au/public/datasets/science/environmental_layers/~
## 4 http://services.aad.gov.au/public/datasets/science/environmental_layers/~
## 5 http://services.aad.gov.au/public/datasets/science/environmental_layers/~
## 6 http://services.aad.gov.au/public/datasets/science/environmental_layers/~
## file
## <chr>
## 1 "c:\\data\\cache\\services.aad.gov.au\\public\\datasets\\science\\enviro~
## 2 "c:\\data\\cache\\services.aad.gov.au\\public\\datasets\\science\\enviro~
## 3 "c:\\data\\cache\\services.aad.gov.au\\public\\datasets\\science\\enviro~
## 4 "c:\\data\\cache\\services.aad.gov.au\\public\\datasets\\science\\enviro~
## 5 "c:\\data\\cache\\services.aad.gov.au\\public\\datasets\\science\\enviro~
## 6 "c:\\data\\cache\\services.aad.gov.au\\public\\datasets\\science\\enviro~
## note
## <chr>
## 1 downloaded
## 2 downloaded
## 3 downloaded
## 4 downloaded
## 5 downloaded
## 6 downloaded
These particular files are netCDF, and so could be read using e.g. the raster or ncdf4 packages. However, data from different providers will differ in their grids, resolutions, projections, variable-naming conventions, and other facets, which tends to complicate these operations. In the next section we’ll look at the raadtools package, which provides a set of tools for doing common operations on these types of data.
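As a small illustration of working with the files directly, the local paths recorded in the result object can be filtered by file extension before reading. This is only a sketch: the helper nc_paths is hypothetical, and the toy object below is a made-up stand-in with the same nesting as the result shown above.

```r
## Hypothetical helper: local paths of the netCDF files recorded in a bb_get() result
nc_paths <- function(result) {
  files <- result$files[[1]]               # per-file table for the first data source
  files$file[grepl("\\.nc$", files$file)]  # keep only paths ending in .nc
}

## Made-up stand-in with the same structure as the real result object
toy_result <- list(files = list(data.frame(
  file = c("cache/depth.nc", "cache/metadata_details2.csv"),
  stringsAsFactors = FALSE
)))
nc <- nc_paths(toy_result)
nc

## With the real result, one layer could then be read with the raster package (not run):
## depth <- raster::raster(nc[basename(nc) == "depth.nc"])
```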
9.2 RAADtools
The raadtools package provides a consistent interface to a range of environmental and similar data, and tools for working with them. It is designed to work with data collections maintained by the bowerbird/blueant packages, and builds on R’s existing ecosystem of packages for working with spatial, raster, and multidimensional data.
Here we’ll use two different environmental data sets: sea ice and water depth. Water depth does not change with time but sea ice is provided at daily time resolution.
First download daily sea ice data (from 2013 only), and the ETOPO2 bathymetric data set. ETOPO2 is somewhat dated and low resolution compared to more recent data, but will do as a small dataset for demo purposes. This may take a few minutes, depending on your connection speed:
## bind_rows comes from dplyr, so call it with an explicit namespace
src <- dplyr::bind_rows(
  sources("NSIDC SMMR-SSM/I Nasateam sea ice concentration", hemisphere = "south",
          time_resolutions = "day", years = 2013),
  sources("ETOPO2 bathymetry"))
result <- bb_get(src, local_file_root = my_data_dir, clobber = 0, verbose = TRUE, confirm = NULL)
##
## Tue Oct 22 13:16:29 2019
## Synchronizing dataset: NSIDC SMMR-SSM/I Nasateam sea ice concentration
##
## [... output truncated]
Now load the raadtools package and tell it where our data collection has been stored:
library(raadtools)
set_data_roots(my_data_dir)
Let’s say that we have some points of interest in the Southern Ocean: perhaps a ship track, or some stations where we took marine samples, or, as we’ll use here, the track of an elephant seal as it moves from the Kerguelen Islands to Antarctica and back again (data from IMOS 2018[^1], provided as part of the SOmap package).
data("SOmap_data", package = "SOmap")
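## select a single track by its id (written without the pipe, so that no extra packages need attaching)
ele <- dplyr::filter(SOmap_data$mirounga_leonina, id == "ct96-05-13")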
Define our spatial region of interest and extract the bathymetry data from this region, using the ETOPO2 files we just downloaded:
roi <- round(c(range(ele$lon), range(ele$lat)) + c(-2, 2, -2, 2))
bx <- readtopo("etopo2", xylim = roi)
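To make the region-of-interest calculation concrete, here is the same expression applied to a few made-up coordinates (not taken from the seal track). It pads the longitude and latitude ranges by 2 degrees on each side and rounds to whole degrees, giving a vector in the order xmin, xmax, ymin, ymax.

```r
## Made-up coordinates, for illustration only
lon <- c(70.2, 75.9, 81.4)
lat <- c(-49.3, -60.1, -66.7)

## range() gives (min, max); subtract 2 from each minimum and add 2 to each maximum
roi_demo <- round(c(range(lon), range(lat)) + c(-2, 2, -2, 2))
roi_demo
## [1]  68  83 -69 -47
```

Note that rounding can trim the padding slightly (here the eastern edge moves from 83.4 to 83), which is harmless for plotting but worth keeping in mind if a strict buffer is needed.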
And now we can make a simple plot of our track superimposed on the bathymetry:
plot(bx)
lines(ele$lon, ele$lat)
The real power of raadtools comes from its extraction functions. We can extract the depth values along our track using the raadtools::extract() function. We pass it the data-reader function to use (readtopo), the data to apply it to (ele[, c("lon", "lat")]), and any other options to pass to the reader function (in this case, specifying the topographic data source topo = "etopo2"):
ele$depth <- raadtools::extract(readtopo, ele[, c("lon", "lat")], topo = "etopo2")
Plot the histogram of depth values, showing that most of the track points are located in relatively shallow waters:
with(ele, hist(depth, breaks = 20))
This type of extraction will also work with time-varying data — for example, we can extract the sea-ice conditions along our track, based on each track point’s location and time:
ele$ice <- raadtools::extract(readice, ele[, c("lon", "lat", "date")])
## points outside the ice grid will have missing ice values, so fill them with zeros
ele$ice[is.na(ele$ice)] <- 0
with(ele, plot(date, ice, type = "l"))
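The idea of matching each point to data by both location and time can be sketched in base R on a toy gridded collection. This is purely conceptual and is not how raadtools implements extraction: the grid, values, and points are made up, and a simple nearest-cell, nearest-date lookup is assumed.

```r
## Toy "data collection": a 3 x 2 grid (lon x lat) for each of three days; values are made up
grid_lon <- c(70, 75, 80)
grid_lat <- c(-60, -50)
grid_date <- as.Date(c("2013-01-01", "2013-01-02", "2013-01-03"))
vals <- array(as.numeric(1:18), dim = c(3, 2, 3))  # indexed as [lon, lat, date]

## Made-up track points, each with a position and a date
pts <- data.frame(lon = c(71, 79.2), lat = c(-58, -51),
                  date = as.Date(c("2013-01-01", "2013-01-03")))

## Index of the grid coordinate closest to x (also works for dates via as.numeric)
nearest <- function(x, grid) which.min(abs(as.numeric(grid) - as.numeric(x)))

## For each point, look up the nearest cell in space and the nearest layer in time
pts$value <- vapply(seq_len(nrow(pts)), function(i) {
  vals[nearest(pts$lon[i], grid_lon),
       nearest(pts$lat[i], grid_lat),
       nearest(pts$date[i], grid_date)]
}, numeric(1))
pts
```

The first point picks up the value from the cell at (70, -60) on the first day, and the second from the cell at (80, -50) on the third day. The extraction functions in raadtools take care of this kind of bookkeeping across real file collections.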
9.3 Other useful packages
- the PolarWatch project aims to enable data discovery and broader use of high-latitude ocean remote sensing data sets. The dedicated ERDDAP server (https://polarwatch.noaa.gov/erddap) is accessible to R users with rerddap.
- rsoi downloads the most up to date Southern Oscillation Index, Oceanic Nino Index, and North Pacific Gyre Oscillation data.
- satellite reflectance data are a common basis for estimating chlorophyll-a and other phytoplankton parameters at ocean-basin scales. Global products are widely available; however, Southern-Ocean specific algorithms are likely to provide better estimates in these regions. croc implements the Johnson et al. (2013) Southern Ocean algorithm.
- more broadly, oce provides a wide range of tools for reading, processing, and displaying oceanographic data, including measurements from Argo floats and CTD casts, sectional data, sea-level time series, and coastline and topographic data.
- fda.oce provides functional data analysis of oceanographic profiles for front detection, water mass identification, unsupervised or supervised classification, model comparison, data calibration, and more.
- distancetocoast provides “distance to coastline” data for longitude and latitude coordinates.
- geodist provides very fast calculation of geodesic distances.