--- title: "MODISTools" author: "Koen Hufkens" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{MODISTools} %\VignetteEngine{knitr::knitr} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( error = TRUE, collapse = TRUE, comment = "#>" ) # load the library library(MODISTools) library(terra) library(ggplot2) library(dplyr) library(sf) # pre-load data data("arcachon_lai") data("arcachon_lc") ``` The MODISTools package has as goal to facilitate the interface between R and the MODIS Land Product Subset API at the Oak Ridge National Laboratory Distributed Active Archive Center (DAAC). This programmatic interface to the ['MODIS Land Products Subsets' web services](https://modis.ornl.gov/data/modis_webservice.html) allows for easy downloads of 'MODIS' time series (of single pixels or small regions of interest) directly to your R workspace or your computer. Below an example is provided on how to download a MODIS time series as well as list ancillary data. # Listing products / bands / dates In order to assess which products are available, which product bands are provided and which temporal range is covered one has to list these ancillary data. All these options can be queried using the `mt_*()` functions. To list all available products use the `mt_products()` function. ```{r eval = TRUE} products <- mt_products() head(products) ``` Once you have settled on the product you want to use in your analysis you will have to select a band, or multiple bands you want to download for a given location. To list all available bands for a given product use the mt_bands() function. You can use the `mt_bands()` function to list all available bands for a given product. Below I list all bands for the MOD13Q1 vegetation index product. ```{r eval = TRUE} bands <- mt_bands(product = "MOD13Q1") head(bands) ``` > Note the band names (listed in band column) you want to download, this variable will need to be passed in the final download statement. Similarly you can list all available dates (temporal coverage) for a given product and location defined using a latitude and longitude with the `mt_dates()` function. ```{r eval = TRUE} dates <- mt_dates(product = "MOD13Q1", lat = 42, lon = -110) head(dates) ``` # Downloading MODIS time series Once you decide on which data to download using the above functions you can use these parameters to download a time series using the `mt_subset()` function. The below query downloads MOD15A2H based leaf area index (LAI) data for the year 2014 for an area around the Arcachon basin in the south west of France. We will also download land cover data (MCD12Q1, IGBP) at a similar scale. The location is named 'arcachon'. The output is saved to a variables called subset and LC in the R workspace (as defined by the parameter internal = TRUE, when set to FALSE the data is written to file). Keep in mind that this operation might take a while. ```{r eval = FALSE} # download the MODIS land cover (IGBP) and NDVI data # for a region around the French city and basin of Arcachon arcachon_lai <- mt_subset(product = "MOD15A2H", lat = 44.656286, lon = -1.174748, band = "Lai_500m", start = "2004-01-01", end = "2004-12-30", km_lr = 20, km_ab = 20, site_name = "arcachon", internal = TRUE, progress = FALSE ) arcachon_lc <- mt_subset( product = "MCD12Q1", lat = 44.656286, lon = -1.174748, band = "LC_Type1", start = "2004-01-01", end = "2004-3-20", km_lr = 20, km_ab = 20, site_name = "arcachon", internal = TRUE, progress = FALSE ) ``` The output format is a *tidy* data frame, as shown above. When witten to a csv with the parameter `internal = FALSE` this will result in a flat file on disk. ```{r} head(arcachon_lai) head(arcachon_lc) ``` Note that when a a region is defined using km_lr and km_ab multiple pixels might be returned. These are indexed using the `pixel` column in the data frame containing the time series data. The remote sensing values are listed in the `value` column. When no band is specified all bands of a given product are returned, be mindful of the fact that different bands might require different multipliers to represent their true values. When a large selection of locations is needed you might benefit from using the batch download function `mt_batch_subset()`, which provides a wrapper around the `mt_subset()` function in order to speed up large download batches. This function has a similar syntax to `mt_subset()` but requires a data frame defining site names (site_name) and locations (lat / lon) (or a comma delimited file with the same structure) to specify a list of download locations. ```{r eval = TRUE} # create data frame with a site_name, lat and lon column # holding the respective names of sites and their location df <- data.frame("site_name" = paste("test",1:2), stringsAsFactors = FALSE) df$lat <- 40 df$lon <- -110 # an example batch download data frame head(df) ``` ```{r eval = FALSE} # test batch download subsets <- mt_batch_subset(df = df, product = "MOD13Q1", band = "250m_16_days_NDVI", km_lr = 1, km_ab = 1, start = "2004-01-01", end = "2004-12-30", internal = TRUE) ``` # Worked example using LAI values around the bay of Arcachon The below example processes the data downloaded above to look at differences in the seasonal changes in leaf area index (LAI, or the amount of leaves per unit ground area) for the Arcachon bay in south-west France. To do this we merge the land cover and LAI data on a pixel by pixel basis. ```{r} # merge land cover and lai data arcachon <- arcachon_lc %>% rename("lc" = "value") %>% select("lc","pixel") %>% right_join(arcachon_lai, by = "pixel") ``` Then, filter out all non valid values (> 100), only select evergreen and deciduous land cover classes (1 and 5, or, ENF and DBF respectivelly), convert them to more readable labels, and across these land cover classes take the median per acquisition date. ```{r} # create a plot of the data - accounting for the multiplier (scale) component arcachon <- arcachon %>% filter(value <= 100, lc %in% c("1","5")) %>% # retain everything but fill values mutate(lc = ifelse(lc == 1, "ENF","DBF")) %>% group_by(lc, calendar_date) %>% # group by lc and date summarize(doy = as.numeric(format(as.Date(calendar_date)[1],"%j")), lai_mean = median(value * as.double(scale))) ``` Finally, the plot will show you the seasonal time series of LAI for both land cover classes (ENF and DBF). Note the difference in timing and amplitude between both these forest types, where the evergreen (ENF) pixels show lower LAI values and a more gradual seasonal pattern compared to the deciduous trees. ```{r fig.width = 7, fig.height=3} # plot LAI by date and per land cover class ggplot(arcachon, aes(x = doy, y = lai_mean)) + geom_point() + geom_smooth(span = 0.3, method = "loess") + labs(x = "day of year (DOY)", y = "leaf area index (LAI)") + theme_minimal() + facet_wrap(~ lc) ``` # Conversion of corner coordinates Corner coordinates of the pixel area extracted are provided, these can be used to calculate the coverage of the extracted area. Coordinates are provided in the original sinusoidal grid coordinates and first have to be transformed into latitude longitude (for convenience). ```{r } # convert the coordinates lat_lon <- sin_to_ll(arcachon_lc$xllcorner, arcachon_lc$yllcorner) # bind with the original dataframe subset <- cbind(arcachon_lc, lat_lon) head(subset) ``` Together with meta-data regarding cell size, number of columns and rows the bounding box of the extracted data can be calculated. ```{r fig.width = 5, fig.height=5} # convert to bounding box bb <- apply(arcachon_lc, 1, function(x){ mt_bbox(xllcorner = x['xllcorner'], yllcorner = x['yllcorner'], cellsize = x['cellsize'], nrows = x['nrows'], ncols = x['ncols']) }) # plot one bounding box plot(bb[[1]]) # add the location of the queried coordinate within the polygon points(arcachon_lc$longitude[1], arcachon_lc$latitude[1], pch = 20, col = "red") ``` # Conversion to (gridded) raster data Although the package is often used to deal with single pixel locations the provisions to download a small region of interest defined by kilometers left-right (west-east) or top-bottom (north-south) allows you to grab small geographic regions for further analysis. The default tidy dataframe format isn't ideal for visualizing this inherently spatial data. Therefore a helper function `mt_to_raster()` is available to convert the tidy dataframe to a gridded (georeferenced) raster format. Below a small region (20x20km) is downloaded around the seaside town of Arcachon, France for the [MODIS land cover product (MCD12Q1)](https://lpdaac.usgs.gov/products/mcd12q1v006/). The data is converted using `mt_to_raster()` with a reproject parameter set to true to plot latitude and longitude coordinates (instead of the default sinusoidal ones). ```{r fig.width = 5, fig.height=5} # convert to raster, when reproject is TRUE # the data is reprojected to lat / lon if FALSE # the data is shown in its original sinuidal projection LC_r <- mt_to_terra(df = arcachon_lc, reproject = TRUE) # plot the raster data as a map plot(LC_r) ```