--- title: "Database Tools" output: rmarkdown::html_vignette description: > Explore the various database utility functions available within FIESTA. vignette: > %\VignetteIndexEntry{Database Tools} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include = F} library(knitr) knitr::opts_chunk$set(message = F, warning = F, eval = F) ``` ```{r, include=FALSE} # Sets up output folding hooks = knitr::knit_hooks$get() hook_foldable = function(type) { force(type) function(x, options) { res = hooks[[type]](x, options) if (isFALSE(options[[paste0("fold.", type)]])) return(res) paste0( "
", type, "\n\n", res, "\n\n
" ) } } knitr::knit_hooks$set( output = hook_foldable("output"), plot = hook_foldable("plot") ) ``` ```{r, echo=-1} data.table::setDTthreads(2) ``` ### Overview of FIESTA Database (DB) tools FIESTA's DB tools extract data from FIA's online publicly-available, comma-delimited files (\*.csv or \*.zip). FIA's CSV files are available by state from the FIA DataMart at the following link: https://apps.fs.usda.gov/fia/datamart/datamart.html. Because of FIA's confidentiality agreement to protect the privacy of landowners, as well as protecting the scientific integrity of FIA's sample design, the exact coordinates of the sample plot locations are not included in the public data. If the exact coordinates are necessary for your analysis, contact FIA's Spatial Data Services (https://research.fs.usda.gov/programs/fia/sds). ### Objective of tutorial The objective of this tutorial is to demonstrate the use of FIESTA's DB tools for accessing FIA data. These tools extract data from FIA Datamart using FIA's standard evaluations as well as customized evaluations. An FIA Evaluation is a group of plots within the FIA database that is used for population estimates. An FIA Evaluation represents different inventory spans of data with different stratification and area adjustments for nonreponse. Each Evaluation is determined by the type of estimation (evalType) including: area and tree estimates, growth and mortality estimates, and area change estimates (evalType). These plots are identified by an evalid, which is a unique identifier in the format of a 2-digit State code, a 2-digit year code, and a 2-digit evaluation type code. For example, EVALID '491601' represents the Utah 2016 evaluation for current area estimates. FUNCTION | DESCRIPTION -------------- | --------------------------------------------------------------- [DBgetCSV()](#DBgetCSV) | Downloads comma-delimited file (.csv) or downloads and extracts a compressed csv file (.zip) from FIA's online DataMart. [DBqryCSV()](#DBqryCSV) | Extracts and queries data from FIA's online DataMart, either CSV or ZIP files. [DBgetEvalid()](#DBgetEvalid) | Gets evalid for identifying an estimation group of plots for state or checks evalid. [DBgetXY()](#DBgetXY) | Extracts XY data from FIA database. [DBgetPlots()](#DBgetPlots) | Extracts inventory plot data from FIA database. [DBgetStrata()](#DBgetStrata) | Extracts strata information and total acres by estimation unit from FIA database, including plot-level assignment and a data frame with strata weights by estimation unit. ### Set up First, you'll need to load the `FIESTA` library: ```{r, warning = F, message = F} library(FIESTA) ``` Next, you'll need to set up an "outfolder". This is just a file path to a folder where you'd like `FIESTA` to send your data output. For this vignette, we have saved our outfolder file path as the `outfolder` object in a temporary directory. We also set a few default options preferred for this vignette. ```{r} outfolder <- tempdir() ``` ## DB Examples The following examples show how to extract data from FIA's publicly-available, online DataMart. Data can be returned as R objects or exported to CSV (.csv) files or a SQLite (.sqlite) database. The zip files are extracted on-the-fly from the online website. Web server connections will affect download speeds. We show examples for the following functions: The following examples extract data from FIA's online DataMart (https://apps.fs.usda.gov/fia/datamart/datamart.html). Note that while `datsource = 'datamart'` is utilized for these examples, `datsource` can be set to 'sqlite' with `datsource_dsn` set to the local file path for the FIADB file to achieve the same results. ### `DBgetCSV()` The `DBgetCSV` function extracts data from FIA's publicly-available, online DataMart CSV/ZIP files. The zip files are extracted on-the-fly from the online website. Web server connections will affect download speeds. #### Example 1: Extract PLOT data for Wyoming and Utah
View Example #### DBgetCSV() ```{r} ## Get plot table for Wyoming WYplots <- DBgetCSV("PLOT", "Wyoming") dim(WYplots) ## Get plot table for Wyoming and Utah WYUTplots <- DBgetCSV(DBtable = "PLOT", states = c("Wyoming", "Utah")) table(WYUTplots$STATECD) ## Get survey table for Wyoming WYsurvey <- DBgetCSV("SURVEY", "Wyoming") WYsurvey ```
### `DBqryCSV()` The `DBqryCSV` function queries a table from FIA's online publicly-available DataMart. The tables in the query must be specified in the sqltables parameter. #### Example: Multiple Uses
View Example #### DBqryCSV() ```{r} # Get number of plots by inventory year for the state of Wyoming sql1 <- "SELECT INVYR, COUNT(*) AS NBRPLOTS FROM PLOT WHERE statecd = 56 GROUP BY INVYR" nplots1 <- DBqryCSV(sql = sql1, states = "Wyoming", sqltables = "PLOT") head(nplots1) # Get number of plots by inventory year for Vermont and New Hampshire sql2 <- "SELECT STATECD, INVYR, COUNT(*) NBRPLOTS FROM PLOT WHERE statecd IN(50,33) GROUP BY STATECD, INVYR" nplots2 <- DBqryCSV(sql = sql2, states = c("Vermont", "New Hampshire"), sqltables = "PLOT") head(nplots2) # Get number of plots by inventory year for Iowa (stcd=19) that have silver maple (SPCD=317) sql3 <- "SELECT p.STATECD, p.INVYR, COUNT(*) NBRPLOTS FROM PLOT p JOIN TREE t ON p.CN = t.PLT_CN WHERE p.statecd = 19 AND t.SPCD = 317 GROUP BY p.STATECD, p.INVYR" nplots3 <- DBqryCSV(sql = sql3, states = "IOWA", sqltables = c("PLOT", "TREE")) head(nplots3) ```
### `DBgetEvalid()` The `DBgetEvalid` function gets information for FIA Evaluations. #### Example 1: Get most current evalid and inventory years for Wyoming
View Example ```{r} WYeval <- DBgetEvalid(states = "Wyoming", evalCur = TRUE) names(WYeval) WYeval$evalidlist WYeval$invyrs WYeval$invyrtab WYeval$invtype ```
#### Example 2: Get most current evaluations for New York for VOL and GRM evalTypes
View Example ```{r} NYeval <- DBgetEvalid(states = c("New York"), evalType = c("VOL", "GRM"), evalCur = TRUE) names(NYeval) NYeval$evalidlist NYeval$evalTypelist ```
###
`DBgetXY()` The `DBgetXY` function queries XY public coordinate data from FIA' online publicly-available DataMart or SQLite database. #### Example1: Get xy data for the state of Wyoming for the most current evaluation
View Example #### DBgetXY() ```{r} xydat1 <- DBgetXY(states = "Wyoming", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE)) names(xydat1) head(xydat1$xyCur_PUBLIC) ```
#### Example 2: Add a variable in plot table (PLOT_STATUS_CD) and output as a spatial object
View Example ```{r} xydat2 <- DBgetXY(states = "Wyoming", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE), pvars2keep = c("PLOT_STATUS_CD"), issp = TRUE) spxy2 <- xydat2$spxy ## Display points with by PLOT_STATUS_CD (1-light blue; 2-brown; 3-blue) spxy2$color <- ifelse(spxy2$PLOT_STATUS_CD == 2, "brown", ifelse(spxy2$PLOT_STATUS_CD == 3, "blue", "light blue")) plot(sf::st_geometry(spxy2['PLOT_STATUS_CD']), pch = 16, cex = .5, col = spxy2$color) ```
#### Example 3: Get XY data for Wyoming, inventory years 2015 to 2019
View Example ```{r} xydat3 <- DBgetXY(states = "Vermont", datsource = "datamart", eval = "custom", eval_opts = eval_options(invyrs = 2017:2019), issp = TRUE) spxy3 <- xydat3$spxy ## Display points plot(sf::st_geometry(spxy3), pch = 16, cex = .5, col="grey") ## Now only include P2 plots only (intensity1 = TRUE) xydat3b <- DBgetXY(states = "Vermont", datsource = "datamart", eval = "custom", eval_opts = eval_options(invyrs = 2017:2019), intensity1 = TRUE, issp = TRUE) spxy3b <- xydat3b$spxy ## Display points plot(sf::st_geometry(spxy3b), pch = 16, cex = .5) ```
###
`DBgetPlots()` The `DBgetPlots` function extracts plot-level data from FIA's online DataMart or SQLite database. #### Example 1: Get data for Rhode Island, most current FIA Evaluation, all plots.
View Example #### DBgetPlots() ```{r} dat1 <- DBgetPlots(states = "Rhode Island", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE, Type = "ALL"), issp = TRUE) names(dat1) plt1 <- dat1$tabs$plt spxy1 <- dat1$xyCur_PUBLIC table(plt1$INVYR) # Display spatial output plot(sf::st_geometry(spxy1), pch = 16, cex = .5) ``` ``` {r} # Add a filter to include only plots with Northern red oak forest type (FORTYPCD == 505) # Note: *allFilter* filters for plots and/or conditions for all states specified. dat1b <- DBgetPlots(states = "Rhode Island", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE, Type = "ALL"), issp = TRUE, allFilter = "FORTYPCD == 505") names(dat1b) spxy1b <- dat1b$xyCur_PUBLIC dim(spxy1b) # Display spatial output plot(sf::st_geometry(spxy1b), pch = 16, cex = .5, col="darkgreen") ```
#### Example 3: Get data for Delaware, most current FIA Evaluation, include plotgeom data and subplot tables
View Example ```{r} dat2 <- DBgetPlots(states = "Delaware", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE, Type = "ALL"), issubp = TRUE, addplotgeom = TRUE) names(dat2) tabs2 <- dat2$tabs plt2 <- tabs2$plt ## subplot and subp_cond tables are added to tabs list names(tabs2) ## PLOTGEOM data are appended to plt table (e.g., ALP_ADFORCD, FVS_VARIANT) head(plt2) ```
#### Example 3: Get data for Delaware, most current FIA Evaluation, include pop tables
View Example ```{r} dat3 <- DBgetPlots(states = "Delaware", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE, Type = "ALL"), savePOP = TRUE, othertables = c("POP_STRATUM", "POP_ESTN_UNIT")) ## savePOP = TRUE, saves the POP_PLOT_STRATUM_ASSGN table used to select plots names(dat3) ## pop_stratum and pop_estn_unit tables are added to tabs list tabs3 <- dat3$tabs names(tabs3) ```
#### Example 4: Export plot-level data to a CSV file
View Example ```{r} DBgetPlots(states = "Rhode Island", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE, Type = "ALL"), returndata = FALSE, savedata = TRUE, savedata_opts = savedata_options(outfolder = outfolder, out_fmt = "csv", overwrite_layer = TRUE)) ## Read in data from outfolder plt <- read.csv(file.path(outfolder, "plot.csv"), stringsAsFactors=FALSE) head(plt) ```
#### Example 5: Most current evaluation for multiple evalTypes ('ALL', 'VOL', 'GRM')
View Example ```{r} dat5 <- DBgetPlots(states = "Rhode Island", datsource = "datamart", eval = "FIA", eval_opts = eval_options(Cur = TRUE, Type = c("VOL", "CHNG", "P2VEG"))) names(dat5) tabs5 <- dat5$tabs names(tabs5) ppsa5 <- dat5$pop_plot_stratum_assgn table(ppsa5$EVALID) ```
#### Example 6: Get data for a set of evalids
View Example ```{r} dat6 <- DBgetPlots(eval = "FIA", eval_opts = eval_options(Cur = TRUE, evalid = c(101800, 101801, 101803))) names(dat6) tabs6 <- dat6$tabs names(tabs6) ppsa6 <- dat6$pop_plot_stratum_assgn table(ppsa6$EVALID) ```
#### Example 7: Get data by Endyr
View Example ```{r} dat7 <- DBgetPlots(states = c("Connecticut"), eval = "FIA", eval_opts = eval_options(evalType = "ALL", Endyr = 2017)) names(dat7) tabs7 <- dat7$tabs names(tabs7) ppsa7 <- dat7$pop_plot_stratum_assgn table(ppsa7$EVALID) ```
#### Example 8: Get data for multiple inventory years
View Example ```{r} dat8 <- DBgetPlots(states = "Vermont", eval = "custom", eval_opts = eval_options(invyrs = 2012:2014, evalType = "ALL")) names(dat8) tabs8 <- dat8$tabs names(tabs8) plt8 <- tabs8$plt table(plt8$INVYR) ```
#### Example 9: Get data for periodic inventory
View Example ```{r} dat9 <- DBgetPlots(states = "Wyoming", invtype = "PERIODIC", eval = "FIA", eval_opts = list(Cur = TRUE, evalType = "VOL")) names(dat9) tabs9 <- dat9$tabs names(tabs9) plt9 <- tabs9$plt table(plt9$STATECD, plt9$INVYR) ```
#### Example 10: Intensity
View Example The objective of this section is to understand the differences when using INTENSITY=1. ```{r} ## With only P2 plots (intensity1 = TRUE) dat10 <- DBgetPlots(states = "Vermont", eval = "FIA", eval_opts = list(Cur = TRUE, Type = "ALL"), intensity1 = TRUE, issp = TRUE) tabs10 <- dat10$tabs plt10 <- tabs10$plt table(plt10$INVYR) spxy10 <- dat10$xyCur_PUBLIC # Display spatial output of public coordinates plot(sf::st_geometry(spxy10), pch = 16, cex = .5) ``` ### `DBgetStrata()` The `DBgetStrata` function queries the FIA database for post-stratification information. #### Example1: Get strata for the most current evaluation for Wyoming
View Example #### DBgetStrata() ```{r} strat1 <- DBgetStrata(states = "Wyoming", eval_opts = eval_options(Cur = TRUE)) names(strat1) ## Look at plot assign data pltassgn1 <- strat1$pltassgn head(pltassgn1) unique(pltassgn1$EVALID) strat1$evalid ## Look at area data for estimation unit strat1$unitarea strat1$unitvar strat1$unitvar2 strat1$areavar ## Look at stratification data for estimation unit strat1$stratalut strat1$strvar strat1$getwtvar ```
#### Example 2: Get strata information for a specific evaluation for Wyoming
View Example ```{r} strat2 <- DBgetStrata(eval_opts = eval_options(evalid = 561200)) unique(strat2$pltassgn$EVALID) strat2$evalid ```
#### Example 3: Get strata information for Wyoming, evaluation ending in 2014
View Example ```{r} strat3 <- DBgetStrata(states = "Wyoming", eval_opts = eval_options(Endyr = 2014)) unique(strat3$pltassgn$EVALID) strat3$evalid ```