---
title: "Database Tools"
output: rmarkdown::html_vignette
description: >
Explore the various database utility functions available within FIESTA.
vignette: >
%\VignetteIndexEntry{Database Tools}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
```{r setup, include = F}
library(knitr)
knitr::opts_chunk$set(message = F, warning = F, eval = F)
```
```{r, include=FALSE}
# Sets up output folding
hooks = knitr::knit_hooks$get()
hook_foldable = function(type) {
force(type)
function(x, options) {
res = hooks[[type]](x, options)
if (isFALSE(options[[paste0("fold.", type)]])) return(res)
paste0(
"", type, "
\n\n",
res,
"\n\n "
)
}
}
knitr::knit_hooks$set(
output = hook_foldable("output"),
plot = hook_foldable("plot")
)
```
```{r, echo=-1}
data.table::setDTthreads(2)
```
### Overview of FIESTA Database (DB) tools
FIESTA's DB tools extract data from FIA's online publicly-available, comma-delimited files (\*.csv or \*.zip). FIA's CSV files are available by state from the FIA DataMart at the following link: https://apps.fs.usda.gov/fia/datamart/datamart.html. Because of FIA's confidentiality agreement to protect the privacy of landowners, as well as protecting the scientific integrity of FIA's sample design, the exact coordinates of the sample plot locations are not included in the public data. If the exact coordinates are necessary for your analysis, contact FIA's Spatial Data Services (https://research.fs.usda.gov/programs/fia/sds).
### Objective of tutorial
The objective of this tutorial is to demonstrate the use of FIESTA's DB tools for accessing FIA data. These tools extract data from FIA Datamart using FIA's standard evaluations as well as customized evaluations.
An FIA Evaluation is a group of plots within the FIA database that is used for population estimates. An FIA Evaluation represents different inventory spans of data with different stratification and area adjustments for nonreponse. Each Evaluation is determined by the type of estimation (evalType) including: area and tree estimates, growth and mortality estimates, and area change estimates (evalType). These plots are identified by an evalid, which is a unique identifier in the format of a 2-digit State code, a 2-digit year code, and a 2-digit evaluation type code. For example, EVALID '491601' represents the Utah 2016 evaluation for current area estimates.
FUNCTION | DESCRIPTION
-------------- | ---------------------------------------------------------------
[DBgetCSV()](#DBgetCSV) | Downloads comma-delimited file (.csv) or downloads and extracts a compressed csv file (.zip) from FIA's online DataMart.
[DBqryCSV()](#DBqryCSV) | Extracts and queries data from FIA's online DataMart, either CSV or ZIP files.
[DBgetEvalid()](#DBgetEvalid) | Gets evalid for identifying an estimation group of plots for state or checks evalid.
[DBgetXY()](#DBgetXY) | Extracts XY data from FIA database.
[DBgetPlots()](#DBgetPlots) | Extracts inventory plot data from FIA database.
[DBgetStrata()](#DBgetStrata) | Extracts strata information and total acres by estimation unit from FIA database, including plot-level assignment and a data frame with strata weights by estimation unit.
### Set up
First, you'll need to load the `FIESTA` library:
```{r, warning = F, message = F}
library(FIESTA)
```
Next, you'll need to set up an "outfolder". This is just a file path to a folder where you'd like `FIESTA` to send your data output. For this vignette, we have saved our outfolder file path as the `outfolder` object in a temporary directory. We also set a few default options preferred for this vignette.
```{r}
outfolder <- tempdir()
```
## DB Examples
The following examples show how to extract data from FIA's publicly-available, online DataMart. Data can be returned as R objects or exported to CSV (.csv) files or a SQLite (.sqlite) database. The zip files are extracted on-the-fly from the online website. Web server connections will affect download speeds. We show examples for the following functions:
The following examples extract data from FIA's online DataMart (https://apps.fs.usda.gov/fia/datamart/datamart.html).
Note that while `datsource = 'datamart'` is utilized for these examples, `datsource` can be set to 'sqlite' with `datsource_dsn` set to the local file path for the FIADB file to achieve the same results.
### `DBgetCSV()`
The `DBgetCSV` function extracts data from FIA's publicly-available, online DataMart CSV/ZIP files. The zip files are extracted on-the-fly from the online website. Web server connections will affect download speeds.
#### Example 1: Extract PLOT data for Wyoming and Utah
View Example
#### DBgetCSV()
```{r}
## Get plot table for Wyoming
WYplots <- DBgetCSV("PLOT", "Wyoming")
dim(WYplots)
## Get plot table for Wyoming and Utah
WYUTplots <- DBgetCSV(DBtable = "PLOT",
states = c("Wyoming", "Utah"))
table(WYUTplots$STATECD)
## Get survey table for Wyoming
WYsurvey <- DBgetCSV("SURVEY", "Wyoming")
WYsurvey
```
### `DBqryCSV()`
The `DBqryCSV` function queries a table from FIA's online publicly-available DataMart. The tables in the query must be specified in the sqltables parameter.
#### Example: Multiple Uses
View Example
#### DBqryCSV()
```{r}
# Get number of plots by inventory year for the state of Wyoming
sql1 <- "SELECT INVYR, COUNT(*) AS NBRPLOTS
FROM PLOT
WHERE statecd = 56
GROUP BY INVYR"
nplots1 <- DBqryCSV(sql = sql1,
states = "Wyoming",
sqltables = "PLOT")
head(nplots1)
# Get number of plots by inventory year for Vermont and New Hampshire
sql2 <- "SELECT STATECD, INVYR, COUNT(*) NBRPLOTS
FROM PLOT
WHERE statecd IN(50,33)
GROUP BY STATECD, INVYR"
nplots2 <- DBqryCSV(sql = sql2,
states = c("Vermont", "New Hampshire"),
sqltables = "PLOT")
head(nplots2)
# Get number of plots by inventory year for Iowa (stcd=19) that have silver maple (SPCD=317)
sql3 <- "SELECT p.STATECD, p.INVYR, COUNT(*) NBRPLOTS
FROM PLOT p
JOIN TREE t ON p.CN = t.PLT_CN
WHERE p.statecd = 19 AND t.SPCD = 317
GROUP BY p.STATECD, p.INVYR"
nplots3 <- DBqryCSV(sql = sql3,
states = "IOWA",
sqltables = c("PLOT", "TREE"))
head(nplots3)
```
### `DBgetEvalid()`
The `DBgetEvalid` function gets information for FIA Evaluations.
#### Example 1: Get most current evalid and inventory years for Wyoming
View Example
```{r}
WYeval <- DBgetEvalid(states = "Wyoming",
evalCur = TRUE)
names(WYeval)
WYeval$evalidlist
WYeval$invyrs
WYeval$invyrtab
WYeval$invtype
```
#### Example 2: Get most current evaluations for New York for VOL and GRM evalTypes
View Example
```{r}
NYeval <- DBgetEvalid(states = c("New York"),
evalType = c("VOL", "GRM"),
evalCur = TRUE)
names(NYeval)
NYeval$evalidlist
NYeval$evalTypelist
```
### `DBgetXY()`
The `DBgetXY` function queries XY public coordinate data from FIA' online publicly-available DataMart or SQLite database.
#### Example1: Get xy data for the state of Wyoming for the most current evaluation
View Example
#### DBgetXY()
```{r}
xydat1 <- DBgetXY(states = "Wyoming",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE))
names(xydat1)
head(xydat1$xyCur_PUBLIC)
```
#### Example 2: Add a variable in plot table (PLOT_STATUS_CD) and output as a spatial object
View Example
```{r}
xydat2 <- DBgetXY(states = "Wyoming",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE),
pvars2keep = c("PLOT_STATUS_CD"),
issp = TRUE)
spxy2 <- xydat2$spxy
## Display points with by PLOT_STATUS_CD (1-light blue; 2-brown; 3-blue)
spxy2$color <- ifelse(spxy2$PLOT_STATUS_CD == 2, "brown",
ifelse(spxy2$PLOT_STATUS_CD == 3, "blue", "light blue"))
plot(sf::st_geometry(spxy2['PLOT_STATUS_CD']), pch = 16, cex = .5,
col = spxy2$color)
```
#### Example 3: Get XY data for Wyoming, inventory years 2015 to 2019
View Example
```{r}
xydat3 <- DBgetXY(states = "Vermont",
datsource = "datamart",
eval = "custom",
eval_opts = eval_options(invyrs = 2017:2019),
issp = TRUE)
spxy3 <- xydat3$spxy
## Display points
plot(sf::st_geometry(spxy3), pch = 16, cex = .5, col="grey")
## Now only include P2 plots only (intensity1 = TRUE)
xydat3b <- DBgetXY(states = "Vermont",
datsource = "datamart",
eval = "custom",
eval_opts = eval_options(invyrs = 2017:2019),
intensity1 = TRUE,
issp = TRUE)
spxy3b <- xydat3b$spxy
## Display points
plot(sf::st_geometry(spxy3b), pch = 16, cex = .5)
```
### `DBgetPlots()`
The `DBgetPlots` function extracts plot-level data from FIA's online DataMart or SQLite database.
#### Example 1: Get data for Rhode Island, most current FIA Evaluation, all plots.
View Example
#### DBgetPlots()
```{r}
dat1 <- DBgetPlots(states = "Rhode Island",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
Type = "ALL"),
issp = TRUE)
names(dat1)
plt1 <- dat1$tabs$plt
spxy1 <- dat1$xyCur_PUBLIC
table(plt1$INVYR)
# Display spatial output
plot(sf::st_geometry(spxy1), pch = 16, cex = .5)
```
``` {r}
# Add a filter to include only plots with Northern red oak forest type (FORTYPCD == 505)
# Note: *allFilter* filters for plots and/or conditions for all states specified.
dat1b <- DBgetPlots(states = "Rhode Island",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
Type = "ALL"),
issp = TRUE,
allFilter = "FORTYPCD == 505")
names(dat1b)
spxy1b <- dat1b$xyCur_PUBLIC
dim(spxy1b)
# Display spatial output
plot(sf::st_geometry(spxy1b), pch = 16, cex = .5, col="darkgreen")
```
#### Example 3: Get data for Delaware, most current FIA Evaluation, include plotgeom data and subplot tables
View Example
```{r}
dat2 <- DBgetPlots(states = "Delaware",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
Type = "ALL"),
issubp = TRUE,
addplotgeom = TRUE)
names(dat2)
tabs2 <- dat2$tabs
plt2 <- tabs2$plt
## subplot and subp_cond tables are added to tabs list
names(tabs2)
## PLOTGEOM data are appended to plt table (e.g., ALP_ADFORCD, FVS_VARIANT)
head(plt2)
```
#### Example 3: Get data for Delaware, most current FIA Evaluation, include pop tables
View Example
```{r}
dat3 <- DBgetPlots(states = "Delaware",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
Type = "ALL"),
savePOP = TRUE,
othertables = c("POP_STRATUM", "POP_ESTN_UNIT"))
## savePOP = TRUE, saves the POP_PLOT_STRATUM_ASSGN table used to select plots
names(dat3)
## pop_stratum and pop_estn_unit tables are added to tabs list
tabs3 <- dat3$tabs
names(tabs3)
```
#### Example 4: Export plot-level data to a CSV file
View Example
```{r}
DBgetPlots(states = "Rhode Island",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
Type = "ALL"),
returndata = FALSE,
savedata = TRUE,
savedata_opts = savedata_options(outfolder = outfolder,
out_fmt = "csv",
overwrite_layer = TRUE))
## Read in data from outfolder
plt <- read.csv(file.path(outfolder, "plot.csv"), stringsAsFactors=FALSE)
head(plt)
```
#### Example 5: Most current evaluation for multiple evalTypes ('ALL', 'VOL', 'GRM')
View Example
```{r}
dat5 <- DBgetPlots(states = "Rhode Island",
datsource = "datamart",
eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
Type = c("VOL", "CHNG", "P2VEG")))
names(dat5)
tabs5 <- dat5$tabs
names(tabs5)
ppsa5 <- dat5$pop_plot_stratum_assgn
table(ppsa5$EVALID)
```
#### Example 6: Get data for a set of evalids
View Example
```{r}
dat6 <- DBgetPlots(eval = "FIA",
eval_opts = eval_options(Cur = TRUE,
evalid = c(101800, 101801, 101803)))
names(dat6)
tabs6 <- dat6$tabs
names(tabs6)
ppsa6 <- dat6$pop_plot_stratum_assgn
table(ppsa6$EVALID)
```
#### Example 7: Get data by Endyr
View Example
```{r}
dat7 <- DBgetPlots(states = c("Connecticut"),
eval = "FIA",
eval_opts = eval_options(evalType = "ALL",
Endyr = 2017))
names(dat7)
tabs7 <- dat7$tabs
names(tabs7)
ppsa7 <- dat7$pop_plot_stratum_assgn
table(ppsa7$EVALID)
```
#### Example 8: Get data for multiple inventory years
View Example
```{r}
dat8 <- DBgetPlots(states = "Vermont",
eval = "custom",
eval_opts = eval_options(invyrs = 2012:2014,
evalType = "ALL"))
names(dat8)
tabs8 <- dat8$tabs
names(tabs8)
plt8 <- tabs8$plt
table(plt8$INVYR)
```
#### Example 9: Get data for periodic inventory
View Example
```{r}
dat9 <- DBgetPlots(states = "Wyoming",
invtype = "PERIODIC",
eval = "FIA",
eval_opts = list(Cur = TRUE,
evalType = "VOL"))
names(dat9)
tabs9 <- dat9$tabs
names(tabs9)
plt9 <- tabs9$plt
table(plt9$STATECD, plt9$INVYR)
```
#### Example 10: Intensity
View Example
The objective of this section is to understand the differences when using INTENSITY=1.
```{r}
## With only P2 plots (intensity1 = TRUE)
dat10 <- DBgetPlots(states = "Vermont",
eval = "FIA",
eval_opts = list(Cur = TRUE,
Type = "ALL"),
intensity1 = TRUE,
issp = TRUE)
tabs10 <- dat10$tabs
plt10 <- tabs10$plt
table(plt10$INVYR)
spxy10 <- dat10$xyCur_PUBLIC
# Display spatial output of public coordinates
plot(sf::st_geometry(spxy10), pch = 16, cex = .5)
```
### `DBgetStrata()`
The `DBgetStrata` function queries the FIA database for post-stratification information.
#### Example1: Get strata for the most current evaluation for Wyoming
View Example
#### DBgetStrata()
```{r}
strat1 <- DBgetStrata(states = "Wyoming",
eval_opts = eval_options(Cur = TRUE))
names(strat1)
## Look at plot assign data
pltassgn1 <- strat1$pltassgn
head(pltassgn1)
unique(pltassgn1$EVALID)
strat1$evalid
## Look at area data for estimation unit
strat1$unitarea
strat1$unitvar
strat1$unitvar2
strat1$areavar
## Look at stratification data for estimation unit
strat1$stratalut
strat1$strvar
strat1$getwtvar
```
#### Example 2: Get strata information for a specific evaluation for Wyoming
View Example
```{r}
strat2 <- DBgetStrata(eval_opts = eval_options(evalid = 561200))
unique(strat2$pltassgn$EVALID)
strat2$evalid
```
#### Example 3: Get strata information for Wyoming, evaluation ending in 2014
View Example
```{r}
strat3 <- DBgetStrata(states = "Wyoming",
eval_opts = eval_options(Endyr = 2014))
unique(strat3$pltassgn$EVALID)
strat3$evalid
```