Help for package S4DM

Title:

Small Sample Size Species Distribution Modeling

Version:

0.0.1

Description:

Implements a set of distribution modeling methods that are suited to species with small sample sizes (e.g., poorly sampled species or rare species). While these methods can also be used on well-sampled taxa, they are united by the fact that they can be utilized with relatively few data points. More details on the currently implemented methodologies can be found in Drake and Richards (2018) <doi:10.1002/ecs2.2373>, Drake (2015) <doi:10.1098/rsif.2015.0086>, and Drake (2014) <doi:10.1890/ES13-00202.1>.

Depends:

R (≥ 3.5.0)

License:

MIT + file LICENSE

Encoding:

UTF-8

LazyData:

true

VignetteBuilder:

knitr

RoxygenNote:

7.3.2

Imports:

corpcor, densratio, flexclust, geometry, kernlab, maxnet, mvtnorm, np, pROC, robust, rvinecopulib, sf, terra, dplyr, Rdpack

Suggests:

geodata, BIEN, ggplot2, tidyterra, knitr, testthat, rmarkdown

RdMacros:

Rdpack

NeedsCompilation:

Packaged:

2025-01-10 00:20:14 UTC; Brian Maitner

Author:

Brian S. Maitner [aut, cre], Robert L. Richards [aut], Ben S. Carlson [aut], John M. Drake [aut], Cory Merow [aut]

Maintainer:

Brian S. Maitner <bmaitner@usf.edu>

Repository:

CRAN

Date/Publication:

2025-01-10 21:00:02 UTC

Return scaled variables to the original scale using means and SDs

Description

A small helper function that returns scaled data to the original scale using vectors of means and SDs.

Usage

descale_w_objects(data, mean_vector, sd_vector)

Arguments

data

dataframe or matrix for rescaling

mean_vector

vector of means to use for rescaling. Should be one value for each column in the data

sd_vector

vector of sds to use for rescaling. Should be one value for each column in the data

Author(s)

Brian Maitner
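
As a rough illustration of what descaling amounts to, each column is multiplied by its SD and then shifted by its mean. The following is a minimal base-R sketch, not the package's code; `descale_sketch` is a hypothetical name used only here.

```r
# Hypothetical helper illustrating the idea; not the S4DM implementation.
descale_sketch <- function(data, mean_vector, sd_vector) {
  data <- as.matrix(data)
  # multiply each column by its SD, then add back its mean
  out <- sweep(data, 2, sd_vector, "*")
  sweep(out, 2, mean_vector, "+")
}

raw <- cbind(temp = c(10, 20, 30), precip = c(100, 400, 700))
scaled <- scale(raw)  # base R stores the means and SDs as attributes
restored <- descale_sketch(scaled,
                           mean_vector = attr(scaled, "scaled:center"),
                           sd_vector = attr(scaled, "scaled:scale"))
all.equal(restored, raw, check.attributes = FALSE)
```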

Density-ratio SDM estimation with Maxnet

Description

dr_maxnet is an internal function for density-ratio estimation with Maxnet

Usage

dr_maxnet(
  presence_data = NULL,
  background_data = NULL,
  projection_data = NULL,
  formula = NULL,
  regmult = 1,
  regfun = maxnet.default.regularization,
  addsamplestobackground = TRUE,
  clamp = TRUE,
  verbose = FALSE,
  method,
  type = c("link", "exponential", "cloglog", "logistic"),
  object = NULL
)

Arguments

presence_data

dataframe of covariates

background_data

dataframe of covariates

projection_data

dataframe of covariates

formula

Maxnet formula to use. Default (NULL) will use the Maxnet default. This parameter is called "f" in the maxnet function, but is renamed here as using "t" and "f" as object names is frowned upon.

regmult

Maxnet regularization multiplier. Default is 1.

regfun

Maxnet regularization function. Default is the Maxnet default.

addsamplestobackground

If TRUE (the default), any presences that aren't in the background will be added.

clamp

If TRUE (the default), predictions will be limited to ranges seen in the training dataset.

method

one of either "fit" or "predict"

type

Type of response required. One of "link", "exponential", "cloglog", or "logistic".

object

fitted object returned by a dr_... function. Only needed when method = "predict"

Note

The options formula, regmult, regfun, and addsamplestobackground are only used when method == "fit"; the options clamp and type are only used when method == "predict". See the maxnet documentation for more details.

Density-ratio SDM estimation with RuLSIF

Description

dr_rulsif is an internal function for density-ratio estimation with RuLSIF (Kanamori et al. 2009; Yamada et al. 2013).
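
For orientation, the quantity targeted by RuLSIF in Yamada et al. (2013) is the alpha-relative density ratio. Writing p for the presence density and q for the background density (this mapping onto presence and background follows the argument names and is an assumption of this note), it can be written as:

```latex
r_{\alpha}(x) = \frac{p(x)}{\alpha\, p(x) + (1 - \alpha)\, q(x)}, \qquad 0 \le \alpha < 1
```

With alpha = 0 this reduces to the ordinary ratio p(x)/q(x); for alpha > 0 the ratio is bounded above by 1/alpha.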

Usage

dr_rulsif(
  presence_data = NULL,
  background_data = NULL,
  projection_data = NULL,
  sigma = 10^seq(-3, 1, length.out = 9),
  lambda = 10^seq(-3, 1, length.out = 9),
  alpha = 0.1,
  kernel_num = 100,
  verbose = FALSE,
  method,
  object = NULL
)

Arguments

presence_data

dataframe of covariates

background_data

dataframe of covariates

projection_data

dataframe of covariates

sigma

Sigma parameter for RuLSIF. Default is the RuLSIF default.

lambda

Lambda parameter for RuLSIF. Default is the RuLSIF default.

alpha

Relative parameter. Defaults to RuLSIF default.

kernel_num

kernel_number for RuLSIF. Default is the RuLSIF default.

method

one of either "fit" or "predict"

object

fitted object returned by a dr_... function. Only needed when method = "predict"

References

Kanamori T, Hido S, Sugiyama M (2009). “A least-squares approach to direct importance estimation.” J. Mach. Learn. Res., 10, 1391–1445. https://www.jmlr.org/papers/volume10/kanamori09a/kanamori09a.pdf.

Yamada M, Suzuki T, Kanamori T, Hachiya H, Sugiyama M (2013). “Relative Density-Ratio Estimation for Robust Distribution Comparison.” Neural Computation, 25(5), 1324–1370. http://dx.doi.org/10.1162/neco_a_00442.

Density-ratio SDM estimation with uLSIF

Description

dr_ulsif is an internal function for density-ratio estimation with uLSIF (Kanamori et al. 2009).

Usage

dr_ulsif(
  presence_data = NULL,
  background_data = NULL,
  projection_data = NULL,
  sigma = 10^seq(-3, 1, length.out = 9),
  lambda = 10^seq(-3, 1, length.out = 9),
  kernel_num = 100,
  verbose = FALSE,
  method,
  object = NULL
)

Arguments

presence_data

dataframe of covariates

background_data

dataframe of covariates

projection_data

dataframe of covariates

sigma

Sigma parameter for uLSIF. Default is the uLSIF default.

lambda

Lambda parameter for uLSIF. Default is the uLSIF default.

kernel_num

kernel_number for uLSIF. Default is the uLSIF default.

method

one of either "fit" or "predict"

object

fitted object returned by a dr_... function. Only needed when method = "predict"

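References

Kanamori T, Hido S, Sugiyama M (2009). “A least-squares approach to direct importance estimation.” J. Mach. Learn. Res., 10, 1391–1445. https://www.jmlr.org/papers/volume10/kanamori09a/kanamori09a.pdf.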

Generate ensemble predictions from S4DM range maps

Description

This function evaluates model quality using 5-fold, spatially stratified cross-validation and creates an ensemble of the model outputs.

Usage

ensemble_range_map(
  occurrences,
  env,
  method = NULL,
  presence_method = NULL,
  background_method = NULL,
  bootstrap = "none",
  bootstrap_reps = 100,
  quantile = 0.05,
  constraint_regions = NULL,
  background_buffer_width = NULL,
  ...
)

Arguments

occurrences

Presence coordinates in long,lat format.

env

Environmental SpatRaster(s)

method

Optional. If supplied, both presence and background density estimation will use this method.

presence_method

Optional. Method for estimation of presence density.

background_method

Optional. Method for estimation of background density.

bootstrap

Character. One of "none" (the default, no bootstrapping), "numbag" (presence function is bootstrapped), or "doublebag" (presence and background functions are bootstrapped).

bootstrap_reps

Integer. Number of bootstrap replicates to use (default is 100)

quantile

Quantile to use for thresholding. Default is 0.05 (5 pct training presence). Set to 0 for minimum training presence (MTP).

constraint_regions

See get_env_bg documentation

background_buffer_width

Numeric or NULL. Width (meters or map units) of buffer to use to select background environment. If NULL, uses max dist between nearest occurrences.

...

Additional parameters passed to internal functions.

Details

Current plug-and-play methods include: "gaussian", "kde", "vine", "rangebagging", "lobagoc", and "none". Current density ratio methods include: "ulsif" and "rulsif".

Value

List object containing (1) a SpatRaster ensemble layer showing the proportion of maps that are included in the range across the ensemble, (2) SpatRasters for the individual models, and (3) model quality information.
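
The ensemble layer in element (1) is a cell-wise proportion. As a toy illustration in base R, with binary maps stored as plain matrices rather than SpatRasters (the three maps below are invented for this sketch):

```r
# Three invented binary range maps on the same 2 x 3 grid (1 = in range)
map_a <- matrix(c(1, 1, 0, 0, 1, 0), nrow = 2)
map_b <- matrix(c(1, 0, 0, 0, 1, 1), nrow = 2)
map_c <- matrix(c(1, 1, 0, 0, 0, 0), nrow = 2)

# Proportion of maps that include each cell in the range
ensemble <- (map_a + map_b + map_c) / 3
ensemble
```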

Note

Either method or both presence_method and background_method must be supplied.

Examples




# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

ensemble <- ensemble_range_map(occurrences = occurrences,
                               env = env,
                               method = NULL,
                               presence_method = c("gaussian", "kde"),
                               background_method = "gaussian",
                               quantile = 0.05,
                               background_buffer_width = 100000  )

Evaluate S4DM range map quality

Description

This function uses 5-fold, spatially stratified, cross-validation to evaluate distribution model quality.

Usage

evaluate_range_map(
  occurrences,
  env,
  method = NULL,
  presence_method = NULL,
  background_method = NULL,
  bootstrap = "none",
  bootstrap_reps = 100,
  quantile = 0.05,
  constraint_regions = NULL,
  background_buffer_width = NULL,
  standardize_preds = TRUE,
  ...
)

Arguments

occurrences

Presence coordinates in long,lat format.

env

Environmental SpatRaster(s)

method

Optional. If supplied, both presence and background density estimation will use this method.

presence_method

Optional. Method for estimation of presence density.

background_method

Optional. Method for estimation of background density.

bootstrap

Character. One of "none" (the default, no bootstrapping), "numbag" (presence function is bootstrapped), or "doublebag" (presence and background functions are bootstrapped).

bootstrap_reps

Integer. Number of bootstrap replicates to use (default is 100)

quantile

Quantile to use for thresholding. Default is 0.05 (5 pct training presence). Set to 0 for minimum training presence (MTP).

constraint_regions

See get_env_bg documentation

background_buffer_width

Numeric or NULL. Width (meters or map units) of buffer to use to select background environment. If NULL, uses max dist between nearest occurrences.

standardize_preds

Logical. Should environmental layers be scaled? Default is TRUE.

...

Additional parameters passed to internal functions.

Details

Current plug-and-play methods include: "gaussian", "kde", "vine", "rangebagging", "lobagoc", and "none". Current density ratio methods include: "ulsif" and "rulsif".

Value

A list containing 1) a data.frame containing cross-validated model performance statistics (fold_results), and 2) a data.frame containing model performance statistics evaluated on the full dataset (overall_results).

Note

Either method or both presence_method and background_method must be supplied.

Examples

{

# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

# Evaluate a gaussian/gaussian model calculated with the numbag approach
# using 10 bootstrap replicates.

 evaluate_range_map(occurrences = occurrences,
                    env = env,
                    method = NULL,
                    presence_method = "gaussian",
                    background_method = "gaussian",
                    bootstrap = "numbag",
                    bootstrap_reps = 10,
                    quantile = 0.05,
                    constraint_regions = NULL,
                    background_buffer_width = 100000)



}

Fit density-ratio distribution models in a plug-and-play framework.

Description

This function fits density-ratio species distribution models for the specified density-ratio method (Drake and Richards 2018).

Usage

fit_density_ratio(presence = NULL, background = NULL, method = NULL, ...)

Arguments

presence

dataframe of covariates at presence points

background

Dataframe of covariates at background points

method

Character. See "Details" for options.

...

Additional parameters passed to internal functions.

Details

Current methods include: "ulsif", "rulsif", "maxnet"

Value

List of class "dr_model" containing model objects and metadata needed for projecting the fitted models.

References

Drake JM, Richards RL (2018). “Estimating environmental suitability.” Ecosphere, 9(9), e02373. https://onlinelibrary.wiley.com/doi/10.1002/ecs2.2373.

Examples



# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

 # Get presence environmental data

  pres_env <- get_env_pres(coords = occurrences,
                           env = env)

# Get background environmental data

 bg_env <- get_env_bg(coords = occurrences,
                      env = env, width = 100000)


# Note that the functions to get the environmental data return lists,
# and only the "env" element of these is used in the fit function

rulsif_fit <- fit_density_ratio(presence = pres_env$env,
                               background = bg_env$env,
                               method = "rulsif")

Fit presence-background distribution models in a plug-and-play framework.

Description

This function fits presence-background species distribution models for the specified plug-and-play methods (Drake and Richards 2018; Drake 2015).
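
The plug-and-play idea can be illustrated in one dimension: estimate a density for a covariate at presence points and another at background points, then take their ratio as a relative suitability. The following is a minimal base-R sketch with Gaussian densities; the simulated data and the name `pnp_ratio_sketch` are made up for illustration, and this is not the package's internal code.

```r
set.seed(1)
presence_temp   <- rnorm(15, mean = 18, sd = 2)   # covariate at presences (simulated)
background_temp <- rnorm(500, mean = 12, sd = 6)  # covariate at background (simulated)

# Hypothetical helper: ratio of two fitted Gaussian densities
pnp_ratio_sketch <- function(x, presence, background) {
  f_pres <- dnorm(x, mean = mean(presence), sd = sd(presence))
  f_bg   <- dnorm(x, mean = mean(background), sd = sd(background))
  f_pres / f_bg
}

# Relative suitability at a cold site (5) and a warm site (18)
ratio <- pnp_ratio_sketch(c(5, 18), presence_temp, background_temp)
```

Any density estimator could replace `dnorm` here, which is the sense in which the presence and background methods are interchangeable.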

Usage

fit_plug_and_play(
  presence = NULL,
  background = NULL,
  method = NULL,
  presence_method = NULL,
  background_method = NULL,
  bootstrap = "none",
  bootstrap_reps = 100,
  ...
)

Arguments

presence

dataframe of covariates at presence points

background

Optional. Dataframe of covariates at background points

method

Optional. If supplied, both presence and background density estimation will use this method.

presence_method

Optional. Method for estimation of presence density.

background_method

Optional. Method for estimation of background density.

bootstrap

Character. One of "none" (the default, no bootstrapping), "numbag" (presence function is bootstrapped), or "doublebag" (presence and background functions are bootstrapped).

bootstrap_reps

Integer. Number of bootstrap replicates to use (default is 100)

...

Additional parameters passed to internal functions.

Details

Current methods include: "gaussian", "kde", "vine", "rangebagging", "lobagoc", and "none".

Value

List of class "pnp_model" containing model objects and metadata needed for projecting the fitted models.

Note

Either method or both presence_method and background_method must be supplied.
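
The difference between the two bootstrap options can be sketched in base R. In this illustration "numbag" resamples only the presence data (the numerator density), while "doublebag" resamples both presence and background data. Averaging the replicate ratios is an assumption of this sketch, and `bagged_ratio_sketch` is a hypothetical name, so read it as one interpretation of the argument description rather than the package's implementation.

```r
# Hypothetical helper; Gaussian densities stand in for any plug-and-play method.
bagged_ratio_sketch <- function(x, presence, background,
                                bootstrap = c("none", "numbag", "doublebag"),
                                bootstrap_reps = 100) {
  bootstrap <- match.arg(bootstrap)
  ratio <- function(pres, bg) {
    dnorm(x, mean(pres), sd(pres)) / dnorm(x, mean(bg), sd(bg))
  }
  if (bootstrap == "none") return(ratio(presence, background))
  reps <- replicate(bootstrap_reps, {
    pres_b <- sample(presence, replace = TRUE)  # presences are always resampled
    bg_b <- if (bootstrap == "doublebag") {
      sample(background, replace = TRUE)        # background resampled as well
    } else {
      background
    }
    ratio(pres_b, bg_b)
  })
  # replicate() returns a vector when x has length 1, otherwise a matrix
  if (is.matrix(reps)) rowMeans(reps) else mean(reps)
}

set.seed(1)
pres <- rnorm(15, 18, 2)
bg <- rnorm(500, 12, 6)
numbag <- bagged_ratio_sketch(c(5, 18), pres, bg, "numbag", bootstrap_reps = 10)
doublebag <- bagged_ratio_sketch(c(5, 18), pres, bg, "doublebag", bootstrap_reps = 10)
```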

References

Drake JM (2015). “Range bagging: a new method for ecological niche modelling from presence-only data.” J. R. Soc. Interface, 12(107). http://dx.doi.org/10.1098/rsif.2015.0086.

Drake JM, Richards RL (2018). “Estimating environmental suitability.” Ecosphere, 9(9), e02373. https://onlinelibrary.wiley.com/doi/10.1002/ecs2.2373.

Examples



# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

 # Get presence environmental data

  pres_env <- get_env_pres(coords = occurrences,
                           env = env)

# Get background environmental data

 bg_env <- get_env_bg(coords = occurrences,
                      env = env, width = 100000)


# Note that the functions to get the environmental data return lists,
# and only the "env" element of these is used in the fit function

  kde_fit <- fit_plug_and_play(presence = pres_env$env,
                               background = bg_env$env,
                               method = "kde")

Extract background data for SDM fitting.

Description

This function extracts background data around known presence records.

Usage

get_env_bg(
  coords,
  env,
  method = "buffer",
  width = NULL,
  constraint_regions = NULL,
  standardize = TRUE
)

Arguments

coords

Coordinates (long,lat) to extract values for

env

Environmental SpatRaster(s) in any projection

method

Methods for getting bg points. Current option is buffer

width

Numeric or NULL. Width (meters or map units) of buffer. If NULL, uses max dist between nearest occurrences.

constraint_regions

An optional spatialpolygons* object that can be used to limit the selection of background points.

standardize

Logical. If TRUE, the variables will be scaled and centered

Value

A list containing 1) the background data (env), 2) the cell indices for which the background was taken (bg_cells, as used in the sdm_threshold example), 3) the environmental means (env_mean; NA if standardization not done), and 4) the environmental standard deviations (env_sds; NA if standardization not done).

Note

If supplying constraint_regions, any polygons in which the occurrences fall are considered fair game for background selection. This background selection is, however, still limited by the buffer as well.
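
To make the buffer logic concrete, here is a small base-R sketch on planar coordinates. A cell counts as background if it lies within `width` of any occurrence; when `width` is NULL the sketch uses the largest nearest-neighbour distance among occurrences, which is one reading of "max dist between nearest occurrences". The function name, the toy grid, and the use of Euclidean distance are all assumptions; the package itself works on SpatRaster objects.

```r
# Hypothetical helper on planar (projected) coordinates; not the S4DM code.
buffer_cells_sketch <- function(cell_xy, occ_xy, width = NULL) {
  if (is.null(width)) {
    d_occ <- as.matrix(dist(occ_xy))
    diag(d_occ) <- Inf
    # largest of the nearest-neighbour distances between occurrences
    width <- max(apply(d_occ, 1, min))
  }
  # distance from every cell (rows) to every occurrence (columns)
  d <- sqrt(outer(cell_xy[, 1], occ_xy[, 1], "-")^2 +
            outer(cell_xy[, 2], occ_xy[, 2], "-")^2)
  which(apply(d, 1, min) <= width)
}

cells <- as.matrix(expand.grid(x = 0:4, y = 0:4))  # 25 cell centres
occ <- rbind(c(0, 0), c(1, 0), c(4, 4))
bg_default <- buffer_cells_sketch(cells, occ)            # width estimated from data
bg_narrow <- buffer_cells_sketch(cells, occ, width = 1)  # user-supplied width
```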

Examples

{

# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

bg_data <- get_env_bg(coords = occurrences,
                      env = env,
                      method = "buffer",
                      width = 100000)


}

Extract presence data for SDM fitting.

Description

This function extracts presence data at known presence records.

Usage

get_env_pres(coords, env, env_bg = NULL)

Arguments

coords

Coordinates (long,lat) to extract values for

env

Environmental SpatRaster(s) in any projection

env_bg

Background data produced by get_env_bg, used for re-scaling

Value

A list containing 1) the environmental data at the presence locations (env), and 2) an sf data.frame containing the occurrence records (occurrence_sf).

Examples

 {

# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

env_pres <- get_env_pres(coords = occurrences,
                        env = env)

}

Internal function for getting available function names.

Description

This function checks the available functions in the package to extract current options for dr and pnp fitting

Usage

get_functions(type = "pnp")

Arguments

type

Type of function to get. Options are "pnp" for presence/background functions and "dr" for ratio functions.

Generate Response Curves

Description

Given an environmental data set, fitted models, and a directory to output plots, this function generates response curves for each predictor in the model. The response curves depict the predicted change in probability of presence as a function of the environmental predictor while holding all other predictors constant at their mean values.

Usage

get_response_curves(
  env_bg,
  env_pres,
  pnp_model,
  n.int = 1000,
  envMeans = NULL,
  envSDs = NULL
)

Arguments

env_bg

Object returned by get_env_bg

env_pres

Object returned by get_env_pres

pnp_model

Object returned by fit_plug_and_play or fit_density_ratio

n.int

Number of points along which to calculate the response curve

envMeans

A vector of means for each environmental predictor in the dataset. (not used)

envSDs

A vector of standard deviations for each environmental predictor in the dataset. (not used)

Value

This function generates a set of marginal predictions for each environmental variable, holding other variables constant
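
The marginal-prediction idea can be sketched in base R as follows: for each predictor, build a grid across its observed range, hold every other predictor at its mean, and predict along the grid. The names `response_curve_sketch` and `predict_fun` are hypothetical, and the toy model is invented; the package's own function takes the objects listed under Arguments instead.

```r
# Hypothetical sketch; `predict_fun` stands in for any model projection.
response_curve_sketch <- function(env, predict_fun, n.int = 1000) {
  env <- as.matrix(env)
  env_means <- colMeans(env)
  lapply(seq_len(ncol(env)), function(j) {
    grid <- seq(min(env[, j]), max(env[, j]), length.out = n.int)
    # hold every predictor at its mean, then vary only predictor j
    newdata <- matrix(env_means, nrow = n.int, ncol = ncol(env),
                      byrow = TRUE, dimnames = list(NULL, colnames(env)))
    newdata[, j] <- grid
    data.frame(variable = colnames(env)[j], value = grid,
               prediction = predict_fun(newdata))
  })
}

env <- cbind(temp = c(5, 10, 15, 20), precip = c(200, 400, 600, 800))
toy_model <- function(x) dnorm(x[, "temp"], mean = 15, sd = 3)  # invented model
curves <- response_curve_sketch(env, toy_model, n.int = 50)
```

Because the toy model ignores precipitation, the second curve is flat, which is the expected signature of a predictor with no marginal effect.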

Author(s)

Cory Merow, modified by Brian Maitner

Make a range map using plug-and-play modeling.

Description

This function produces range maps using plug-and-play modeling with either presence-background or density-ratio approaches.

Usage

make_range_map(
  occurrences,
  env,
  method = NULL,
  presence_method = NULL,
  background_method = NULL,
  bootstrap = "none",
  bootstrap_reps = 100,
  quantile = 0.05,
  background_buffer_width = NULL,
  constraint_regions = NULL,
  verbose = FALSE,
  standardize_preds = TRUE,
  ...
)

Arguments

occurrences

Presence coordinates in long,lat format.

env

Environmental rasters

method

Optional. If supplied, both presence and background density estimation will use this method.

presence_method

Optional. Method for estimation of presence density.

background_method

Optional. Method for estimation of background density.

bootstrap

Character. One of "none" (the default, no bootstrapping), "numbag" (presence function is bootstrapped), or "doublebag" (presence and background functions are bootstrapped).

bootstrap_reps

Integer. Number of bootstrap replicates to use (default is 100)

quantile

Quantile to use for thresholding. Default is 0.05 (5 pct training presence). Set to 0 for minimum training presence (MTP), set to NULL to return continuous raster.

background_buffer_width

The width (in m for unprojected rasters and map units for projected rasters) of the buffer to use for background data. Defaults to NULL, which will take the maximum distance between occurrence records.

constraint_regions

See get_env_bg documentation

verbose

Logical. If TRUE, prints progress messages.

standardize_preds

Logical. Should environmental layers be scaled? Default is TRUE.

...

Additional parameters passed to internal functions.

Details

Current plug-and-play methods include: "gaussian", "kde", "vine", "rangebagging", "lobagoc", and "none". Current density ratio methods include: "ulsif", "rulsif", and "maxnet".

Value

A SpatRaster object containing a range map. Maps may be either binary or continuous, depending upon the quantile argument.

Note

Either method or both presence_method and background_method must be supplied.

Examples

{

# load in sample data

 library(S4DM)
 library(terra)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points

 # environmental data
   env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

 # rescale the environmental data

   env <- scale(env)

   map <- make_range_map(occurrences = occurrences,
                         env = env,
                         method = "gaussian",
                         presence_method = NULL,
                         background_method = NULL,
                         bootstrap = "none",
                         bootstrap_reps = 100,
                         quantile = 0.05,
                         background_buffer_width = 100000)

   plot(map)


}

Internal function for fitting gaussian distributions in plug-and-play SDMs.

Description

This function both fits Gaussian distributions and projects those distributions to new covariates.

Usage

pnp_gaussian(data, method, type = "regularized", object = NULL)

Arguments

data

dataframe of covariates

method

one of either "fit" or "predict"

type

one of either "classical", "robust", or "regularized" (the default)

object

fitted object returned by a pnp_... function. Only needed when method = "predict"
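
The internal pnp_ functions share one calling convention: method = "fit" returns a fitted object, and method = "predict" evaluates that object on new data. A minimal base-R sketch of the convention for the classical Gaussian case is shown below. `gaussian_sketch` is a hypothetical name, the multivariate normal density is written out by hand, and whether the package returns densities on this scale is an assumption.

```r
# Hypothetical sketch of the fit/predict convention; classical estimates only.
gaussian_sketch <- function(data, method, object = NULL) {
  data <- as.matrix(data)
  if (method == "fit") {
    return(list(mean = colMeans(data), sigma = stats::cov(data)))
  }
  if (method == "predict") {
    k <- ncol(data)
    centred <- sweep(data, 2, object$mean, "-")
    # squared Mahalanobis distance of each row from the fitted mean
    d2 <- rowSums((centred %*% solve(object$sigma)) * centred)
    # multivariate normal density
    return(exp(-0.5 * d2) / sqrt((2 * pi)^k * det(object$sigma)))
  }
  stop("method must be \"fit\" or \"predict\"")
}

set.seed(1)
pres <- cbind(temp = rnorm(30, 18, 2), precip = rnorm(30, 800, 100))
fit <- gaussian_sketch(pres, method = "fit")
dens <- gaussian_sketch(rbind(c(18, 800), c(30, 1500)),
                        method = "predict", object = fit)
```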

Internal function for fitting KDE distributions in plug-and-play SDMs.

Description

This function both fits Kernel Density Estimation (KDE) distributions and projects those distributions to new covariates.

Usage

pnp_kde(data, method, bwmethod = "normal-reference", object = NULL, ...)

Arguments

data

dataframe of covariates

method

one of either "fit" or "predict"

bwmethod

Bandwidth method to use. One of 'normal-reference' (the default),'cv.ml', or 'cv.ls'

object

fitted object returned by a pnp_... function. Only needed when method = "predict"

Internal function for fitting lobagoc distributions in plug-and-play SDMs.

Description

This function both fits lobagoc distributions (Drake 2014) and projects those distributions to new covariates.

Usage

pnp_lobagoc(data, method, object = NULL, v = 100, nu = 0.01, sigma = NULL)

Arguments

data

dataframe of covariates

method

one of either "fit" or "predict"

object

fitted object returned by a pnp_... function. Only needed when method = "predict"

v

Positive integer. Number of votes to use (default is 100).

nu

Numeric. Tuning parameter for nu-svm

sigma

NULL or Numeric. Tuning parameter of rbf kernel, will estimate if left NULL (default).

Details

For fitting, an object is not required (and will be ignored). For prediction, parameters v, nu, and sigma are not needed and will be ignored.

References

Drake JM (2014). “Ensemble algorithms for ecological niche modeling from presence‐background and presence‐only data.” Ecosphere, 5(6), 1–16. doi:10.1890/es13-00202.1, https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1890/ES13-00202.1.

Internal function for returning empty pnp_estimate class models

Description

This function is used internally to transform presence-background models into presence-only models.

Usage

pnp_none(method, object = NULL, ...)

Arguments

method

one of either "fit" or "predict"

object

fitted object returned by a pnp_... function. Only needed when method = "predict"

Internal function for rangebagging in plug-and-play SDMs.

Description

This function both fits rangebagging models (Drake 2015) and projects those distributions to new covariates.

Usage

pnp_rangebagging(data, method, object = NULL, v = 100, d = 2, p = 0.5)

Arguments

data

dataframe of covariates

method

one of either "fit" or "predict"

object

fitted object returned by a pnp_... function. Only needed when method = "predict"

v

Integer. Number of votes to use in the aggregation, default is 100.

d

Integer. Number of dimensions (i.e. covariates) to use in aggregations, default is 2.

p

Numeric. Fraction of observations (i.e. occurrences) to use in each replicate aggregation. Default is 0.5

Details

For fitting, an object is not required (and will be ignored). For prediction, parameters v, p, and d are not needed and will be ignored.
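
Range bagging (Drake 2015) repeatedly draws a fraction p of the occurrences and d covariates, builds the convex hull of that subset, and scores a new point by the fraction of the v votes in which it falls inside the hull. The base-R sketch below is deliberately simplified: d is fixed at 1 so each hull is just an interval and no geometry package is needed. `rangebag_sketch` is a hypothetical name and this is not the package's code.

```r
# Hypothetical, simplified range bagging with d fixed at 1, so each
# "hull" is just the [min, max] interval of one sampled covariate.
rangebag_sketch <- function(train, newdata, v = 100, p = 0.5) {
  train <- as.matrix(train)
  newdata <- as.matrix(newdata)
  n_sub <- max(2, ceiling(p * nrow(train)))
  votes <- numeric(nrow(newdata))
  for (i in seq_len(v)) {
    rows <- sample(nrow(train), n_sub)   # subset of occurrences
    col <- sample(ncol(train), 1)        # one covariate per vote
    rng <- range(train[rows, col])
    votes <- votes + (newdata[, col] >= rng[1] & newdata[, col] <= rng[2])
  }
  votes / v   # fraction of votes placing each point inside the range
}

set.seed(42)
train <- cbind(temp = rnorm(20, 18, 2), precip = rnorm(20, 800, 100))
newdata <- rbind(core = c(18, 800), outside = c(40, 3000))
scores <- rangebag_sketch(train, newdata, v = 50)
```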

References

Drake JM (2015). “Range bagging: a new method for ecological niche modelling from presence-only data.” J. R. Soc. Interface, 12(107). http://dx.doi.org/10.1098/rsif.2015.0086.

Internal function for fitting vine copula distributions in plug-and-play SDMs.

Description

This function both fits distributions and projects those distributions to new covariates.

Usage

pnp_vine(data, method, object = NULL)

Arguments

data

dataframe of covariates

method

one of either "fit" or "predict"

object

fitted object returned by a pnp_... function. Only needed when method = "predict"

Projects fitted density-ratio distribution models onto new covariates.

Description

This function projects fitted density-ratio species distribution models onto new covariates.

Usage

project_density_ratio(dr_model, data)

Arguments

dr_model

A fitted density ratio model produced by fit_density_ratio

data

covariate data

Value

A vector of relative occurrence rates evaluated at the covariates supplied in the data object.

Projects fitted plug-and-play distribution models onto new covariates.

Description

This function projects fitted plug-and-play species distribution models onto new covariates.

Usage

project_plug_and_play(pnp_model, data)

Arguments

pnp_model

A fitted plug-and-play model produced by fit_plug_and_play

data

covariate data

Value

A vector of relative occurrence rates evaluated at the covariates supplied in the data object.

Note

The tsearchn function underlying rangebagging seems to fail sometimes with very uneven predictors. Rescaling helps.

Rescale a dataset using vectors of means and SDs

Description

A little function to rescale data using vectors of means and sds

Usage

rescale_w_objects(data, mean_vector, sd_vector)

Arguments

data

dataframe or matrix for rescaling

mean_vector

vector of means to use for rescaling. Should be one value for each column in the data

sd_vector

vector of sds to use for rescaling. Should be one value for each column in the data

Author(s)

Brian Maitner
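
A common use of this kind of helper is to place new data on the scale of the training data by reusing the training means and SDs. A minimal base-R sketch follows; `rescale_sketch` is a hypothetical name and the data are invented, so this shows the idea rather than the package's code.

```r
# Hypothetical helper illustrating the idea; not the S4DM implementation.
rescale_sketch <- function(data, mean_vector, sd_vector) {
  data <- as.matrix(data)
  # subtract each column's mean, then divide by its SD
  out <- sweep(data, 2, mean_vector, "-")
  sweep(out, 2, sd_vector, "/")
}

train <- cbind(temp = c(10, 20, 30), precip = c(100, 400, 700))
new_data <- cbind(temp = c(20, 40), precip = c(400, 1000))

# Reuse the training means and SDs so new data share the training scale
rescaled <- rescale_sketch(new_data,
                           mean_vector = colMeans(train),
                           sd_vector = apply(train, 2, sd))
```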

Example S4DM occurrence data

Description

A sample dataset containing occurrence records.

Usage

sample_points

Format

A data.frame with 65 observations of 2 variables:

Longitude: Longitude, in decimal degrees
Latitude: Latitude, in decimal degrees

...

Source

https://biendata.org

Thresholds a continuous relative occurrence rate raster to create a binary raster.

Description

This function thresholds a continuous relative occurrence rate raster to produce a binary presence/absence raster.

Usage

sdm_threshold(
  prediction_raster,
  occurrence_sf,
  quantile = 0.05,
  return_binary = TRUE
)

Arguments

prediction_raster

Raster containing continuous predictions of relative occurrence rate to be thresholded.

occurrence_sf

An sf object containing presence locations. Should be in the projection of the prediction raster

quantile

Numeric between 0 and 1. Quantile to use for thresholding (defaults to 0.05). Set to 0 for minimum training presence.

return_binary

Logical. Should the raster returned be binary (presence/absence)? If FALSE, predicted presences will retain their "suitability" scores.

Value

A SpatRaster object containing a range map. Maps may be either binary or continuous, depending upon the return_binary argument.
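
The thresholding rule can be sketched on plain vectors: take the chosen quantile of the predictions at presence locations as the threshold, then keep cells at or above it. In the base-R sketch below, `threshold_sketch` is a hypothetical name, the values are invented, and setting excluded cells to NA in the non-binary case is an assumption; the package operates on rasters.

```r
# Hypothetical helper on plain vectors; the package operates on rasters.
threshold_sketch <- function(pred, pred_at_presences, quantile = 0.05,
                             return_binary = TRUE) {
  # quantile = 0 gives the minimum training presence (MTP) threshold
  thresh <- stats::quantile(pred_at_presences, probs = quantile, names = FALSE)
  keep <- pred >= thresh
  # assumption: cells below the threshold become NA when not binary
  if (return_binary) as.integer(keep) else ifelse(keep, pred, NA)
}

pred <- c(0.05, 0.20, 0.40, 0.90)    # invented cell predictions
at_presences <- c(0.20, 0.50, 0.90)  # invented predictions at presences
mtp <- threshold_sketch(pred, at_presences, quantile = 0)
```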

Author(s)

Cecina Babich Morrow (modified by Brian Maitner)

Examples

{

# load in sample data

library(S4DM)
library(terra)

# occurrence points
  data("sample_points")
  occurrences <- sample_points

# environmental data
  env <- rast(system.file('ex/sample_env.tif', package="S4DM"))

# rescale the environmental data

  env <- scale(env)

 bg_data <- get_env_bg(coords = occurrences,
                       env = env,
                       method = "buffer",
                       width = 100000)

 pres_data <- get_env_pres(coords = occurrences,
                           env = env)

 pnp_model <- fit_plug_and_play(presence = pres_data$env,
                                background = bg_data$env,
                                method = "gaussian")

 pnp_continuous <- project_plug_and_play(pnp_model = pnp_model,
                                         data = bg_data$env)

 #Make an empty raster to populate
 out_raster <- env[[1]]
 values(out_raster) <- NA

 # use the bg_data for indexing
 out_raster[bg_data$bg_cells] <- pnp_continuous

 plot(out_raster)

 #convert to a binary raster

 out_raster_binary <-
   sdm_threshold(prediction_raster = out_raster,
               occurrence_sf = pres_data$occurrence_sf,
               quantile = 0.05,
               return_binary = TRUE)

 plot(out_raster_binary)

}

Split data for k-fold spatially stratified cross validation

Description

Splitting tool for cross-validation

Usage

stratify_random(occurrence_sf, nfolds = NULL)

Arguments

occurrence_sf

a sf object containing occurrence records

nfolds

number of desired output folds.

Details

See Examples.

Value

Returns a sf dataframe containing fold designation for each point.
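
As a point of comparison with spatial stratification, purely random fold assignment can be sketched in one line of base R. `random_folds_sketch` is a hypothetical name, and balancing fold sizes is an assumption of this sketch rather than documented behaviour.

```r
# Hypothetical sketch of balanced random fold assignment
random_folds_sketch <- function(n, nfolds = 5) {
  sample(rep(seq_len(nfolds), length.out = n))
}

set.seed(3)
folds <- random_folds_sketch(65, nfolds = 5)  # 65 records, as in sample_points
table(folds)
```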

Author(s)

Cory Merow cory.merow@gmail.com

Examples

{

# load in sample data

 library(S4DM)
 library(terra)
 library(sf)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points


 occurrences <- st_as_sf(x = occurrences,coords = c(1,2))


random_folds <- stratify_random(occurrence_sf = occurrences,
                               nfolds = 5)


}

Split data for k-fold spatially stratified cross validation

Description

Splitting tool for cross-validation

Usage

stratify_spatial(occurrence_sf, nfolds = NULL, nsubclusters = NULL)

Arguments

occurrence_sf

a sf object containing occurrence points

nfolds

number of desired output folds. Default value of NULL makes a reasonable guess based on sample size.

nsubclusters

intermediate number of clusters randomly split into nfolds. Default value of NULL makes a reasonable guess based on sample size. If you specify this manually, it should be an integer multiple of nfolds.

Details

See Examples.

Value

Returns a SpatialPoints dataframe with the data.frame containing fold designation for each point.
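
The two-stage idea behind nsubclusters can be sketched in base R: first group points into nsubclusters spatial clusters, then randomly assign whole clusters to nfolds folds so that nearby points stay together. Here `stats::kmeans` stands in for whatever clustering the package uses, and `spatial_folds_sketch` is a hypothetical name; both are assumptions.

```r
# Hypothetical sketch: k-means stands in for the package's clustering.
spatial_folds_sketch <- function(xy, nfolds = 5, nsubclusters = 10) {
  stopifnot(nsubclusters %% nfolds == 0)
  clusters <- stats::kmeans(xy, centers = nsubclusters, iter.max = 50)$cluster
  # randomly assign equal numbers of spatial clusters to each fold
  cluster_fold <- sample(rep(seq_len(nfolds), length.out = nsubclusters))
  cluster_fold[clusters]
}

set.seed(7)
xy <- cbind(lon = runif(60, -100, -90), lat = runif(60, 30, 40))
folds <- spatial_folds_sketch(xy, nfolds = 5, nsubclusters = 10)
table(folds)
```

Requiring nsubclusters to be an integer multiple of nfolds, as the argument description recommends, gives every fold the same number of clusters.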

Author(s)

Cory Merow cory.merow@gmail.com

Examples

{

# load in sample data

 library(S4DM)
 library(terra)
 library(sf)

 # occurrence points
   data("sample_points")
   occurrences <- sample_points


 occurrences <- st_as_sf(x = occurrences,coords = c(1,2))

manual <- stratify_spatial(occurrence_sf = occurrences,nfolds = 5,nsubclusters = 5)
default <- stratify_spatial(occurrence_sf = occurrences)


}