---
title: "bean: an overview"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{bean: an overview}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  out.width = "90%"
)
```

`bean` reduces sampling bias in species occurrence data by thinning it in
**environmental space** rather than in geographic space. The result is a
cleaner training set for species distribution models (SDM / ENM).

The protocol is:

1. **Prepare** raw occurrences with `prepare_bean()`.
2. **Choose a grid resolution** with `find_env_resolution()`, which selects a
   kernel-density bandwidth for each environmental variable.
3. **Thin** occurrences with `thin_env_nd()` (stochastic) or
   `thin_env_center()` (deterministic).
4. **Fit a niche ellipsoid** with `fit_ellipsoid()`.
5. **Predict suitability** with `predict()` on the fitted ellipsoid.

```{r setup}
library(bean)
```

## Quickstart

```{r quickstart}
data(origin_dat_prepared, package = "bean")
env_vars <- c("bio_1", "bio_4", "bio_12", "bio_15")

# 1. Pick an objective grid resolution from the data
res <- find_env_resolution(origin_dat_prepared, env_vars = env_vars)
res

# 2. Thin in environmental space
thinned <- thin_env_nd(
  data            = origin_dat_prepared,
  env_vars        = env_vars,
  grid_resolution = res$suggested_resolution,
  seed            = 1
)
thinned
```

The remaining vignettes walk through each step in detail.
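
To make the idea of thinning in environmental space more concrete, the chunk below is a minimal base-R sketch of grid-based thinning on simulated data. It is not `bean`'s implementation and does not call any `bean` function. The data frame `toy`, the variables `temp` and `prec`, and the `cell_width` values are all hypothetical and chosen only for illustration. The sketch assumes the simplest possible rule: divide each variable into cells of fixed width and keep one randomly chosen occurrence per occupied cell.

```{r env-grid-sketch}
# Hypothetical toy data: 200 occurrences described by two environmental
# variables. The dense cluster around (0, 0) mimics over-sampled conditions.
set.seed(1)
toy <- data.frame(
  temp = c(rnorm(150, mean = 0, sd = 0.3), runif(50, min = -3, max = 3)),
  prec = c(rnorm(150, mean = 0, sd = 0.3), runif(50, min = -3, max = 3))
)

# Assumed cell width for each variable (illustrative values, not derived
# from the data).
cell_width <- c(temp = 0.5, prec = 0.5)

# Label each occurrence with the grid cell it falls into.
cell_id <- paste(
  floor(toy$temp / cell_width[["temp"]]),
  floor(toy$prec / cell_width[["prec"]]),
  sep = "_"
)

# Keep one randomly chosen row index per occupied cell.
# sample.int() is used so that single-row cells are handled correctly.
keep <- tapply(
  seq_len(nrow(toy)),
  cell_id,
  function(i) i[sample.int(length(i), 1)]
)
toy_thinned <- toy[sort(unname(keep)), ]

# Compare the number of occurrences before and after thinning.
nrow(toy)
nrow(toy_thinned)
```

Because the clustered points share a small number of cells, most of them are dropped, while occurrences in sparsely sampled conditions are retained. Larger cell widths thin more aggressively, which is why the choice of grid resolution matters.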