CRAN Task View: Paleontology
Maintainer: William Gearty, Lewis A. Jones, Erin Dillon, Pedro Godoy, Harriet Drage, Christopher Dean, Bruna Farina
Contact: willgearty at gmail.com
Version: 2024-11-27
URL: https://CRAN.R-project.org/view=Paleontology
Source: https://github.com/cran-task-views/Paleontology/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: William Gearty, Lewis A. Jones, Erin Dillon, Pedro Godoy, Harriet Drage, Christopher Dean, Bruna Farina (2024). CRAN Task View: Paleontology. Version 2024-11-27. URL https://CRAN.R-project.org/view=Paleontology.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("Paleontology", coreOnly = TRUE) installs all the core packages or ctv::update.views("Paleontology") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details.
Overview
Computational Paleontology is an emerging field. Paleontologists are increasingly turning to a wide array of complex computational analyses to address various research questions and test hypotheses. Until recently, paleontologists have mostly leveraged resources designed for evolutionary biologists, ecologists, geographers, and data scientists to accomplish such analyses. However, R resources are now being developed that cater to paleontological tasks and datasets.
We have assembled this task view to bring together R packages that are specifically geared towards acquiring, cleaning, visualizing, and/or analyzing various kinds of paleontological and paleontology-adjacent data. We use this venue to showcase the wide variety of R packages available across the paleosciences and to provide a brief overview of each package for a broad audience of R users.
If you have any questions, feel free to reach out to the task view maintainers via e-mail or by opening a GitHub issue in the repository (see the Source link above). General questions may also be directed to the Palaeoverse Google group or the PaleoNet mailing list. Technical questions about a specific package should be directed to the maintainer of that package.
Scope
Packages within the task view fall within one or more of the following broad categories:
- Wrangling paleontological data: packages dedicated to the acquisition, cleaning, manipulation, and/or visualization of paleontological data
- Paleoecology and morphological evolution: packages that are useful for performing paleoecological and morphological analyses
- Paleobiogeography and biodiversity: packages that are useful for performing paleobiogeographical and/or paleobiodiversity analyses
- Phylogenetics: packages that are useful for performing phylogenetic analyses that include paleontological data
- Time series analysis: packages that are useful for performing time series analyses of paleontological data
- Stratigraphy and sedimentology: packages that are useful for acquiring, analyzing, and visualizing stratigraphic or sedimentological data
- Paleoclimate: packages that are useful for acquiring and analyzing paleoclimatic data
Wrangling paleontological data
Acquiring paleontological data
- paleobioDB has functions to query, download, process, and visualize occurrence and taxonomic data from the Paleobiology Database (PBDB).
- neotoma2 can query, download, and manipulate data from the Neotoma Paleoecology Database, which specializes in fossil data holdings at timescales covering the last several decades to the last several million years.
- rgbif can query and download biological and paleontological occurrence data from the Global Biodiversity Information Facility (GBIF).
- ridigbio can query and download biological and paleontological specimen record data from iDigBio.
- sepkoski contains data on the stratigraphic ranges of fossil marine animal genera from Sepkoski’s (2002) published compendium.
- folio contains datasets for teaching quantitative approaches and modeling in archaeology and paleontology.
- chronosphere can download time-stamped versions of various paleontological, paleoenvironmental, and paleoecological databases, including BioDeepTime (Smith et al. 2023), Triton (Fenton et al. 2021), the Paleobiology Database, and the Ancient Reef Traits Database.
Cleaning and/or manipulating paleontological data
- palaeoverse has functionality to support data preparation and exploration for paleobiological analyses with a focus on improving code flow, reproducibility, and accessibility.
- CoordinateCleaner can perform automated flagging of common spatial and temporal errors in biological and paleontological collection data.
- fossilbrush can perform automated detection and resolution of taxonomic and stratigraphic errors in fossil occurrence datasets.
- rgplates can query the GPlates desktop application and web service API to reconstruct past positions of geographic entities (e.g., plates, coastlines, and coordinates) based on user-selected rotation models. The palaeorotate function in palaeoverse can be used to query the GPlates API for fossil occurrences across multiple time intervals.
Visualizing paleontological data
- deeptime extends the functionality of the ggplot2 package to help facilitate the plotting of data over long time intervals. Several functions are available to add highly customizable timescales to a variety of types of visualizations.
- palaeoverse has functions for visualizing occurrence data through time and across space in base R. The axis_geo function can be used to add a timescale to a base R plot.
- GEOmap includes a set of routines for making map projections, topographic maps, perspective plots, and geological maps.
- rphylopic can query and fetch silhouettes of extant and extinct organisms from the PhyloPic database.
Paleoecology and morphological evolution
Paleoenvironmental reconstruction
- rioja implements a number of numerical methods for inferring the value of an environmental variable from a set of species abundances.
- analogue has functions for the prediction of environmental data from species data by fitting Modern Analogue Technique and Weighted Averaging transfer function models.
Users may also find packages in the Environmetrics task view useful for analyzing ecological and environmental data.
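As a rough illustration of the weighted-averaging idea behind transfer functions such as those mentioned above, the base R sketch below estimates each taxon's optimum as the abundance-weighted mean of an environmental variable in a modern training set, then reconstructs that variable for a fossil sample. It is a minimal sketch and not the implementation used by rioja or analogue: the data are made up, the names train_abund, train_env, wa_optima, and wa_reconstruct are hypothetical, and the deshrinking step that full weighted-averaging methods normally apply is omitted.

```r
# Hypothetical modern training set: rows are samples, columns are taxa.
train_abund <- matrix(
  c(8, 2, 0,
    4, 4, 2,
    0, 3, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), c("taxonA", "taxonB", "taxonC"))
)

# Hypothetical environmental variable measured at each modern sample.
train_env <- c(s1 = 5, s2 = 10, s3 = 15)

# Taxon optimum: abundance-weighted mean of the environmental variable.
# train_env is recycled down each column, so each row is weighted by its own value.
wa_optima <- colSums(train_abund * train_env) / colSums(train_abund)

# Reconstruction: abundance-weighted mean of the taxon optima in a sample.
wa_reconstruct <- function(abund, optima) {
  sum(abund * optima) / sum(abund)
}

# Hypothetical fossil sample with equal abundances of two taxa.
fossil_sample <- c(taxonA = 1, taxonB = 1, taxonC = 0)
estimate <- wa_reconstruct(fossil_sample, wa_optima)

print(round(wa_optima, 2))
print(round(estimate, 2))
```

Because the estimate is an average of optima, it is pulled toward the middle of the training gradient, which is why full implementations add a deshrinking regression.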
Quantifying ecological and morphological evolution
- ecospace implements Monte Carlo simulations of ecological diversification models, using a user-specified ecospace (trait space) framework.
- fossil has functions for estimating shared species/beta diversity, species area curves, and geographic distances and areas.
- R-Fossilpol-package has functions for processing and standardizing global paleoecological pollen data.
- Morphoscape implements adaptive landscape methods (first described by Polly et al. 2016) for the integration, analysis and visualization of biological trait data on a phenotypic morphospace.
- RRphylo can be used to estimate variation and shift in the rate of phenotypic evolution with fossil data using phylogenetic ridge regression.
- paleoTS, evoTS, and adePEM have functions for fitting evolutionary models to morphological time series (see Time series analysis for more information).
Also see the Phylogenetics task view for details about studying discrete and continuous morphological evolution in a phylogenetic context.
Paleobiogeography and biodiversity
- Compadre can be used to estimate rates of speciation/origination, extinction, and sampling using Bayesian capture-mark-recapture techniques.
- divDyn has functions to describe the sampling and diversity dynamics of fossil occurrence datasets.
- fossil has functions for estimating species richness (Chao 1 and 2, ACE, ICE, Jackknife) and shared species/beta diversity.
- divvy has functions to conduct spatial subsampling for biogeography and biodiversity studies and calculate common biodiversity and range-size metrics.
- hespdiv has functions to conduct hierarchical spatial sampling and perform analysis and visualization of these samples.
- ppgm can be used to conduct paleophylogeographic modeling of climate niches and species distributions.
- CoordinateCleaner can be used to generate inputs for PyRate, a program in Python that can estimate speciation and extinction rates from incomplete fossil data (Silvestro et al. 2014).
Users may also find packages in the Spatial task view useful for analyzing paleobiogeography.
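To make the richness estimators named above more concrete, here is a minimal base R sketch of the Chao 1 estimator, which extrapolates total richness from the numbers of taxa seen exactly once (singletons) and exactly twice (doubletons). This is an illustration written for this overview, not code from the fossil package; the function name chao1, its bias_corrected argument, and the counts are hypothetical.

```r
# Chao 1 richness estimate from a vector of per-taxon abundance counts.
chao1 <- function(counts, bias_corrected = TRUE) {
  counts <- counts[counts > 0]
  s_obs <- length(counts)   # observed richness
  f1 <- sum(counts == 1)    # number of singletons
  f2 <- sum(counts == 2)    # number of doubletons
  if (bias_corrected || f2 == 0) {
    # Bias-corrected form; also used when there are no doubletons,
    # because the classic form would divide by zero.
    s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
  } else {
    # Classic form.
    s_obs + f1^2 / (2 * f2)
  }
}

# Hypothetical abundance counts for eight taxa in one collection.
counts <- c(10, 5, 1, 1, 1, 2, 2, 3)

print(chao1(counts))                          # bias-corrected estimate
print(chao1(counts, bias_corrected = FALSE))  # classic estimate
```

Both estimates are at least as large as the observed richness of eight taxa, reflecting the taxa presumed to be present but unsampled.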
Phylogenetics
- paleotree provides tools for transforming, a posteriori time-scaling, and modifying phylogenies containing extinct lineages.
- FossilSim can be used to simulate fossil occurrence data on phylogenetic trees under mechanistic models of speciation, fossil preservation, and fossil recovery. Quick simulations can be conducted in a graphical user interface with FossilSimShiny.
- paleobuddy can be used to simulate phylogenetic trees and fossil records with custom speciation, extinction, and fossil sampling rates.
- fbdR has functions for estimating speciation and extinction rates from phylogenetic trees and fossil occurrence data.
- cladedate uses a Monte Carlo approach to generate empirical calibration information from the fossil record.
- RRphylo can be used to estimate variation and shift in the rate of phenotypic evolution with fossil data using phylogenetic ridge regression.
- strap has functions for the stratigraphic analysis of phylogenetic trees.
Also see the Phylogenetics task view for broader details about conducting various analyses in a phylogenetic context.
Time series analysis
- paleoTS facilitates the analysis of paleontological temporal sequences of trait values by fitting evolutionary models using maximum likelihood.
- evoTS facilitates univariate and multivariate analysis of evolutionary sequences of phenotypic change over time. The package extends the modeling framework available in paleoTS.
- adePEM has functions for assessing the adequacy of models of phenotypic change within lineages, like those fit by paleoTS and evoTS.
- StratPal can be used to simulate biological processes in the time domain (e.g., trait evolution, fossil abundance), and examine how their expression in the rock record (stratigraphic domain) is influenced by age-depth models, ecological niche models, and taphonomic effects.
- astrochron provides routines for astrochronologic testing, astronomical time scale construction, and time series analysis. Also included are a range of statistical analysis and modeling routines that are relevant to time scale development and paleoclimate analysis.
- R-Ratepol-package has functions for estimating rate of change (RoC) from time series of community data.
Also see the TimeSeries task view for broader details about conducting time series analyses.
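As a hedged illustration of what fitting an evolutionary model to a trait time series by maximum likelihood can look like, the base R sketch below fits a simple random walk with a directional trend. It assumes each change in the trait mean is normally distributed with mean mstep * dt and variance vstep * dt, where dt is the elapsed time between samples; under that assumption the maximum likelihood estimates have the closed forms used in the code. This is not the paleoTS or evoTS implementation: the function name fit_random_walk and the data are hypothetical, and the sketch ignores sampling error in the sample means, which a full analysis would need to consider.

```r
# Closed-form maximum likelihood fit of a random walk with trend,
# ignoring sampling error in the trait means (a simplifying assumption).
fit_random_walk <- function(trait, age) {
  d_trait <- diff(trait)  # change in trait mean between successive samples
  d_time <- diff(age)     # elapsed time between successive samples
  stopifnot(all(d_time > 0))
  # Mean step per unit time: total change divided by total elapsed time.
  mstep <- sum(d_trait) / sum(d_time)
  # Step variance per unit time: mean squared residual scaled by elapsed time.
  vstep <- mean((d_trait - mstep * d_time)^2 / d_time)
  list(mstep = mstep, vstep = vstep)
}

# Hypothetical series: elapsed time since the first sample and trait means.
age <- c(0, 1, 3, 6)
trait <- c(0, 1, 1, 4)

fit <- fit_random_walk(trait, age)
print(fit)
```

Fixing mstep at zero in the same likelihood gives the unbiased random walk, so comparing the two fits is one simple way to ask whether a series shows a directional trend.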
Stratigraphy and sedimentology
Acquiring stratigraphic and sedimentological data
- rmacrostrat can fetch geological data from Macrostrat relevant to the spatial and temporal distribution of sedimentary, igneous, and metamorphic rocks as well as data extracted from them.
Analyzing stratigraphic and sedimentological data
- admtools can be used to estimate age-depth models from stratigraphic and sedimentological data.
- clam can be used to perform ‘classical’ age-depth modeling of dated sediment deposits.
- rbacon can be used to perform age-depth modeling using Bayesian statistics to reconstruct accumulation histories for deposits, through combining radiocarbon and other dates with prior information on accumulation rates and their variability.
- Bchron has functions for quick calibration of radiocarbon dates under various calibration curves and can be used to perform Bayesian age-depth modeling.
- GeoChronR has functions to model, analyze, and visualize age-uncertain data and offers access to commonly used age-modeling tools such as rbacon and Bchron.
- isogeochem can be used to quickly calculate isotope fractionation factors and apply paleothermometry equations.
- DAIME can be used to model the effects of changing deposition rates on geological data and rates.
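For readers new to age-depth modeling, the simplest possible model is linear interpolation between dated levels. The base R sketch below shows that idea with the built-in approx function. It is only an illustration under strong assumptions: the dated depths and ages are made up, the names dated_depth, dated_age, and age_at_depth are hypothetical, and the sketch omits date calibration and uncertainty entirely, so it should not be read as how any of the packages above work internally.

```r
# Hypothetical dated levels in a sediment core.
dated_depth <- c(0, 50, 100)   # depth below the core top, in cm
dated_age <- c(0, 1000, 3000)  # age at each dated depth, in years

# Age at arbitrary depths by linear interpolation between dated levels.
# Depths outside the dated range return NA rather than being extrapolated.
age_at_depth <- function(depth) {
  approx(x = dated_depth, y = dated_age, xout = depth)$y
}

ages <- age_at_depth(c(25, 75))

# Implied accumulation rate between successive dated levels, in cm per year.
acc_rate <- diff(dated_depth) / diff(dated_age)

print(ages)
print(acc_rate)
```

The change in slope at 50 cm shows how an age-depth model encodes variation in accumulation rate, which is the quantity that more sophisticated approaches constrain with prior information.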
Visualizing stratigraphic and sedimentological data
- StratigrapheR can be used to generate highly customizable lithologs for stratigraphic and sedimentological data from outcrop sections and borehole logs with R base graphics. SDAR can be used to make similar, albeit less customizable, graphic logs with grid graphics.
- tidypaleo provides a set of functions with a common framework for age-depth model management, stratigraphic visualization, and common statistical transformations.
- provenance provides a set of functions for the visual interpretation of large datasets in sedimentary geology, including adaptive kernel density estimation, principal component analysis, correspondence analysis, multidimensional scaling, generalised Procrustes analysis, and individual differences scaling.
Paleoclimate
Acquiring and manipulating paleoclimate reconstruction data
- rpaleoclim can fetch paleoclimate data from PaleoClim, a set of high-resolution paleoclimate surfaces covering the whole globe. This includes data on surface temperature, precipitation, and the standard bioclimatic variables commonly used in ecological modeling, derived from the HadCM3 general circulation model and downscaled to a spatial resolution of up to 2.5 arc-minutes.
- pastclim has methods to easily access, manipulate, and use paleoclimate reconstructions.
Reconstructing and modeling paleoclimate
- crestr can be used to create probabilistic reconstructions of past climate change from fossil assemblage data.
- cRacle can be used to perform the Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE) method to estimate climate and paleoclimate from vegetation using large repositories of biodiversity data.
- sedproxy can be used to conduct forward modeling of sediment-archived climate proxy records based on hypothesized “true” past climate (e.g., climate model output), sedimentation, and sampling.
References
- Fenton, I. S., Woodhouse, A., Aze, T., Lazarus, D., Renaudie, J., Dunhill, A. M., Young, J. R., & Saupe, E. E. 2021. Triton, a new species-level database of Cenozoic planktonic foraminiferal occurrences. Scientific Data, 8(1), 160. doi:10.1038/s41597-021-00942-7.
- Polly, P. D., Stayton, C. T., Dumont, E. R., Pierce, S. E., Rayfield, E. J., & Angielczyk, K. D. 2016. Combining geometric morphometrics and finite element analysis with evolutionary modeling: towards a synthesis. Journal of Vertebrate Paleontology, 36(4). doi:10.1080/02724634.2016.1111225.
- Sepkoski, J. J. 2002. A compendium of fossil marine animal genera. Bulletins of American Paleontology, 363, 1–560. https://www.biodiversitylibrary.org/item/40634
- Silvestro, D., Salamin, N., & Schnitzler, J. 2014. PyRate: a new program to estimate speciation and extinction rates from incomplete fossil data. Methods in Ecology and Evolution, 5(10), 1126–1131. doi:10.1111/2041-210X.12263.
- Smith, J., Rillo, M., Kocsis, A., Dornelas, M., Fastovich, D., Huang, H.-H. M., Jonkers, L., Kiessling, W., Li, Q., Liow, L.-H., Margulis-Ohnuma, M., Meyers, S., Na, L., Penny, A. M., Pippenger, K., Renaudie, J., Saupe, E., Steinbauer, M. J., Sugawara, M., Tomasovych, A., Williams, J., Yasuhara, M., Finnegan, S., & Hull, P. M. 2023. BioDeepTime: A database of biodiversity time series for modern and fossil assemblages. Global Ecology and Biogeography, 32(10), 1680–1689. doi:10.1111/geb.13735.
CRAN packages
Core: None.
Regular: admtools, analogue, astrochron, Bchron, chronosphere, clam, CoordinateCleaner, DAIME, deeptime, divDyn, divvy, ecospace, evoTS, folio, fossil, fossilbrush, FossilSim, FossilSimShiny, GEOmap, isogeochem, Morphoscape, neotoma2, palaeoverse, paleobioDB, paleobuddy, paleotree, paleoTS, pastclim, ppgm, provenance, rbacon, rgbif, rgplates, ridigbio, rioja, rmacrostrat, rpaleoclim, rphylopic, RRphylo, SDAR, sedproxy, sepkoski, strap, StratigrapheR, StratPal, tidypaleo.
Other resources