Introduction to toxEval

22 November, 2024

The toxEval R-package includes a set of functions to analyze, visualize, and organize measured concentration data as it relates to ToxCast data (default) or other user-selected chemical-biological interaction benchmark data such as water quality criteria. The intent of these analyses is to develop a better understanding of the potential biological relevance of environmental chemistry data. Results can be used to prioritize which chemicals at which sites may be of greatest concern. These methods are meant to be used as a screening technique to predict potential for biological influence from chemicals that ultimately need to be validated with direct biological assays. Full documentation of this R package including a tutorial with examples is available here: https://doi-usgs.github.io/toxEval/index.html

The functions within this package allow great flexibly for exploring the potential biological affects of measured chemicals. Also included in the package is a browser-based application made from the Shiny R-package (the app). The app is based on functions within the R-package and includes many convenient analyses and visualization options for users to choose. Use of the functions within the R-package allows for additional flexibility within the functions beyond what the app offers and provides options for the user to interact more directly with the data. The overview in this document focuses on the R-package. Documentation for the app is provided here.

This vignette provides a general overview of the concepts within toxEval, definitions of common terminology used throughout the package, and links to information to help understand fundamentals of the ToxCast database used within toxEval.

What is ToxCast?

The U.S. EPA’s Toxicity Forecaster ToxCast includes a database of chemical-biological interactions that contains information from hundreds of assays on thousands of chemicals, providing a means to assess biological relevance to measured concentrations. The toxEval package attempts to simplify the workflow for exploring data as it relates to these assay endpoints (benchmark data). The workflow uses ToxCast as a default for evaluation of chemical:biological interactions, but the user may also define alternative benchmarks for a custom or more traditional approach to biological relevance evaluation. This is also a useful capability for efficient comparison of ToxCast evaluation results with those from other toxicity benchmark databases.

When using the ToxCast endPoints for analysis, it is important to have at least a minimal understanding of what ToxCast data is, and which ToxCast data is relevant to any given study. There are many useful resources here. There is also a tool called the Comptox Dashboard that has a wealth of information on ToxCast data.

So what are we doing with the user input data and ToxCast? First, we calculate an Exposure-Activity Ratio (EAR) for each measurement. Then we can explore the EARs based on a wide variety of groupings to view the data in many dimensions.

Exposure-Activity Ratio

An Exposure-Activity Ratio (EAR) is defined as the ratio of a measured concentration and a concentration that was determined to cause some activity in a specified ToxCast assay (“endPoint” concentration). An EAR > 1.0 would indicate that the measured concentration is greater than the endpoint concentration. The ToxCast database (as provided in the current version of toxEval) provides as many as several hundred endPoints for more than 7000 chemicals. Each endPoint is a single test that was done to detect some form of biological activity.

In order to get appropriate EAR results, it is important to use the correct units. The toxEval package assumes all measured concentrations are reported in micrograms per liter (\(\mu\)g/L). ToxCast data is reported in log(\(\mu\)M), so the toxEval package automatically performs the unit conversion.

A secondary option within toxEval is for the user to provide a set of “benchmark concentrations” to define custom biological responses to meet specific study objectives (e.g. water quality criteria). In this case, EAR values are replaced with toxicity quotients. Similar to EAR values, toxicity quotients are defined as the ratio of a measured concentration to the benchmark concentration.

EAR is basically synonymous with bioanalytical equivalent concentrations (BEQ). EAR is a ratio, and BEQ is a measure of concentration, but both convey the same information.

What is an “endPoint”?

ToxCast uses high-throughput assays to create concentration-response curves for each of these chemical: endPoint combinations. An endPoint is “associated with the perturbation of specific biological processes identified for the confirmation or monitoring of predicted site-specific hazards” Blackwell 2017. That means a specific biological action was tested, and the concentration at which activity was observed was determined. Of several endpoint values provided within the ToxCast database, the activity concentration at cutoff (ACC) was chosen to compute EAR values within the toxEval package, consistent with the description in Blackwell, 2017. ACC values from the ToxCast database are provided within the toxEval package.

What is a “group”?

Often, it is valuable to consider aggregations of single endPoints in evaluation efforts. ToxCast has provided tables that group individual endPoints into generalized categories for functional use. The grouping summary table is included in toxEval and can be explored via the end_point_info data:

library(toxEval)
end_point_info <- end_point_info

See the help file ?end_point_info for specifics on how the table was downloaded.

Throughout the toxEval analysis, there are graphing and table functions that will summarize EARs based on either “Biological” groupings (as defined by a group of endPoints) or “Chemical Class” groupings (as defined by a group of chemicals).

The default grouping of ToxCast endPoints is “intended_target_family”, but depending on the analysis, it may be more appropriate to use other grouping categories. To change the default, specify a grouping in the groupCol argument of the filter_groups function. For example:

filtered_ep <- filter_groups(end_point_info,
              groupCol = "intended_target_family",
              assays = c("ATG","NVS", "OT", "TOX21", 
                         "CEETOX", "APR", "CLD", "TANGUAY",
                         "NHEERL_PADILLA","NCCT_SIMMONS", "ACEA"),
              remove_groups = c("Background Measurement",
                                "Undefined"))

What is happening here is that the supplied data frame end_point_info is filtered to just the “intended_target_family” group. Additionally, the assay BioSeek is removed (it is the only one not included in the possible list of assays). Finally, of all the groups within intended_target_family, the end points designated “Undefined” and “Background Measurement” are removed.

Summarizing the data

The functions in toxEval summarize the data as follows:

First, individual EAR values are calculated for each chemical:endPoint combination. Then, the EAR values are summed together by samples (a sample is defined as a unique site/date) based on the grouping picked in the “category” argument. Categories include “Biological”, “Chemical Class”, or “Chemical”. “Biological” refers to the chosen ToxCast annotation as defined in the groupCol argument of the filter_groups function. “Chemical Class” refers to the groupings of chemicals as defined in the “Class” column of the “Chemicals” sheet of the input file. “Chemical” refers to the individual chemical as defined by a unique CAS value. Finally, the maximum or mean EAR is calculated per site (based on the mean_logic option). This ensures that each site is represented equally regardless of how many samples are available per site.

Some functions will also include a calculation for a “hit”. A threshold is defined by the user, and if the mean or maximum EAR (calculated as described above) is greater than the threshold, that is considered a “hit”.

Reporting bugs

If you discover an issue that you feel is a bug in the package or have a question on functionality, please consider reporting bugs and asking questions on the Issues page: https://github.com/DOI-USGS/toxEval/issues

Citing toxEval

citation(package = "toxEval")
## To cite toxEval in publications, please use:
## 
##   De Cicco, L.A., Corsi, S.R., Villeneuve D.L, Blackwell, and B.R,
##   Ankley, G.T., 2024, toxEval: Evaluation of measured concentration
##   data using the ToxCast high-throughput screening database or a
##   user-defined set of concentration benchmarks. R package version
##   1.4.0., U.S. Geological Survey software release. Reston, VA.,
##   doi:10.5066/P1CQJHJV
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     author = {Laura A. {De Cicco} and Steven R. Corsi and Daniel L. Villeneuve and Brett R. Blackwell and Gerald T. Ankley},
##     title = {toxEval: Evaluation of measured concentration data using the ToxCast high-throughput screening database or a user-defined set of concentration benchmarks.},
##     publisher = {U.S. Geological Survey},
##     version = {1.4.0},
##     address = {Reston, VA},
##     institution = {U.S. Geological Survey},
##     year = {2024},
##     doi = {10.5066/P1CQJHJV},
##     url = {https://code.usgs.gov/water/toxEval},
##   }