---
title: "Introduction to DUToolkit"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to DUToolkit}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options:
  markdown:
    wrap: 72
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

During public health crises such as the COVID-19 pandemic,
decision-makers rely on models to predict and estimate the impact of
various policy alternatives on health outcomes. Often, there is a high
degree of uncertainty in the evidence base underpinning these models.
As uncertainty increases, so does the risk of selecting a policy option
that does not align with the intended policy objective; we term this
decision risk. Even when models adequately capture uncertainty, the
tools used to communicate their outcomes, the underlying uncertainty,
and the associated decision risk are important for avoiding the
adoption of sub-optimal policies and/or critical health technologies.

The DUToolkit package provides a suite of tools and visualizations for
the characterization, estimation, and communication of parameter
uncertainty and decision risk. The package is designed to evaluate the
impact of policy alternatives on outcomes compared to a pre-defined
baseline scenario. The baseline scenario is typically defined as
maintaining the status quo or a scenario where no mitigation policies
are implemented (i.e., a ‘do nothing’ or ‘existing policy’ scenario).
DUToolkit leverages model outputs from uncertainty analysis techniques,
such as probabilistic sensitivity analysis, general uncertainty
analysis, or Bayesian inference, to support decision-making.

## Getting started {#get-started}

The DUToolkit functions fall into five main categories:

- ***Calculating risk:*** includes `calculate_risk()` and
  `tabulate_risk()`
- ***Time-outcome fan plot:*** includes `plot_fan()` and
  `calculate_time()`
- ***Probability density plots with risk shading:*** includes
  `get_max_min_values()`, `plot_density()`,
  `calculate_threshold_probs()`, and `calculate_max_min_risk()`
- ***Raincloud plots:*** includes `plot_raincloud()`
- ***Temporal probability density plots:*** includes
  `get_relative_values()`, `plot_temporal()`, and
  `sum_stats_temporal()`

### Synthetic data {#syn-data}

The DUToolkit package includes pre-loaded synthetic model outputs
stored in the R object `psa_data`, which serve as an example dataset.
This dataset represents a hypothetical scenario where a decision-maker
is selecting between two policies related to COVID-19 in 2020: (i)
Baseline – do nothing/current state and (ii) Intervention 1 – close
schools. Each policy is expected to impact the number of individuals in
the hospital. Hospital capacity has a maximum upper bound, which is the
decision threshold.

### Data format {#format-data}

The DUToolkit functions require model outputs from multiple simulation
runs using different parameter sets (e.g., probabilistic sensitivity
analysis, general uncertainty analysis, or Bayesian inference). These
outputs must follow a standardized format, as follows:

1.  A list of `data.frames` *(Required)*

    - The list must contain one `data.frame` for each policy
      alternative.
    - Each `data.frame` must have:
        - A **first column** representing model time, either as numeric
          values (e.g., 1, 2, 3, …) or as dates in R **Date format**
          (e.g., 2021-01-01, 2021-01-02, ...) with class = "Date".
        - **Subsequent columns** containing predicted outputs for each
          simulation run at the corresponding time points (e.g., if
          there are 100 simulations, there will be 101 columns in the
          `data.frame`).
    - To ensure a consistent basis for comparison, the model time in
      the first column should be **identical across all policy
      alternatives** (i.e., the first column in every `data.frame`
      should contain the same values).

```{r example1}
library(DUToolkit)

# example data.frame with date in first column
head(psa_data$Baseline[, 1:5])
```

2.  A list of vectors containing weights *(Optional)*

    - Some simulation runs may be more or less likely than others.
      Various methods can account for this, such as calculating a
      log-likelihood for each simulation run and converting it into a
      weight. Users must choose the most appropriate method for their
      specific scenario.
    - Each **vector** in the list corresponds to a specific policy
      alternative and contains the **weights** assigned to each
      simulation run.
    - Each weight vector must have:
        - The **same number of elements** as the number of simulation
          run columns in the corresponding output `data.frame` (i.e.,
          all columns **except** the first column).
        - Its weights in the **same order** as the simulation run
          columns in the corresponding `data.frame`.
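
The format rules above can be illustrated with a small, self-contained
sketch in base R. Everything in it is hypothetical and not part of
DUToolkit: the objects `toy_outputs`, `toy_loglik`, and `toy_weights`,
the helpers `make_outputs()`, `loglik_to_weights()`, and
`check_format()`, and the simulated values themselves. Normalizing
likelihoods so the weights sum to one is only one possible way to turn
log-likelihoods into weights; whether it suits a given analysis is left
to the user.

```r
# Hypothetical toy outputs: 10 time points and 5 simulation runs per policy.
set.seed(1)
n_time <- 10
n_sims <- 5
dates <- seq(as.Date("2020-03-01"), by = "day", length.out = n_time)

# Build one data.frame: time in the first column, one column per run.
make_outputs <- function(mean_level) {
  sims <- matrix(rpois(n_time * n_sims, lambda = mean_level), nrow = n_time)
  colnames(sims) <- paste0("sim_", seq_len(n_sims))
  data.frame(date = dates, sims)
}

# Required input: a named list with one data.frame per policy alternative.
toy_outputs <- list(
  Baseline = make_outputs(120),
  Intervention_1 = make_outputs(80)
)

# Made-up log-likelihoods, one value per simulation run and policy.
toy_loglik <- list(
  Baseline = c(-10.2, -9.8, -11.5, -10.0, -12.1),
  Intervention_1 = c(-8.9, -9.4, -9.1, -10.7, -9.0)
)

# Assumed conversion: normalized likelihoods. Subtracting the maximum
# before exponentiating avoids numerical underflow.
loglik_to_weights <- function(ll) {
  w <- exp(ll - max(ll))
  w / sum(w)
}

# Optional input: one weight vector per policy, in simulation-column order.
toy_weights <- lapply(toy_loglik, loglik_to_weights)

# Check the format rules listed above; stops with an error if one fails.
check_format <- function(outputs, weights = NULL) {
  time_ref <- outputs[[1]][[1]]
  for (nm in names(outputs)) {
    df <- outputs[[nm]]
    stopifnot(
      is.data.frame(df),
      is.numeric(df[[1]]) || inherits(df[[1]], "Date"),
      identical(df[[1]], time_ref)
    )
    if (!is.null(weights)) {
      stopifnot(length(weights[[nm]]) == ncol(df) - 1)
    }
  }
  invisible(TRUE)
}

check_format(toy_outputs, toy_weights)
```

A check of this kind can be run on real model outputs before they are
passed to the package functions, since a mismatched time column or a
weight vector of the wrong length is easier to diagnose at this stage.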