The bayespmtools package helps you determine sample sizes for external validation studies of risk prediction models using a Bayesian approach. Unlike traditional methods that require fixed performance values, this package allows you to incorporate uncertainty about model performance into your calculations.
Traditional sample size calculations require you to specify exact values for metrics like the c-statistic or calibration slope. In reality, we are uncertain about these values. The Bayesian approach in bayespmtools lets you express that uncertainty as probability distributions and propagate it directly into the sample size calculation.
Let’s walk through a simple example. Suppose you’re planning to externally validate a risk prediction model and you have some prior information about its likely performance.
First, define what you know (or believe) about the model’s performance using probability distributions:
``` r
evidence <- list(
  prev     ~ beta(116, 155),      # Outcome prevalence
  cstat    ~ beta(3628, 1139),    # C-statistic
  cal_mean ~ norm(-0.009, 0.125), # Mean calibration error
  cal_slp  ~ norm(0.995, 0.024)   # Calibration slope
)
```

What this means:

- `prev`: outcome prevalence
- `cstat`: c-statistic (discrimination)
- `cal_mean`: mean calibration error (difference between average observed and expected risks)
- `cal_slp`: expected calibration slope (from a logistic model regressing observed outcomes on logit-transformed predicted risks)

You can parameterize distributions flexibly using means and SDs, confidence interval bounds, or natural parameters.
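As a quick base-R illustration of parameterizing from interval bounds, the snippet below converts a reported 95% confidence interval into a normal mean and SD. The CI values are hypothetical, chosen to roughly reproduce the calibration-slope prior used above; this is generic arithmetic, not a bayespmtools call.

``` r
# Hypothetical 95% CI for the calibration slope from a prior validation study
lo <- 0.947
hi <- 1.043

# Under a normal approximation: mean = midpoint, SD = half-width / z_{0.975}
m <- (lo + hi) / 2                   # 0.995
s <- (hi - lo) / (2 * qnorm(0.975))  # ~0.024

c(mean = m, sd = s)
```

The same method-of-moments style reasoning applies when translating a reported proportion and its CI into beta parameters.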
Next, specify the precision you want to achieve. We will evaluate sample size against three types of rules:
Now run the main calculation:
``` r
results <- bpm_valsamp(
  evidence  = evidence,
  targets   = targets,
  n_sim     = 1000, # Number of Monte Carlo simulations
  threshold = 0.2   # Risk threshold for net benefit calculations
)
```

NOTE: a simulation count (`n_sim`) of 1,000 is low and is used here only for convenience. In practice, use at least 10,000 simulations, and check the stability of the results under different random seeds.
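The reason `n_sim` matters can be seen with a small base-R demonstration (independent of bayespmtools): the Monte Carlo error of any simulated estimate shrinks roughly as 1/sqrt(n_sim), so a tenfold increase in simulations cuts the seed-to-seed variability by about sqrt(10).

``` r
# Stand-in for any quantity estimated by simulation
mc_estimate <- function(n_sim) mean(rnorm(n_sim))

set.seed(2025)
spread_1e3 <- sd(replicate(200, mc_estimate(1e3)))
spread_1e4 <- sd(replicate(200, mc_estimate(1e4)))

c(n_sim_1e3 = spread_1e3, n_sim_1e4 = spread_1e4)
# the second spread is roughly sqrt(10) times smaller
```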
``` r
print(results$results)
#>   eciw.cstat  eciw.cal_oe eciw.cal_slp qciw.cal_slp       voi.nb
#>          347          430         1037          896          717
```

The output shows the required sample size for each criterion. The largest sample size (1037) ensures all targets are met. However, the VoI criterion indicates that a sample size of 717 is expected to reduce uncertainty-related clinical utility loss by 90%. This might be used to argue that the criteria on the calibration slope can be relaxed.
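To identify the binding criterion programmatically, take the maximum over the returned values. The snippet below copies the numbers from the printout; it assumes `results$results` behaves like a named numeric vector, as the printed output suggests.

``` r
# Values copied from the printed output above
req <- c(eciw.cstat = 347, eciw.cal_oe = 430, eciw.cal_slp = 1037,
         qciw.cal_slp = 896, voi.nb = 717)

max(req)              # sample size meeting all criteria: 1037
names(which.max(req)) # the binding criterion: "eciw.cal_slp"
```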
For more advanced usage, see the full tutorial vignette, which covers:

- Working with different distribution types
- Net benefit and Value of Information analysis
- Precision calculations for fixed sample sizes
- A real-world case study
- `bpm_valsamp()`: calculate the required sample size given precision targets
- `bpm_valprec()`: calculate precision/VoI given a fixed sample size

For detailed documentation on any function, use R's built-in help, e.g. `?bpm_valsamp`.
Visit the package repository: https://github.com/resplab/bayespmtools
For methodological details, see:
Sadatsafavi M, et al. (2025). Bayesian sample size considerations for external validation of risk prediction models. Statistics in Medicine. doi:10.1002/sim.70389