--- title: "Introduction to bmco" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to bmco} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo=FALSE} biblio <- RefManageR::ReadBib("../inst/REFERENCES.bib", check = "error") ``` ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5 ) ``` ## Overview The **bmco** package provides Bayesian methods for analyzing multiple binary outcomes simultaneously. It addresses a common problem in clinical trials, educational research, and other fields: how to compare groups when success is defined by multiple endpoints. ### The Problem Suppose you're evaluating a new drug and want to know if it improves **both** symptom relief and quality of life. `bmco` provides a principled Bayesian framework that: - Analyzes **multiple binary outcomes jointly** - Accounts for **correlations between outcomes** - Offers **flexible decision rules** (All, Any, or weighted combinations) - Provides **posterior probabilities** - Estimates **subgroup effects** using data from the entire sample - Handles **clustered data** ## Quick Start Example ```{r setup} library(bmco) set.seed(2024) # Simulate a clinical trial with two binary outcomes trial_data <- data.frame( treatment = rep(c("placebo", "drug"), each = 50), symptom_relief = rbinom(100, 1, prob = rep(c(0.4, 0.6), each = 50)), quality_of_life = rbinom(100, 1, prob = rep(c(0.5, 0.7), each = 50)) ) # Test if drug improves BOTH outcomes result <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), test = "right_sided", rule = "All", n_it = 1000 ) print(result) summary(result) ``` The posterior probability (`pop`) of this example tells us: **"What is the probability that the drug is better than placebo on BOTH outcomes?"** ## Three Analysis Functions The package provides three main functions for different scenarios: ### 1. `bmvb()`: Basic Comparison **Use when**: Comparing two groups on multiple binary outcomes. Currently supported for two outcomes only. ```{r bmvb_example, eval=TRUE} # Compare two teaching methods on two outcomes bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), rule = "All", n_it = 1000 ) ``` For more details on statistical methodology or when using this function, please cite: `r capture.output(print(biblio["Kavelaars2020"]))` ### 2. `bglm()`: Subgroup analysis **Use when**: Comparing two subgroups on multiple binary outcomes (e.g., age range, particular baseline severity). Currently supported for two outcomes only. ```{r bglm_example, eval=TRUE} # Add age covariate trial_data$age <- rnorm(100, mean = 50, sd = 10) result_bglm <- bglm( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", x_var = "age", y_vars = c("symptom_relief", "quality_of_life"), x_method = "Empirical", x_def = c(-Inf, Inf), rule = "All", n_burn = 200, n_it = 500 ) print(result_bglm) summary(result_bglm) ``` For more details on statistical methodology or when using this function, please cite: `r capture.output(print(biblio["Kavelaars2024"]))` Also see the "Subgroup analysis" vignette for details. ### 3. `bglmm()`: With Clustering **Use when**: Comparing two (sub)groups on multiple binary outcomes, when data are clustered (patients within hospitals, students within schools). ```{r bglmm_example, eval=TRUE} # Add cluster variable trial_data$hospital <- rep(1:10, each = 10) result_bglmm <- suppressWarnings( # Warnings due to small number of iterations suppressed bglmm( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", id_var = "hospital", x_var = "age", y_vars = c("symptom_relief", "quality_of_life"), x_method = "Empirical", x_def = c(-Inf, Inf), rule = "All", n_burn = 200, n_it = 500, n_thin = 3 ) ) print(result_bglmm) summary(result_bglmm) ``` For more details on statistical methodology or when using this function, please cite: `r capture.output(print(biblio["Kavelaars2023"]))` Also see the "Multilevel Models" vignette for details. ## Decision Rules Choose how to define "treatment success": ### All Rule (Conjunctive) Treatment must improve **ALL** outcomes: ```{r rule_all} result_all <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), rule = "All", # BOTH outcomes must improve n_it = 5000 ) cat("P(drug better on BOTH outcomes):", result_all$delta$pop, "\n") ``` ### Any Rule (Disjunctive) Treatment must improve **AT LEAST ONE** outcome: ```{r rule_any} result_any <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), rule = "Any", # At least one outcome must improve n_it = 5000 ) cat("P(drug better on AT LEAST ONE outcome):", result_any$delta$pop, "\n") ``` ### Compensatory Rule (Weighted) Weight outcomes by importance: ```{r rule_comp} result_comp <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), rule = "Comp", w = c(0.7, 0.3), # Symptom relief weighted 70%, QoL 30% n_it = 5000 ) cat("P(weighted improvement):", result_comp$delta$pop, "\n") cat("Weighted effect:", result_comp$delta$w_delta, "\n") ``` ## Test Directions ### Right-sided Test Test if group B is better than group A: ```{r test_right} bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", # Put the group you expect to be better here y_vars = c("symptom_relief", "quality_of_life"), test = "right_sided", # P(drug > placebo) n_it = 5000 ) ``` ### Left-sided Test Test if group A is better than group B: ```{r test_left} bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), test = "left_sided", # P(placebo > drug) n_it = 5000 ) ``` **Note**: Left-sided with (A, B) is equivalent to right-sided with (B, A). ## Understanding the Output ```{r output_structure} result <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), n_it = 5000 ) # Key components names(result) # Posterior estimates for each group result$estimates # Sample sizes result$sample_sizes # Treatment effect result$delta # Test information result$info ``` ### Key Output Elements - `estimates$mean_a` / `mean_b`: Posterior mean success probabilities for each group - `estimates$sd_a` / `sd_b`: Posterior standard deviations - `delta$mean_delta`: Mean treatment effect (difference in probabilities) - `delta$se_delta`: Standard error of treatment effect - `delta$pop`: **Posterior probability** (the main result!) - `info$test_label`: Human-readable description of the test ## Posterior Samples For custom analyses, extract posterior samples: ```{r samples} result_samples <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), n_it = 5000, return_samples = TRUE ) # Access posterior samples str(result_samples$samples) # Custom probability calculations samples <- result_samples$samples$delta # P(effect on symptom relief > 0.1) cat("P(delta[symptom] > 0.1):", mean(samples[, 1] > 0.1), "\n") # P(effect on QoL > 0.15) cat("P(delta[QoL] > 0.15):", mean(samples[, 2] > 0.15), "\n") # Credible intervals cat("\n95% Credible Intervals:\n") print(apply(samples, 2, quantile, probs = c(0.025, 0.975))) ``` ## Practical Considerations ### Sample Size For basic comparison: see @Kavelaars2020. For subgroup analysis: see @Kavelaars2024. ### MCMC Settings Default settings usually work, but you can adjust: ```{r mcmc_settings, eval=TRUE} bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), n_it = 10000 # More iterations for smoother estimates ) ``` For `bglm()` and `bglmm()`: ```{r mcmc_settings_reg, eval = FALSE} bglm( # ... arguments ... n_burn = 2000, # Burn-in / warm-up iterations n_it = 5000, # Sampling iterations n_thin = 1 # Keep every nth sample (reduces memory) ) ``` ### Missing Data Missing values are automatically handled via complete-case analysis: ```{r missing_data, eval=TRUE} # Introduce missing data trial_data_missing <- trial_data trial_data_missing$symptom_relief[1:5] <- NA result_missing <- bmvb( data = trial_data_missing, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), n_it = 1000 ) # Sample sizes are reduced result_missing$sample_sizes ``` ## Comparison with Frequentist Approaches ### Traditional approach ```{r frequentist, eval=TRUE} # Separate tests for each outcome chisq_symptom <- chisq.test(table(trial_data$treatment, trial_data$symptom_relief)) chisq_qol <- chisq.test(table(trial_data$treatment, trial_data$quality_of_life)) cat("Frequentist p-values:\n") cat(" Symptom relief:", chisq_symptom$p.value, "\n") cat(" Quality of life:", chisq_qol$p.value, "\n") cat(" Bonferroni-adjusted alpha: 0.025 (needed for Any-rule only; not needed for All-rule)\n\n") ``` ### Bayesian approach ```{r bayesian, eval=TRUE} result_bayes <- bmvb( data = trial_data, grp = "treatment", grp_a = "placebo", grp_b = "drug", y_vars = c("symptom_relief", "quality_of_life"), rule = "All", n_it = 1000 ) cat("Bayesian result:\n") cat(" P(drug better on BOTH):", result_bayes$delta$pop, "\n") ``` ### Advantages of Bayesian approach 1. **Direct probability statements**: "95% probability drug is better" vs. "reject null at alpha=0.05" 2. **Flexible decision rules**: All/Any/Weighted combinations 4. **Coherent uncertainty**: Full sample of posterior distribution available ## Next Steps Explore the other vignettes: - **Subgroup analysis**: Using `bglm()` to perform subgroup analysis - **Multilevel Models**: Using `bglmm()` for clustered data ## References **Statistical methodology**: More details about statistical methodology can be found here in the papers below. When using the `bmvb()` function, please cite: `r capture.output(print(biblio["Kavelaars2020"], .opts = list(style = "markdown", bib.style = "authoryear", sorting = "ydnt")))` When using the `bglm()` function, please cite: `r capture.output(print(biblio["Kavelaars2024"], .opts = list(style = "markdown", bib.style = "authoryear", sorting = "ydnt")))` When using the `bglmm()` function, please cite: `r capture.output(print(biblio["Kavelaars2023"], .opts = list(style = "markdown", bib.style = "authoryear", sorting = "ydnt")))`