--- title: "Model Comparison" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Model Comparison} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE, dpi = 150 ) # Use ragg for better font rendering if available if (requireNamespace("ragg", quietly = TRUE)) { knitr::opts_chunk$set(dev = "ragg_png") } old_opts <- options(width = 180) ``` Model selection is a fundamental problem in applied statistics. Given a set of candidate models, it is important to determine which specification best balances goodness-of-fit against parsimony. Traditional approaches rely on information criteria (AIC, BIC), discrimination metrics (C-statistic), and hypothesis tests for nested models. The `compfit()` function synthesizes these metrics into a weighted Composite Model Score (CMS) to facilitate systematic comparison between models. Like other functions in this package, it follows the standard `summata` input paradigm, with an additional parameter to allow for multiple models: ```{r, eval = FALSE} compfit(data, outcome, model_list, model_type, ...) ``` where `data` is the dataset, `outcome` is the common endpoint variable, `model_list` is a named list of predictor vectors (each representing a different model to be compared), and `model_type` is the type of model to be implemented. This vignette demonstrates the various capabilities of this function using the included sample dataset. --- # Preliminaries The examples in this vignette use the `clintrial` dataset included with `summata`: ```{r setup} library(summata) data(clintrial) data(clintrial_labels) ``` --- # Quality Metrics and the Composite Model Score The `compfit()` function evaluates models using several quality metrics, then combines them into a single Composite Model Score (CMS) ranging from 0 to 100 for rapid comparison. The metrics available depend on the model type. 
## Available Metrics

| Metric | Used by | Interpretation | Better |
|:-------|:--------|:---------------|:-------|
| AIC | All | Information criterion balancing fit and complexity | Lower |
| BIC | Fallback | Information criterion with a stronger complexity penalty | Lower |
| Concordance | GLM, Cox, GLMER, Cox ME | Discrimination ability (0.5 = random, 1.0 = perfect) | Higher |
| Pseudo-*R*² | GLM, Cox ME | Proportion of variation explained (McFadden's) | Higher |
| *R*² | LM | Proportion of variance explained | Higher |
| Brier Score | GLM | Prediction accuracy for binary outcomes (0 = perfect) | Lower |
| Global *p* | Cox | Omnibus likelihood ratio test *p*-value | Lower |
| RMSE | LM | Root mean squared error of residuals | Lower |
| Marginal *R*² | LMER, GLMER | Variance explained by fixed effects alone | Higher |
| Conditional *R*² | LMER | Variance explained by fixed + random effects | Higher |
| ICC | LMER, GLMER, Cox ME | Intraclass correlation; proportion of variance from clustering | Moderate |
| Convergence | All | Whether the model converged successfully | Yes |

## CMS Interpretation

| Range | Interpretation |
|:------|:---------------|
| 85–100 | Excellent |
| 75–84 | Very Good |
| 65–74 | Good |
| 55–64 | Fair |
| < 55 | Poor |

## Default CMS Weights by Model Type

The score components and their default weights vary by model type.
| Component | LM | GLM | Cox | LMER | GLMER | Cox ME |
|:----------|:--:|:---:|:---:|:----:|:-----:|:------:|
| Convergence | 15% | 15% | 15% | 20% | 15% | 15% |
| AIC | 25% | 25% | 30% | 25% | 25% | 30% |
| *R*² | 45% | | | | | |
| Pseudo-*R*² | | 15% | | | | 10% |
| Concordance | | 40% | 40% | | 30% | 35% |
| Brier Score | | 5% | | | | |
| Global *p* | | | 15% | | | |
| RMSE | 15% | | | | | |
| Marginal *R*² | | | | 25% | 15% | |
| Conditional *R*² | | | | 15% | | |
| ICC | | | | 15% | 15% | 10% |

---

# Basic Usage

## **Example 1:** Nested Model Comparison

Compare models with increasing complexity:

```{r}
example1 <- compfit(
  data = clintrial,
  outcome = "surgery",
  model_list = list(
    "Demographics" = c("age", "sex"),
    "Plus Stage" = c("age", "sex", "stage", "ecog"),
    "Full Model" = c("age", "sex", "stage", "ecog", "treatment", "smoking")
  ),
  model_type = "glm"
)
example1
```

## **Example 2:** Linear Regression Models

For continuous outcomes, use `model_type = "lm"`:

```{r}
example2 <- compfit(
  data = clintrial,
  outcome = "los_days",
  model_list = list(
    "Simple" = c("age", "sex"),
    "Disease" = c("age", "sex", "stage", "ecog"),
    "Treatment" = c("age", "sex", "stage", "ecog", "surgery", "treatment")
  ),
  model_type = "lm"
)
example2
```

## **Example 3:** Cox Regression Models

For time-to-event outcomes, use `model_type = "coxph"`:

```{r}
example3 <- compfit(
  data = clintrial,
  outcome = "Surv(os_months, os_status)",
  model_list = list(
    "Unadjusted" = c("treatment"),
    "Demographics" = c("treatment", "age", "sex"),
    "Full" = c("treatment", "age", "sex", "stage", "ecog")
  ),
  model_type = "coxph"
)
example3
```

## **Example 4:** Count Models

For count outcomes, use `model_type = "glm"` with a count model family (e.g., `family = "poisson"`):

```{r}
example4 <- compfit(
  data = clintrial,
  outcome = "fu_count",
  model_list = list(
    "Minimal" = c("age", "treatment"),
    "Clinical" = c("age", "treatment", "stage", "ecog"),
    "Full" = c("age", "treatment", "stage", "ecog", "surgery", "diabetes")
), model_type = "glm", family = "poisson", labels = clintrial_labels ) example4 ``` --- # Interaction Testing A key application of `compfit()` is testing whether interaction terms improve model fit. ## **Example 5:** Single Interaction Compare a main-effects model to one with an interaction term: ```{r} example5 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Main Effects" = c("age", "treatment", "sex"), "With Interaction" = c("age", "treatment", "sex") ), interactions_list = list( NULL, c("sex:treatment") ), model_type = "glm" ) example5 ``` ## **Example 6:** Multiple Interactions Compare different interaction hypotheses: ```{r} example6 <- compfit( data = clintrial, outcome = "Surv(os_months, os_status)", model_list = list( "Main Effects" = c("age", "treatment", "sex", "stage"), "Age × Treatment" = c("age", "treatment", "sex", "stage"), "Sex × Treatment" = c("age", "treatment", "sex", "stage"), "Both" = c("age", "treatment", "sex", "stage") ), interactions_list = list( NULL, c("age:treatment"), c("sex:treatment"), c("age:treatment", "sex:treatment") ), model_type = "coxph" ) example6 ``` --- # Accessing Detailed Results ## **Example 7:** Coefficient Comparison Setting `include_coefficients = TRUE` generates a table comparing coefficients across models: ```{r} example7 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Model A" = c("age", "sex"), "Model B" = c("age", "sex", "stage"), "Model C" = c("age", "sex", "stage", "treatment") ), model_type = "glm", include_coefficients = TRUE, labels = clintrial_labels ) # Main comparison example7 # Coefficient table coef_table <- attr(example7, "coefficients") coef_table ``` ## **Example 8:** Fitted Model Objects The underlying model objects are stored as attributes for further analysis: ```{r} models <- attr(example7, "models") names(models) # Examine a specific model summary(models[["Model C"]]) ``` ## **Example 9:** Recommended Model The best-performing model is identified 
automatically based on the Composite Model Score (CMS):

```{r}
example9 <- compfit(
  data = clintrial,
  outcome = "Surv(os_months, os_status)",
  model_list = list(
    "Minimal" = c("treatment"),
    "Standard" = c("treatment", "age", "sex", "stage"),
    "Extended" = c("treatment", "age", "sex", "stage", "ecog", "grade")
  ),
  model_type = "coxph"
)

recommended <- attr(example9, "best_model")
cat("Recommended model:", recommended, "\n")
```

---

# Custom CMS Weights

## **Example 10:** Custom Weights

Modify the CMS scoring weights to emphasize specific metrics:

```{r}
example10 <- compfit(
  data = clintrial,
  outcome = "surgery",
  model_list = list(
    "Simple" = c("age", "sex"),
    "Standard" = c("age", "sex", "stage"),
    "Full" = c("age", "sex", "stage", "treatment", "ecog")
  ),
  model_type = "glm",
  scoring_weights = list(
    convergence = 0.05,
    aic = 0.20,
    concordance = 0.60,
    pseudo_r2 = 0.10,
    brier = 0.05
  )
)
example10
```

---

# Application Scenarios

## **Scenario 1:** Confounder Assessment

Evaluate the impact of covariate adjustment on effect estimates:

```{r}
scenario1 <- compfit(
  data = clintrial,
  outcome = "Surv(os_months, os_status)",
  model_list = list(
    "Crude" = c("treatment"),
    "Age-Sex Adjusted" = c("treatment", "age", "sex"),
    "Fully Adjusted" = c("treatment", "age", "sex", "stage", "ecog")
  ),
  model_type = "coxph",
  include_coefficients = TRUE,
  labels = clintrial_labels
)
scenario1

# Compare treatment effect across models
attr(scenario1, "coefficients")
```

## **Scenario 2:** Variable Selection Validation

Compare automated versus theory-driven selection:

```{r}
# Identify candidates via screening
screening <- uniscreen(
  data = clintrial,
  outcome = "surgery",
  predictors = c("age", "sex", "bmi", "smoking", "diabetes",
                 "hypertension", "stage", "ecog", "treatment"),
  model_type = "glm",
  p_threshold = 0.10
)

# Extract significant predictors
sig_vars <- attr(screening, "significant")

scenario2 <- compfit(
  data = clintrial,
  outcome = "surgery",
  model_list = list(
    "Theory-Driven" = c("age", "sex",
"stage", "treatment"), "Data-Driven" = sig_vars, "Combined" = unique(c("age", "sex", "stage", "treatment", sig_vars)) ), model_type = "glm" ) scenario2 ``` ## **Scenario 3:** Parsimony Assessment Test whether additional predictors meaningfully improve fit: ```{r} scenario3 <- compfit( data = clintrial, outcome = "los_days", model_list = list( "3 Predictors" = c("age", "surgery", "ecog"), "5 Predictors" = c("age", "surgery", "ecog", "stage", "treatment"), "8 Predictors" = c("age", "surgery", "ecog", "stage", "treatment", "sex", "smoking", "diabetes") ), model_type = "lm", labels = clintrial_labels ) scenario3 ``` --- # Exporting Results Comparison tables can be exported to various formats: ```{r, eval = FALSE} # Main comparison table table2docx( table = example1, file = file.path(tempdir(), "Model_Comparison.docx"), caption = "Table 3. Model Comparison Results" ) # Coefficient table table2docx( table = attr(example6, "coefficients"), file = file.path(tempdir(), "Coefficient_Comparison.docx"), caption = "Table S1. Coefficient Estimates Across Models" ) # PDF with landscape orientation for wide tables table2pdf( table = example1, file = file.path(tempdir(), "Model_Comparison.pdf"), caption = "Model Comparison", orientation = "landscape" ) ``` --- # Best Practices ## When to Compare Models 1. **Covariate selection**: Determine appropriate adjustment level 2. **Interaction testing**: Evaluate effect modification 3. **Parsimony**: Balance complexity against fit 4. **Sensitivity analysis**: Compare different specifications ## Interpreting Results 1. **Multiple metrics**: The Composite Model Score (CMS) provides a summary, but examine individual metrics 2. **Practical significance**: Statistical improvement may not translate to meaningful differences 3. **Sample size**: Complex models require larger samples 4. **Convergence**: Non-converged models should be interpreted cautiously ## Limitations 1. 
1. The Composite Model Score (CMS) is a heuristic, not a formal statistical test
2. Comparisons assume models are fit on identical observations
3. Information criteria are most meaningful for nested models
4. Small score differences may not be practically significant

---

# Common Issues

## Non-Convergence

Check convergence status and consider simplifying non-converging models:

```{r, eval = FALSE}
# Check convergence status
comparison[, .(Model, Converged)]

# For non-converging models:
# 1. Reduce complexity
# 2. Check for separation (logistic)
# 3. Examine predictor correlations
```

When encountering non-converging models, consider reducing complexity, examining predictor correlations, and checking for perfect separation (primarily in logistic models).

## Differing Sample Sizes

Models with missing data may have different sample sizes, which complicates comparison:

```{r, eval = FALSE}
# Check sample sizes
comparison[, .(Model, N, Events)]

# Use complete cases for fair comparison
complete_data <- na.omit(data[, relevant_vars, with = FALSE])
```

## Interpreting Close Scores

When scores are similar, prefer parsimony and consider domain knowledge:

```{r, eval = FALSE}
# Examine individual metrics
comparison[, .(Model, `Composite Model Score (CMS)`, AIC, Concordance)]

# Prefer parsimony when scores are close
# Consider interpretability
```

```{r, include = FALSE}
options(old_opts)
```

---

# Further Reading

- [Descriptive Tables](descriptive_tables.html): `desctable()` for baseline characteristics
- [Regression Modeling](regression_modeling.html): `fit()`, `uniscreen()`, and `fullfit()`
- [Table Export](table_export.html): Export to PDF, Word, and other formats
- [Forest Plots](forest_plots.html): Visualization of regression results
- [Multivariate Regression](multivariate_regression.html): `multifit()` for multi-outcome analysis
- [Advanced Workflows](advanced_workflows.html): Interactions and mixed-effects models