---
title: "3. Ensemble vs. Best Model Selection"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{3. Ensemble vs. Best Model Selection}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

## Introduction

In the theoretical framework of Super Learning, there are two distinct ways to use the cross-validated risks of your base algorithms:

1. **The Ensemble Super Learner:** Calculates a weighted average (convex combination) of the base learners. This "soft selection" smooths out variance, integrates different feature spaces, and generally yields the lowest finite-sample prediction error.
2. **The "Best" Model Selector:** Identifies the single algorithm with the lowest cross-validated risk and assigns it a weight of `1.0` (all others get `0.0`). This is a "hard selection" or "winner-take-all" approach.

While the ensemble is asymptotically optimal, selecting the single best model is useful when interpretability is strictly tied to one specific algorithm family. Instead of manually cherry-picking the best model, which introduces researcher bias and invalidates post-selection inference, `SuperSurv` automates the selection using rigorous internal cross-validation.

You can toggle between these two paradigms with the `selection` argument.

## 1. Prepare the Data

We load the `metabric` dataset and define our evaluation time grid exactly as in the previous tutorials.
```{r setup, message=FALSE, warning=FALSE}
library(SuperSurv)
library(survival)

data("metabric", package = "SuperSurv")
set.seed(42)

# Standard 70/30 train/test split
train_idx <- sample(seq_len(nrow(metabric)), floor(0.7 * nrow(metabric)))
train <- metabric[train_idx, ]
test  <- metabric[-train_idx, ]

X_tr <- train[, grep("^x", names(metabric))]
X_te <- test[, grep("^x", names(metabric))]

new.times <- seq(50, 200, by = 25)

# Define a diverse library of base learners
my_library <- c("surv.coxph", "surv.weibull", "surv.rpart")
```

## 2. Fit Both Super Learners

We fit two separate `SuperSurv` models. The first uses the default ensemble approach (`selection = "ensemble"`); the second uses the winner-take-all approach (`selection = "best"`).

```{r fit-models, results='hide', message=FALSE, warning=FALSE}
# 1. The ensemble Super Learner (weighted average)
fit_ensemble <- SuperSurv(
  time = train$duration,
  event = train$event,
  X = X_tr,
  newdata = X_te,
  new.times = new.times,
  event.library = my_library,
  cens.library = my_library,
  control = list(saveFitLibrary = TRUE),
  verbose = FALSE,
  selection = "ensemble",  # <-- calculates fractional weights
  nFolds = 3
)

# 2. The 'best' Super Learner (winner-take-all)
fit_best <- SuperSurv(
  time = train$duration,
  event = train$event,
  X = X_tr,
  newdata = X_te,
  new.times = new.times,
  event.library = my_library,
  cens.library = my_library,
  control = list(saveFitLibrary = TRUE),
  verbose = FALSE,
  selection = "best",  # <-- selects the single best model
  nFolds = 3
)
```

## 3. Inspecting the Weights

The difference between the two methodologies is immediately visible in the meta-learner coefficients (`event.coef`).
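Before printing them, it helps to know what shape to expect. In both modes the event-model weights form a convex combination: nonnegative and summing to one. Under `selection = "best"`, exactly one weight equals `1`. The helpers below make that distinction explicit on toy weight vectors; they are purely illustrative and not part of `SuperSurv`.

```{r weight-shapes}
# Illustrative helpers (not part of SuperSurv): test whether a weight
# vector is a valid convex combination, and whether it is one-hot.
is_convex  <- function(w, tol = 1e-8) all(w >= -tol) && abs(sum(w) - 1) < tol
is_one_hot <- function(w, tol = 1e-8) is_convex(w, tol) && sum(w > tol) == 1

# Toy weight vectors mimicking the two selection modes:
w_ensemble <- c(surv.coxph = 0.55, surv.weibull = 0.30, surv.rpart = 0.15)
w_best     <- c(surv.coxph = 1.00, surv.weibull = 0.00, surv.rpart = 0.00)

is_convex(w_ensemble)   # TRUE: fractional weights summing to one
is_one_hot(w_ensemble)  # FALSE: weight is spread across learners
is_one_hot(w_best)      # TRUE: winner-take-all
```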
```{r inspect-weights}
cat("\n--- ENSEMBLE WEIGHTS (selection = 'ensemble') ---\n")
print(round(fit_ensemble$event.coef, 4))

cat("\n--- BEST MODEL WEIGHTS (selection = 'best') ---\n")
print(round(fit_best$event.coef, 4))
```

**Interpretation:** The ensemble Super Learner distributes weight across multiple models to minimize the overall loss function. The "best" Super Learner simply looks at the cross-validated risks, identifies the champion, and gives it 100% of the weight.

## 4. Evaluate and Compare Performance

A common question in clinical research is: *"Does the complexity of the ensemble actually perform better than simply picking the best single model?"* We can answer this directly with `eval_summary()`.

```{r evaluate-comparison}
# Evaluate the ensemble
cat("Performance of the ENSEMBLE Super Learner:\n")
eval_summary(fit_ensemble,
             newdata = X_te,
             time = test$duration,
             event = test$event,
             eval_times = new.times)

cat("\nPerformance of the BEST MODEL Super Learner:\n")
eval_summary(fit_best,
             newdata = X_te,
             time = test$duration,
             event = test$event,
             eval_times = new.times)
```

By comparing the resulting Brier scores and C-indices, you can empirically judge whether the "soft selection" of the ensemble is worth its extra complexity for your specific dataset, or whether the "hard selection" of a single model is sufficient.
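As a final piece of intuition for why "soft selection" often wins, the variance-smoothing effect can be demonstrated outside survival analysis entirely. The toy simulation below uses only base R (nothing here calls `SuperSurv`): two equally good but noisy predictors are averaged, and because their errors are independent they partially cancel, cutting the mean squared error roughly in half relative to either predictor alone.

```{r toy-variance-smoothing}
set.seed(123)
n <- 10000
truth    <- rnorm(n)                  # the quantity we want to predict
pred_a   <- truth + rnorm(n, sd = 1)  # noisy model A
pred_b   <- truth + rnorm(n, sd = 1)  # noisy model B (independent errors)
pred_avg <- (pred_a + pred_b) / 2     # the "ensemble"

mse <- function(p) mean((p - truth)^2)
round(c(A = mse(pred_a), B = mse(pred_b), ensemble = mse(pred_avg)), 3)
```

Picking the single best model corresponds to reporting `mse(pred_a)` or `mse(pred_b)`, whichever is smaller on validation data; the averaged predictor beats both. The same mechanism, applied to survival predictions and weighted by cross-validated risk, is what `selection = "ensemble"` exploits.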