--- title: "7. Parametric Survival Models" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{7. Parametric Survival Models} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5 ) ``` ## Introduction While the Cox model is semi-parametric (it leaves the baseline hazard unspecified), fully parametric models assume that survival times follow a specific mathematical distribution, such as the Weibull, Exponential, or Log-Normal distribution. Parametric models are statistically powerful because they produce perfectly smooth survival curves. However, they are highly brittle: if you assume the data follows a Weibull distribution, but the true biological hazard has a completely different shape, the model will be heavily biased. `SuperSurv` acts as a **safety net**. You can include multiple parametric assumptions in your library. If a parametric assumption perfectly matches your data, `SuperSurv` will give it a high weight. If the assumption is wrong, the cross-validation risk will spike, and `SuperSurv` will safely assign it a weight of zero. ## 1. Setup and Library Definition ```{r setup, message=FALSE, warning=FALSE} library(SuperSurv) library(survival) data("metabric", package = "SuperSurv") set.seed(42) train_idx <- sample(1:nrow(metabric), 0.7 * nrow(metabric)) train <- metabric[train_idx, ] test <- metabric[-train_idx, ] X_tr <- train[, grep("^x", names(metabric))] X_te <- test[, grep("^x", names(metabric))] new.times <- seq(50, 200, by = 25) # Define a library covering different parametric assumptions parametric_library <- c("surv.coxph", # Semi-parametric baseline "surv.weibull", # Assumes hazard increases/decreases monotonically "surv.exponential", # Assumes constant hazard over time "surv.lognormal") # Assumes hazard rises then falls ``` ## 2. Fitting the Parametric Ensemble We run the ensemble exactly as before. Internally, `SuperSurv` will fit these Accelerated Failure Time (AFT) models and map their continuous survival predictions onto our discrete `new.times` evaluation grid. ```{r fit-parametric, results='hide', message=FALSE, warning=FALSE} fit_parametric <- SuperSurv( time = train$duration, event = train$event, X = X_tr, newdata = X_te, new.times = new.times, event.library = parametric_library, cens.library = c("surv.coxph"), control = list(saveFitLibrary = TRUE), verbose = FALSE, selection = "ensemble", nFolds = 3 ) ``` ## 3. Evaluating the "Safety Net" Let's look at the cross-validated risks and the final meta-learner weights. ```{r evaluate-parametric} cat("Cross-Validated Risks (Lower is better):\n") print(round(fit_parametric$event.cvRisks, 4)) cat("\nFinal Ensemble Weights:\n") print(round(fit_parametric$event.coef, 4)) ``` ### Interpretation Look closely at the weights assigned to `surv.exponential`. The Exponential distribution assumes that the risk of the event (hazard) is completely constant over time. In real-world cancer datasets like `metabric`, this assumption is almost always false (risk usually increases with time or peaks shortly after surgery). Because the Exponential assumption fits the data poorly, its cross-validated risk will be high, and `SuperSurv` will smartly assign it a weight of $0.00$. By including parametric models in your `SuperSurv` library, you allow the data—not the researcher—to dictate which mathematical distributions are actually appropriate for your patient cohort!