The GammaFrailty package implements univariate gamma frailty regression models for survival data under six different baseline distributions:
| Model | Parameter(s) | Reference |
|---|---|---|
| Arvind | alpha | Pandey et al. (2024) |
| Lindley | lambda | Lindley (1958) |
| Linear Failure Rate (LFR) | a, b | Bain (1974) |
| Power Xgamma (PXG) | alpha, beta | Tyagi et al. (2022) |
| Modified Topp-Leone (MTL) | alpha | Singh et al. (2025) |
| Power Failure Rate (PFR) | a, k | Mugdadi (2005) |
Let \(T_j\) be the survival time for the \(j\)-th subject and \(Z\) be an unobservable frailty variable drawn from a Gamma distribution with mean 1 and variance \(\theta\). Conditional on \(Z = z\), the hazard is
\[h(t \mid z) = z \, h_0(t) \, e^{\mathbf{X}_j \boldsymbol{\beta}}.\]
After marginalising over \(Z\), the unconditional survival function becomes
\[S(t_j) = \left[1 + \theta \, \eta_j \, H_0(t_j)\right]^{-1/\theta},\]
where \(\eta_j = e^{\mathbf{X}_j \boldsymbol{\beta}}\) and \(H_0(t)\) is the baseline cumulative hazard. When no covariates are present, \(\eta_j = 1\).
# Install from local source
install.packages(
"path/to/GammaFrailty",
repos = NULL, type = "source"
)Each baseline distribution has its own r_* function, and the full frailty model generator is r_gamma_frailty().
library(GammaFrailty)
set.seed(42)
# Arvind (alpha = 0.5)
x_arvind <- r_arvind(200, alpha = 0.5)
# Lindley (lambda = 1.5)
x_lindley <- r_lindley(200, lambda = 1.5)
# LFR (a = 0.5, b = 0.2)
x_lfr <- r_lfr(200, a = 0.5, b = 0.2)
# Power Xgamma (alpha = 1, beta = 0.8)
x_pxg <- r_pxg(100, alpha = 1.0, beta = 0.8)
# Modified Topp-Leone (alpha = 2)
x_mtl <- r_mtl(100, alpha = 2.0)
# Power Failure Rate (a = 0.5, k = 1)
x_pfr <- r_pfr(200, a = 0.5, k = 1.0)
summary(x_arvind)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.0004777 0.4437149 0.9271125 0.9702745 1.3520352 2.7002231plot_baseline("arvind", par = 0.5, t_range = c(0.01, 3))r_gamma_frailty() supports eight censoring mechanisms.
set.seed(1)
# Right-censored data with one covariate
dat_right <- r_gamma_frailty(
n = 150,
baseline = "arvind",
par = 0.5,
x = matrix(rnorm(150), ncol = 1),
beta = 0.5,
theta = 0.3,
cen_type = "right",
cen_rate = 0.25
)
colnames(dat_right)[4] <- "X1"
cat("Censoring rate:", mean(dat_right$status == 0), "\n")
#> Censoring rate: 0.2666667
head(dat_right)
#> time time2 status X1
#> 1 1.0266130 NA 0 -0.6264538
#> 2 0.1640531 NA 1 0.1836433
#> 3 1.1684013 NA 1 -0.8356286
#> 4 0.1800794 NA 0 1.5952808
#> 5 0.5857507 NA 1 0.3295078
#> 6 2.9474645 NA 0 -0.8204684set.seed(2)
# Interval-censored, no covariates
dat_int <- r_gamma_frailty(
n = 100,
baseline = "lfr",
par = c(0.5, 0.2),
theta = 0.4,
cen_type = "interval",
int_width = 0.4
)
table(dat_int$status)
#>
#> 0 1 3
#> 5 65 30set.seed(3)
dat_complete <- r_gamma_frailty(120, "arvind", par = 0.5, theta = 0.3,
cen_type = "none")
fit0 <- fit_gamma_frailty(dat_complete$time, dat_complete$status,
baseline = "arvind")
summary(fit0)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Arvind
#> Sample size : 120
#> No. of covariates : 0
#> No. of baseline pars : 1
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> alpha 0.5239 0.08116 6.456 2.494e-09 0.38673 0.7098 ***
#> theta 0.1869 0.11192 1.670 0.0976 0.05779 0.6044 .
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.1869 SE = 0.1119
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -114.7298
#> AIC : 233.4596
#> BIC : 239.0346fit1 <- fit_gamma_frailty(dat_right$time, dat_right$status,
x = dat_right[, "X1", drop = FALSE],
baseline = "arvind")
summary(fit1)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Arvind
#> Sample size : 150
#> No. of covariates : 1
#> No. of baseline pars : 1
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> alpha 0.4355 0.06959 6.259 4.024e-09 0.31843 0.5957 ***
#> X1 0.3633 0.13627 2.666 0.008536 0.09620 0.6304 **
#> theta 0.1996 0.15315 1.303 0.194491 0.04437 0.8980
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.1996 SE = 0.1531
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -121.3515
#> AIC : 248.7030
#> BIC : 257.7349
#>
#> ----- Overall F-test (covariates) ------------------------
#> F-stat = 7.1073, p-value = 0.008536library(survival)
fit_formula <- gamma_frailty(
Surv(time, status) ~ X1,
data = dat_right,
baseline = "arvind"
)
summary(fit_formula)
#> Call:
#> gamma_frailty(formula = Surv(time, status) ~ X1, data = dat_right,
#> baseline = "arvind")
#>
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Arvind
#> Sample size : 150
#> No. of covariates : 1
#> No. of baseline pars : 1
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> alpha 0.4355 0.06959 6.259 4.024e-09 0.31843 0.5957 ***
#> X1 0.3633 0.13627 2.666 0.008536 0.09620 0.6304 **
#> theta 0.1996 0.15315 1.303 0.194491 0.04437 0.8980
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.1996 SE = 0.1531
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -121.3515
#> AIC : 248.7030
#> BIC : 257.7349
#>
#> ----- Overall F-test (covariates) ------------------------
#> F-stat = 7.1073, p-value = 0.008536# Coefficients with CI
coef(fit1)
#> alpha X1 theta
#> 0.4355310 0.3632854 0.1996087
confint(fit1, level = 0.95)
#> 2.5 % 97.5 %
#> alpha 0.31843347 0.5956888
#> X1 0.09620445 0.6303664
#> theta 0.04437056 0.8979744
# Log-likelihood and information criteria
logLik(fit1)
#> 'log Lik.' -121.3515 (df=3)
AIC(fit1)
#> [1] 248.703
cat("BIC:", fit1$BIC, "\n")
#> BIC: 257.7349
cat("Frailty variance (theta):", fit1$theta,
" SE:", fit1$theta_se, "\n")
#> Frailty variance (theta): 0.1996087 SE: 0.1531498# Generate data with 2 covariates to illustrate VIF
set.seed(4)
dat2 <- r_gamma_frailty(150, "arvind", par = 0.5,
x = matrix(rnorm(300), ncol = 2),
beta = c(0.4, -0.3), theta = 0.3,
cen_type = "right")
fit2 <- fit_gamma_frailty(dat2$time, dat2$status,
x = dat2[, 4:5], baseline = "arvind")
cat("VIF:\n"); print(fit2$VIF)
#> VIF:
#> X1 X2
#> 1.014884 1.014884
cat("Tolerance:\n"); print(fit2$Tolerance)
#> Tolerance:
#> X1 X2
#> 0.9853344 0.9853344res <- residuals_frailty(fit1)
cat("MSE:", round(res$MSE, 4),
" RMSE:", round(res$RMSE, 4),
" MAE:", round(res$MAE, 4), "\n")
#> MSE: 0.0144 RMSE: 0.1199 MAE: 0.0953
cat("R-squared:", round(res$R_square, 4),
" Adj R-sq:", round(res$Adj_R_square, 4), "\n")
#> R-squared: 0.9238 Adj R-sq: 0.9233
cat("KS test p-value:", round(res$KS_pvalue, 4), "\n")
#> KS test p-value: 5e-04dt <- diagnostics_table(fit1)
head(dt[, c("time","status","deviance","leverage","cooks_dist","DFFITS")])
#> time status deviance leverage cooks_dist DFFITS
#> 1 1.0266130 0 -1.1135094 0.0032198000 0.0040180734 -0.06338827
#> 2 0.1640531 1 1.7572670 0.0002766949 0.0008549032 0.02923873
#> 3 1.1684013 1 0.3463552 0.0057289812 0.0006952026 0.02636669
#> 4 0.1800794 0 -0.5612617 0.0208797391 0.0068609440 -0.08283082
#> 5 0.5857507 1 0.7812591 0.0008908039 0.0005446863 0.02333851
#> 6 2.9474645 0 -2.2837808 0.0055229930 0.0291268709 -0.17066596oldpar <- par(mfrow = c(2, 2))
plot_residuals_fitted(fit1)
plot_qq_residuals(fit1)
plot_scale_location(fit1)
plot_residuals_leverage(fit1)par(oldpar)plot_survival_km(fit1)plot_coef_forest(fit1)plot_leverage(fit1)plot_dfbetas(fit1)# Survival at specific time points
survival_at(fit1, times = c(0.5, 1.0, 2.0)) |> head(6)
#> subject time survival
#> 1 1 0.5 0.7882833
#> 2 2 0.5 0.7284945
#> 3 3 0.5 0.8018229
#> 4 4 0.5 0.5955107
#> 5 5 0.5 0.7164460
#> 6 6 0.5 0.8008667# Median survival times
med <- predict_frailty(fit1, type = "median")
summary(med)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.5060 0.8209 0.9654 0.9678 1.1008 1.6110# Risk scores (eta = exp(X * beta))
risk <- predict_frailty(fit1, type = "risk")
summary(risk)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.4473 0.8085 0.9822 1.0640 1.2408 2.3928# Population-averaged (marginal) survival
t_grid <- seq(0.1, 3, by = 0.2)
ms <- predict_frailty(fit1, newtime = t_grid, type = "marginal")
plot(t_grid, ms, type = "l", lwd = 2, col = "steelblue",
xlab = "Time", ylab = "Marginal S(t)", main = "Marginal Survival")# Expected survival drop in [0.5, 1.5]
es <- predict_frailty(fit1, type = "expected", window = c(0.5, 1.5))
summary(es)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.3373 0.4439 0.4678 0.4574 0.4814 0.4869# Forecast beyond training range
fc <- forecast_frailty(fit1, horizon = 6, n_grid = 100)
plot(fc$time, fc$survival[1, ], type = "l", lwd = 2, col = "firebrick",
xlab = "Time", ylab = "S(t)", main = "Forecast: Subject 1")set.seed(5)
dat_cmp <- r_gamma_frailty(120, "arvind", par = 0.5, theta = 0.3,
cen_type = "right", cen_rate = 0.2)
fits <- list(
Arvind = fit_gamma_frailty(dat_cmp$time, dat_cmp$status, baseline = "arvind"),
Lindley = fit_gamma_frailty(dat_cmp$time, dat_cmp$status, baseline = "lindley"),
LFR = fit_gamma_frailty(dat_cmp$time, dat_cmp$status, baseline = "lfr"),
PFR = fit_gamma_frailty(dat_cmp$time, dat_cmp$status, baseline = "pfr")
)
compare_models(fits)
#> Model Baseline n K logLik AIC BIC theta
#> 1 Arvind Arvind 120 2 -110.6038 225.2076 230.7825 0.3111
#> 2 Lindley Lindley 120 2 -111.6868 227.3736 232.9485 0.0000
#> 3 LFR Linear Failure Rate 120 3 -110.4537 226.9073 235.2698 0.4076
#> 4 PFR Power Failure Rate 120 3 -109.4395 224.8790 233.2415 0.0265
#> delta_AIC AIC_weight
#> 1 0.3286 0.3396
#> 2 2.4946 0.1150
#> 3 2.0283 0.1452
#> 4 0.0000 0.4002cv_res <- cv_frailty(dat_cmp$time, dat_cmp$status,
baseline = "arvind", k = 5)
cat("Mean OOS log-lik:", round(cv_res$mean_oos_loglik, 3), "\n")
cat("Mean OOS RMSE: ", round(cv_res$mean_oos_rmse, 3), "\n")set.seed(6)
dat_left <- r_gamma_frailty(100, "pfr", par = c(0.5, 1), theta = 0.35,
cen_type = "left", left_threshold = 0.3)
table(dat_left$status)
#>
#> 1
#> 100
fit_left <- fit_gamma_frailty(dat_left$time, dat_left$status,
baseline = "pfr")
summary(fit_left)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Power Failure Rate
#> Sample size : 100
#> No. of covariates : 0
#> No. of baseline pars : 2
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> a 0.4691 0.1088 4.309 3.93e-05 0.2976 0.7392 ***
#> k 1.1820 0.4107 2.878 0.004926 0.5982 2.3355 **
#> theta 0.6224 0.4167 1.494 0.138509 0.1676 2.3117
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.6224 SE = 0.4167
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -166.1406
#> AIC : 338.2813
#> BIC : 346.0968fit_int <- fit_gamma_frailty(dat_int$time, dat_int$status,
time2 = dat_int$time2,
baseline = "lfr")
summary(fit_int)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Linear Failure Rate
#> Sample size : 100
#> No. of covariates : 0
#> No. of baseline pars : 2
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> a 0.571170 0.09355 6.10562 2.114e-08 4.143e-01 7.874e-01 ***
#> b 0.004588 0.09047 0.05071 0.9597 7.512e-20 2.802e+14
#> theta 0.043597 0.33317 0.13085 0.8962 1.363e-08 1.395e+05
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.0436 SE = 0.3332
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -178.4396
#> AIC : 362.8793
#> BIC : 370.6948set.seed(8)
dat_t1 <- r_gamma_frailty(100, "arvind", par = 0.5, theta = 0.3,
cen_type = "type1", cen_time = 2.0)
table(dat_t1$status) # 0 = censored at time 2.0
#>
#> 0 1
#> 16 84
fit_t1 <- fit_gamma_frailty(dat_t1$time, dat_t1$status, baseline = "arvind")
summary(fit_t1)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Arvind
#> Sample size : 100
#> No. of covariates : 0
#> No. of baseline pars : 1
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> alpha 0.5358 0.1065 5.030 2.22e-06 0.3629 0.7911 ***
#> theta 0.4131 0.2229 1.853 0.06685 0.1435 1.1894 .
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.4131 SE = 0.2229
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -93.9247
#> AIC : 191.8494
#> BIC : 197.0598set.seed(9)
dat_t2 <- r_gamma_frailty(100, "pfr", par = c(0.5, 1), theta = 0.3,
cen_type = "type2", r_failures = 70L)
table(dat_t2$status) # exactly 70 status=1 events
#>
#> 0 1
#> 30 70
fit_t2 <- fit_gamma_frailty(dat_t2$time, dat_t2$status, baseline = "pfr")
summary(fit_t2)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Power Failure Rate
#> Sample size : 100
#> No. of covariates : 0
#> No. of baseline pars : 2
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> a 0.6541 0.2292 2.8535 0.005287 0.32910 1.300 **
#> k 1.2254 0.4607 2.6596 0.009153 0.58645 2.561 **
#> theta 0.6738 0.7601 0.8864 0.377581 0.07383 6.149
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.6738 SE = 0.7601
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -112.5174
#> AIC : 231.0349
#> BIC : 238.8504set.seed(10)
dat_pt1 <- r_gamma_frailty(120, "lindley", par = 1.5, theta = 0.4,
cen_type = "progressive_type1",
prog_times = c(0.5, 1.0, 1.5),
prog_scheme = c(5L, 5L, 5L))
table(dat_pt1$status) # 0 = withdrawn at tau_j, 1 = exact failure
#>
#> 0 1
#> 15 105
fit_pt1 <- fit_gamma_frailty(dat_pt1$time, dat_pt1$status,
baseline = "lindley")
summary(fit_pt1)
#> =========================================================
#> Gamma Frailty Regression Model
#> =========================================================
#> Baseline Distribution : Lindley
#> Sample size : 120
#> No. of covariates : 0
#> No. of baseline pars : 1
#>
#> ----- Parameter Estimates --------------------------------
#> Estimate StdErr t_stat p_value CI_lower CI_upper Signif
#> lambda 1.5467 0.1589 9.735 <2e-16 1.26467 1.8917 ***
#> theta 0.1154 0.1109 1.040 0.3003 0.01753 0.7592
#>
#> ----- Frailty Variance -----------------------------------
#> theta (frailty var) = 0.1154 SE = 0.1109
#>
#> ----- Model Fit ------------------------------------------
#> Log-Likelihood : -104.5526
#> AIC : 213.1053
#> BIC : 218.6803set.seed(99)
sim_res <- simulation_study(
n_sim = 200,
n_vec = c(50, 100, 200),
baseline = "arvind",
par = 0.5,
beta = 0.4,
theta = 0.3,
cen_type = "right",
cen_rate = 0.2,
verbose = TRUE
)Lindley, D. V. (1958). Fiducial distributions and Bayes’ theorem. Journal of the Royal Statistical Society. Series B (Methodological), 102-107.
Mugdadi, A. R. (2005). The least squares type estimation of the parameters in the power hazard function. Applied mathematics and computation, 169(2), 737-748.
Bain, L. J. (1974). Analysis for the linear failure-rate life-testing distribution. Technometrics, 16(4), 551-559.
Singh, B., Tyagi, S., Singh, R. P., & Tyagi, A. (2025). Modified Topp-Leone distribution: properties, classical and Bayesian estimation with application to COVID-19 and reliability data. Thailand Statistician, 23(1), 72-96.
Pandey, A., Singh, R. P., Tyagi, S., & Tyagi, A. (2024). Modelling climate, COVID-19, and reliability data: A new continuous lifetime model under different methods of estimation. Stat. Appl., 22(2).
Tyagi, S., Kumar, S., Pandey, A., Saha, M., & Bagariya, H. Power xgamma distribution: Properties and its applications to cancer data. Int J Stat Reliab Eng. 2022; 9(1): 51-60.