---
title: "Using Custom Outcome Models in gfoRmula"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using Custom Outcome Models in gfoRmula}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
urlcolor: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

By default, the \verb|gfoRmula| package uses a pooled logistic regression model for survival outcomes, a logistic regression model for binary end-of-follow-up outcomes, and a linear regression model for continuous end-of-follow-up outcomes. Starting from version 1.1.0, the \verb|gfoRmula| package allows users to supply their own outcome models. This document describes how to specify such custom outcome models.

This document assumes that readers have read the long-form package documentation of [McGrath et al. (2020)](https://doi.org/10.1016/j.patter.2020.100008).

## Specifying custom outcome models

To specify a custom outcome model, users must provide two functions to the \verb|gformula| function: one that fits the outcome model, passed through the parameter \verb|ymodel_fit_custom|, and one that obtains estimates from the fitted model, passed through the parameter \verb|ymodel_predict_custom|.

The function for fitting the outcome model must take the parameters \verb|ymodel| and \verb|obs_data|. Below, we illustrate a function for fitting an outcome model using a random forest. This code uses the \verb|randomForest| package.

```{r}
ymodel_fit_custom <- function(ymodel, obs_data){
  return(randomForest::randomForest(formula = ymodel, data = obs_data))
}
```

The function for obtaining estimates from the model must take the parameters \verb|fit| (the fitted outcome model) and \verb|newdf| (a \verb|data.table| containing the simulated dataset at time $t$). For each row of \verb|newdf|, this function must return the estimated probability of the outcome for survival and binary end-of-follow-up outcomes, or the estimated mean of the outcome for continuous end-of-follow-up outcomes.
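
As a minimal sketch of this contract for a binary end-of-follow-up outcome, the pair of functions below fits a logistic regression with \verb|glm| and returns one predicted probability per row. The names \verb|ymodel_fit_glm| and \verb|ymodel_predict_glm| are hypothetical, and the toy data are simulated purely for illustration. The sketch only mirrors the default model type for binary outcomes and is not meant to describe how the package fits that model internally.

```{r}
# Hypothetical helper names; they are not part of the gfoRmula package.
# Fit a logistic regression for a binary outcome coded 0/1.
ymodel_fit_glm <- function(ymodel, obs_data){
  return(stats::glm(formula = ymodel, family = stats::binomial(),
                    data = obs_data))
}

# Return one estimated outcome probability per row of newdf.
ymodel_predict_glm <- function(fit, newdf){
  return(as.numeric(stats::predict(fit, newdata = newdf, type = 'response')))
}

# Toy check on simulated data (illustrative only).
set.seed(1)
toy_data <- data.frame(A = rbinom(200, 1, 0.5), L = rnorm(200))
toy_data$Y <- rbinom(200, 1, plogis(-0.5 + toy_data$A + 0.5 * toy_data$L))
toy_fit <- ymodel_fit_glm(Y ~ A + L, toy_data)
toy_probs <- ymodel_predict_glm(toy_fit, toy_data)

# The predictions should be a numeric vector of probabilities,
# with one entry per row of the data passed in.
stopifnot(length(toy_probs) == nrow(toy_data))
range(toy_probs)
```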
Continuing with the random forest example, the code below obtains the estimated outcome mean for a continuous end-of-follow-up outcome. This code relies on the \verb|predict.randomForest| method in the \verb|randomForest| package.

```{r}
ymodel_predict_custom <- function(fit, newdf){
  return(as.numeric(predict(object = fit, newdata = newdf)))
}
```

## Example

We perform an analysis similar to Example 3 in [McGrath et al. (2020)](https://doi.org/10.1016/j.patter.2020.100008), except that we use the custom outcome model from the previous section.

```{r, echo=FALSE}
library('gfoRmula')
library('data.table')
```

```{r}
library('Hmisc')  # provides rcspline.eval, used in the covariate model for L1

id <- 'id'
time_name <- 't0'
covnames <- c('L1', 'L2', 'A')
outcome_name <- 'Y'
outcome_type <- 'continuous_eof'
covtypes <- c('categorical', 'normal', 'binary')
histories <- c(lagged)
histvars <- list(c('A', 'L1', 'L2'))
covparams <- list(covmodels = c(L1 ~ lag1_A + lag1_L1 + L3 + t0 +
                                  rcspline.eval(lag1_L2, knots = c(-1, 0, 1)),
                                L2 ~ lag1_A + L1 + lag1_L1 + lag1_L2 + L3 + t0,
                                A ~ lag1_A + L1 + L2 + lag1_L1 + lag1_L2 + L3 + t0))
ymodel <- Y ~ A + L1 + L2 + lag1_A + lag1_L1 + lag1_L2 + L3

# Static interventions on A over 7 time points
intervention1.A <- list(static, rep(0, 7))
intervention2.A <- list(static, rep(1, 7))
int_descript <- c('Never treat', 'Always treat')
nsimul <- 10000

# Pass the custom fitting and prediction functions defined above
gform_cont_eof <- gformula(obs_data = continuous_eofdata,
                           id = id, time_name = time_name,
                           covnames = covnames,
                           outcome_name = outcome_name,
                           outcome_type = outcome_type,
                           covtypes = covtypes,
                           covparams = covparams,
                           ymodel = ymodel,
                           ymodel_fit_custom = ymodel_fit_custom,
                           ymodel_predict_custom = ymodel_predict_custom,
                           intervention1.A = intervention1.A,
                           intervention2.A = intervention2.A,
                           int_descript = int_descript,
                           histories = histories, histvars = histvars,
                           basecovs = c("L3"), nsimul = nsimul,
                           seed = 1234)
gform_cont_eof
```

## References

McGrath S, Lin V, Zhang Z, Petito LC, Logan RW, Hernán MA, Young JG.
gfoRmula: an R package for estimating the effects of sustained treatment strategies via the parametric g-formula. Patterns. 2020 Jun 12;1(3).