---
title: "Customization"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Customization}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

`crosstable` provides a high level of customization. The available options may not be intuitive at first, but they allow fine control over how summaries and effects are computed and displayed.

Before exploring these options, we load the package and set a few convenient defaults.

```{r setup}
library(crosstable)
crosstable_options(compact=TRUE, keep_id=TRUE)
```

Customization in `crosstable` mainly happens at three levels:

- how numerical variables are summarized (`funs`)
- how group effects are computed (`effect_args`)
- how statistical tests are computed (`test_args`)

Summary functions (`funs`) describe each group separately, whereas `effect_args` and `test_args` control how groups are compared. `effect_args` controls the estimated effect size and its confidence interval, whereas `test_args` controls the hypothesis test and its p-value.

## Numeric variables: the `funs` argument

Numeric variables are summarized using a set of summary functions. By default, `crosstable` reports: `min/max`, `median/IQR`, `mean/sd` and `number of observations/missing`. These summaries are generated by the internal function `cross_summary()`.

You can customize these summaries to control how numeric variables are reported.

In practice, you will often want all numeric variables to be summarized in the same way. For this reason, it is convenient to define `funs` globally with `crosstable_options()`, although you can also pass it directly to `crosstable()`.

The first possibility is to use a named list of functions. If a function returns multiple values (as with `quantile()`), the names of the returned statistics are automatically combined.

```{r numerics}
crosstable_options(funs=c("mean"=mean, "std dev"=sd, qtl=~quantile(.x, probs=c(0.25, 0.75))))
crosstable(mtcars2, mpg) %>%
  as_flextable()
```

Another option is to provide a custom summary function that returns several statistics at once. In this case, you should give the function a blank name (`" "`, a single space) so that its internal labels are used directly. The chunk below shows the result first without and then with the blank name.

```{r numerics2}
f = function(x) c("Mean (SD)"=meansd(x), "Med [IQR]"=mediqr(x))
crosstable(mtcars2, wt, funs=f) %>%
  as_flextable()
crosstable(mtcars2, wt, funs=c(" "=f)) %>%
  as_flextable()
```

To help write such functions, `crosstable` exports several convenience functions: `meansd()`, `meanCI()`, `mediqr()`, `minmax()`, and `nna()`.

## Calculating effects

When `effect = TRUE`, `crosstable` computes an effect comparing the levels of the `by` variable.

Effect calculation is controlled by the `effect_args` argument, which defaults to the result of `crosstable_effect_args()`. The function used for the actual calculation depends on the type of variable being analyzed:

- `effect_summarize` for numeric variables
- `effect_tabular` for categorical variables
- `effect_survival` for survival outcomes

By default, `effect_tabular` is set to `effect_odds_ratio()`, which computes an odds ratio for categorical variables.

```{r effect-default}
mtcars2 %>%
  crosstable(am, by=vs, effect=TRUE) %>%
  as_flextable()
```

Suppose that instead of an odds ratio, you want to compute a **difference in proportions**.

To define a custom categorical effect, you need to write a function that takes:

- `x`: the variable being summarized
- `by`: the grouping variable
- `conf.level`: the confidence level

and returns a list with the following elements:

- `summary`: a data frame containing the effect label, estimate, and confidence interval. It can contain several rows if `x` has more than two levels.
- `effect.type`: the name of the effect being computed
- `ref`: the reference level or comparison label

The following example computes a difference in proportions and uses `prop.test()` to derive the confidence interval.

```{r effect-custom}
ct_effect_prop_diff = function(x, by, conf.level){
  tb = table(x, by)
  test = prop.test(tb, conf.level=conf.level)
  nms = dimnames(tb)[["x"]]
  # prop.test() gives the CI for (estimate 1 - estimate 2), so the point
  # estimate uses the same direction to stay inside its interval
  effect = unname(test$estimate[1] - test$estimate[2])
  effect.type = "Difference of proportions"
  reference = glue::glue(", {nms[1]} vs {nms[2]}")
  summary = data.frame(name = "Proportion difference", effect,
                       ci_inf = test$conf.int[1],
                       ci_sup = test$conf.int[2])
  list(summary = summary, ref = reference, effect.type = effect.type)
}

my_effect_args = crosstable_effect_args(effect_tabular=ct_effect_prop_diff)
# crosstable_options(effect_args=my_effect_args) #set globally if desired
mtcars2 %>%
  crosstable(am, by=vs, effect=TRUE, effect_args=my_effect_args) %>%
  as_flextable()
```

The same general approach can be used to define custom effects for numeric and survival variables.

Several alternative effect functions are already implemented in `crosstable`. See `?effect_summary`, `?effect_tabular`, and `?effect_survival` for available options.

## Calculating tests

Customizing statistical tests is even simpler. A custom test function takes `x` and `by` and only needs to return a list with two elements:

- `p.value`: the p-value
- `method`: the label displayed for the test

For example, the following function replaces the default numeric test with a linear model. With two groups, its ANOVA F-test is equivalent to a Student's t-test assuming equal variances; the point here is to show how custom testing logic plugs into `crosstable`.

```{r test-custom}
ct_test_lm = function(x, by){
  fit = lm(x ~ by)
  pval = anova(fit)$`Pr(>F)`[1]
  list(p.value = pval, method = "Linear model ANOVA")
}

my_test_args = crosstable_test_args(test_summarize=ct_test_lm)
# crosstable_options(test_args=my_test_args) #set globally if desired
mtcars2 %>%
  crosstable(mpg, by=vs, test=TRUE, test_args=my_test_args) %>%
  as_flextable()
```
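
The same pattern should carry over to categorical variables. As a minimal sketch, assuming `crosstable_test_args()` accepts a `test_tabular` argument analogous to `test_summarize`, the hypothetical function `ct_test_chisq()` below (not part of `crosstable`) wraps `chisq.test()` without continuity correction. It is checked on base `mtcars` so that it runs on its own; the wiring into `crosstable()` appears in comments only.

```r
# Hypothetical helper, not part of crosstable: chi-squared test without
# continuity correction, returning the two elements a test function needs.
ct_test_chisq = function(x, by){
  tb = table(x, by)
  test = chisq.test(tb, correct=FALSE)
  list(p.value = test$p.value,
       method = "Chi-squared test (no continuity correction)")
}

# Standalone check on base R data, independent of crosstable
res = ct_test_chisq(mtcars$am, mtcars$vs)
res$method
res$p.value

# Assumed wiring, mirroring the numeric example above:
# my_test_args = crosstable_test_args(test_tabular=ct_test_chisq)
# mtcars2 %>%
#   crosstable(am, by=vs, test=TRUE, test_args=my_test_args) %>%
#   as_flextable()
```

Returning only `p.value` and `method` keeps the function independent of how the test is displayed, so the same helper can be reused outside `crosstable`.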