Title: Presentation-Ready Data Summary and Analytic Result Tables
Version: 2.3.0
Description: Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
License: MIT + file LICENSE
URL: https://github.com/ddsjoberg/gtsummary, https://www.danieldsjoberg.com/gtsummary/
BugReports: https://github.com/ddsjoberg/gtsummary/issues
Depends: R (≥ 4.2)
Imports: cards (≥ 0.6.1), cli (≥ 3.6.3), dplyr (≥ 1.1.3), glue (≥ 1.8.0), gt (≥ 0.11.1), lifecycle (≥ 1.0.3), rlang (≥ 1.1.1), tidyr (≥ 1.3.0), vctrs (≥ 0.6.4)
Suggests: aod (≥ 1.3.3), broom (≥ 1.0.5), broom.helpers (≥ 1.20.0), broom.mixed (≥ 0.2.9), car (≥ 3.0-11), cardx (≥ 0.2.5), cmprsk, effectsize (≥ 0.6.0), emmeans (≥ 1.7.3), flextable (≥ 0.8.1), geepack (≥ 1.3.10), ggstats (≥ 0.2.1), huxtable (≥ 5.4.0), insight (≥ 0.15.0), kableExtra (≥ 1.3.4), knitr (≥ 1.37), lme4 (≥ 1.1-31), mice (≥ 3.10.0), nnet, officer, openxlsx, parameters (≥ 0.20.2), parsnip (≥ 0.1.7), rmarkdown, smd (≥ 0.6.6), spelling, survey (≥ 4.2), survival (≥ 3.6-4), testthat (≥ 3.2.0), withr (≥ 2.5.0), workflows (≥ 0.2.4)
VignetteBuilder: knitr
Config/Needs/check: hms
Config/Needs/website: forcats, sandwich, scales
Config/testthat/edition: 3
Config/testthat/parallel: true
Encoding: UTF-8
Language: en-US
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-07-03 17:05:11 UTC; sjobergd
Author: Daniel D. Sjoberg
Maintainer: Daniel D. Sjoberg <danield.sjoberg@gmail.com>
Repository: CRAN
Date/Publication: 2025-07-03 17:50:02 UTC
gtsummary: Presentation-Ready Data Summary and Analytic Result Tables
Description
Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.
Author(s)
Maintainer: Daniel D. Sjoberg danield.sjoberg@gmail.com (ORCID)
Authors:
Joseph Larmarange (ORCID)
Michael Curry (ORCID)
Emily de la Rua (ORCID)
Jessica Lavery (ORCID)
Karissa Whiting (ORCID)
Emily C. Zabor (ORCID)
Other contributors:
Xing Bai [contributor]
Esther Drill (ORCID) [contributor]
Jessica Flynn (ORCID) [contributor]
Margie Hannum (ORCID) [contributor]
Stephanie Lobaugh [contributor]
Shannon Pileggi (ORCID) [contributor]
Amy Tin (ORCID) [contributor]
Gustavo Zapata Wainberg (ORCID) [contributor]
See Also
Useful links:
Report bugs at https://github.com/ddsjoberg/gtsummary/issues
Create gtsummary table
Description
USE as_gtsummary() INSTEAD!
This function ingests a data frame and adds the infrastructure around it to make it a gtsummary object.
Usage
.create_gtsummary_object(table_body, ...)
Arguments
table_body |
( |
... |
other objects that will be added to the gtsummary object list |
Details
The function uses table_body to create a gtsummary object.
Value
gtsummary object
Convert Named List to Table Body
Description
Many arguments in 'gtsummary' accept named lists. This function converts a named list to the .$table_body format expected in scope_table_body().
Usage
.list2tb(x, colname = caller_arg(x))
Arguments
x |
named list |
colname |
string of column name to assign. Default is |
Value
.$table_body
data frame
Examples
type <- list(age = "continuous", response = "dichotomous")
gtsummary:::.list2tb(type, "var_type")
Object Convert Helper
Description
Ahead of a gtsummary object being converted to an output type, each logical expression saved in x$table_styling is converted to a list of row numbers.
Usage
.table_styling_expr_to_row_number(x)
Arguments
x |
a gtsummary object |
Value
a gtsummary object
Examples
tbl <-
trial %>%
tbl_summary(include = c(age, grade)) %>%
.table_styling_expr_to_row_number()
Add CI Column
Description
Add a new column with the confidence intervals for proportions, means, etc.
Usage
add_ci(x, ...)
## S3 method for class 'tbl_summary'
add_ci(
x,
method = list(all_continuous() ~ "t.test", all_categorical() ~ "wilson"),
include = everything(),
statistic = list(all_continuous() ~ "{conf.low}, {conf.high}", all_categorical() ~
"{conf.low}%, {conf.high}%"),
conf.level = 0.95,
style_fun = list(all_continuous() ~ label_style_sigfig(), all_categorical() ~
label_style_sigfig(scale = 100)),
pattern = NULL,
...
)
Arguments
x |
( |
... |
These dots are for future extensions and must be empty. |
method |
( |
include |
( |
statistic |
( |
conf.level |
(scalar |
style_fun |
( |
pattern |
( |
Value
gtsummary table
method argument
Must be one of
- "wilson", "wilson.no.correct" calculated via prop.test(correct = c(TRUE, FALSE)) for categorical variables
- "exact" calculated via stats::binom.test() for categorical variables
- "wald", "wald.no.correct" calculated via cardx::proportion_ci_wald(correct = c(TRUE, FALSE)) for categorical variables
- "agresti.coull" calculated via cardx::proportion_ci_agresti_coull() for categorical variables
- "jeffreys" calculated via cardx::proportion_ci_jeffreys() for categorical variables
- "t.test" calculated via stats::t.test() for continuous variables
- "wilcox.test" calculated via stats::wilcox.test() for continuous variables
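The base R functions named in the method list can be called directly to see what each interval looks like. The following is a minimal sketch using made-up counts and values (assumptions for illustration only); it shows the underlying stats calls and is not the package's internal code.

```r
# Illustrative data (hypothetical, not taken from the trial data set)
n_success <- 30
n_total <- 100
values <- c(4.1, 5.3, 6.0, 5.5, 4.8, 6.2, 5.1, 4.9)

# Wilson score interval, with and without continuity correction
ci_wilson <- prop.test(n_success, n_total, conf.level = 0.95, correct = TRUE)$conf.int
ci_wilson_nc <- prop.test(n_success, n_total, conf.level = 0.95, correct = FALSE)$conf.int

# Exact (Clopper-Pearson) interval
ci_exact <- binom.test(n_success, n_total, conf.level = 0.95)$conf.int

# t-based interval for a mean
ci_mean <- t.test(values, conf.level = 0.95)$conf.int

# Express the proportion limits as percentages, similar in spirit to
# the "{conf.low}%, {conf.high}%" statistic pattern
sprintf("%.0f%%, %.0f%%", 100 * ci_wilson_nc[1], 100 * ci_wilson_nc[2])
```

The continuity-corrected Wilson interval is slightly wider than the uncorrected one, which is the practical difference between "wilson" and "wilson.no.correct".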
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(
missing = "no",
statistic = all_continuous() ~ "{mean} ({sd})",
include = c(marker, response, trt)
) |>
add_ci()
# Example 2 ----------------------------------
trial |>
select(response, grade) |>
tbl_summary(
statistic = all_categorical() ~ "{p}%",
missing = "no",
include = c(response, grade)
) |>
add_ci(pattern = "{stat} ({ci})") |>
remove_footnote_header(everything())
Add CI Column
Description
Add a new column with the confidence intervals for proportions, means, etc.
Usage
## S3 method for class 'tbl_svysummary'
add_ci(
x,
method = list(all_continuous() ~ "svymean", all_categorical() ~ "svyprop.logit"),
include = everything(),
statistic = list(all_continuous() ~ "{conf.low}, {conf.high}", all_categorical() ~
"{conf.low}%, {conf.high}%"),
conf.level = 0.95,
style_fun = list(all_continuous() ~ label_style_sigfig(), all_categorical() ~
label_style_sigfig(scale = 100)),
pattern = NULL,
df = survey::degf(x$inputs$data),
...
)
Arguments
x |
( |
method |
( |
include |
( |
statistic |
( |
conf.level |
(scalar |
style_fun |
( |
pattern |
( |
df |
( |
... |
These dots are for future extensions and must be empty. |
Value
gtsummary table
method argument
Must be one of
- "svyprop.logit", "svyprop.likelihood", "svyprop.asin", "svyprop.beta", "svyprop.mean", "svyprop.xlogit" calculated via survey::svyciprop() for categorical variables
- "svymean" calculated via survey::svymean() for continuous variables
- "svymedian.mean", "svymedian.beta", "svymedian.xlogit", "svymedian.asin", "svymedian.score" calculated via survey::svyquantile(quantiles = 0.5) for continuous variables
Examples
data(api, package = "survey")
survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) |>
tbl_svysummary(
by = "both",
include = c(api00, stype),
statistic = all_continuous() ~ "{mean} ({sd})"
) |>
add_stat_label() |>
add_ci(pattern = "{stat} (95% CI {ci})") |>
modify_header(all_stat_cols() ~ "**{level}**") |>
modify_spanning_header(all_stat_cols() ~ "**Survived**")
Add differences
Description
Usage
add_difference(x, ...)
Arguments
x |
( |
... |
Passed to other methods. |
Author(s)
Daniel D. Sjoberg
Add differences between groups
Description
Adds difference to tables created by tbl_summary(). The difference between two groups (typically mean or rate difference) is added to the table along with the difference's confidence interval and a p-value (when applicable).
Usage
## S3 method for class 'tbl_summary'
add_difference(
x,
test = NULL,
group = NULL,
adj.vars = NULL,
test.args = NULL,
conf.level = 0.95,
include = everything(),
pvalue_fun = label_style_pvalue(digits = 1),
estimate_fun = list(c(all_continuous(), all_categorical(FALSE)) ~ label_style_sigfig(),
all_dichotomous() ~ label_style_sigfig(scale = 100, suffix = "%"), all_tests("smd")
~ label_style_sigfig()),
...
)
Arguments
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
group |
( |
adj.vars |
( |
test.args |
( |
conf.level |
( |
include |
( |
pvalue_fun |
( |
estimate_fun |
( |
... |
These dots are for future extensions and must be empty. |
Value
a gtsummary table of class "tbl_summary"
Examples
# Example 1 ----------------------------------
trial |>
select(trt, age, marker, response, death) |>
tbl_summary(
by = trt,
statistic =
list(
all_continuous() ~ "{mean} ({sd})",
all_dichotomous() ~ "{p}%"
),
missing = "no"
) |>
add_n() |>
add_difference()
# Example 2 ----------------------------------
# ANCOVA adjusted for grade and stage
trial |>
select(trt, age, marker, grade, stage) |>
tbl_summary(
by = trt,
statistic = list(all_continuous() ~ "{mean} ({sd})"),
missing = "no",
include = c(age, marker, trt)
) |>
add_n() |>
add_difference(adj.vars = c(grade, stage))
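To make the reported quantities concrete, the sketch below computes an unadjusted and a covariate-adjusted difference between two groups in base R. The data are simulated and the variable names (arm, age, marker) are hypothetical; a Welch t-test and a linear model stand in for whichever tests the table is configured to use, so this illustrates the kind of estimate, confidence interval, and p-value shown rather than the package's exact calculation.

```r
set.seed(123)
# Hypothetical two-arm data
dat <- data.frame(
  arm = rep(c("A", "B"), each = 40),
  age = rnorm(80, mean = 50, sd = 10)
)
dat$marker <- 1 + 0.02 * dat$age + 0.3 * (dat$arm == "B") + rnorm(80, sd = 0.5)

# Unadjusted mean difference (A minus B) from a Welch two-sample t-test
tt <- t.test(marker ~ arm, data = dat, conf.level = 0.95)
unadjusted <- c(
  estimate = unname(tt$estimate[1] - tt$estimate[2]),
  conf.low = tt$conf.int[1],
  conf.high = tt$conf.int[2],
  p.value = tt$p.value
)

# Age-adjusted difference (B minus A): the coefficient of arm in a linear model
fit <- lm(marker ~ arm + age, data = dat)
adjusted <- c(
  estimate = unname(coef(fit)["armB"]),
  conf.low = confint(fit)["armB", 1],
  conf.high = confint(fit)["armB", 2],
  p.value = summary(fit)$coefficients["armB", "Pr(>|t|)"]
)

round(rbind(unadjusted, adjusted), 3)
```

The direction of subtraction differs between the two rows here (A minus B versus B minus A); that is a property of the base R functions used in the sketch, so check the sign convention in any real table.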
Add differences between groups
Description
Adds difference to tables created by tbl_svysummary(). The difference between two groups (typically mean or rate difference) is added to the table along with the difference's confidence interval and a p-value (when applicable).
Usage
## S3 method for class 'tbl_svysummary'
add_difference(
x,
test = NULL,
group = NULL,
adj.vars = NULL,
test.args = NULL,
conf.level = 0.95,
include = everything(),
pvalue_fun = label_style_pvalue(digits = 1),
estimate_fun = list(c(all_continuous(), all_categorical(FALSE)) ~ label_style_sigfig(),
all_dichotomous() ~ label_style_sigfig(scale = 100, suffix = "%"), all_tests("smd")
~ label_style_sigfig()),
...
)
Arguments
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
group |
( |
adj.vars |
( |
test.args |
( |
conf.level |
( |
include |
( |
pvalue_fun |
( |
estimate_fun |
( |
... |
These dots are for future extensions and must be empty. |
Value
a gtsummary table of class "tbl_summary"
Examples
Add difference rows
Description
Usage
add_difference_row(x, ...)
Arguments
x |
( |
... |
Passed to other methods. |
Author(s)
Daniel D. Sjoberg
Add difference rows between groups
Description
Adds difference to tables created by tbl_summary() as additional rows. This function is often useful when there are more than two groups to compare.
Pairwise differences are calculated relative to the reference level specified for the by variable.
Usage
## S3 method for class 'tbl_summary'
add_difference_row(
x,
reference,
statistic = everything() ~ "{estimate}",
test = NULL,
group = NULL,
header = NULL,
adj.vars = NULL,
test.args = NULL,
conf.level = 0.95,
include = everything(),
pvalue_fun = label_style_pvalue(digits = 1),
estimate_fun = list(c(all_continuous(), all_categorical(FALSE)) ~ label_style_sigfig(),
all_dichotomous() ~ label_style_sigfig(scale = 100, suffix = "%"), all_tests("smd")
~ label_style_sigfig()),
...
)
Arguments
x |
( |
reference |
(scalar) |
statistic |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
group |
( |
header |
( |
adj.vars |
( |
test.args |
( |
conf.level |
( |
include |
( |
pvalue_fun |
( |
estimate_fun |
( |
... |
These dots are for future extensions and must be empty. |
Details
The default labels for the statistic rows will often not be what you need to display. In cases like this, use modify_table_body() to directly update the label rows. Use show_header_names() to print the underlying column names to identify the columns to target when changing the label, which in this case will always be the 'label' column. See Example 2.
Value
a gtsummary table of class "tbl_summary"
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(
by = grade,
include = c(age, response),
missing = "no",
statistic = all_continuous() ~ "{mean} ({sd})"
) |>
add_stat_label() |>
add_difference_row(
reference = "I",
statistic = everything() ~ c("{estimate}", "{conf.low}, {conf.high}", "{p.value}")
)
# Example 2 ----------------------------------
# Function to build age-adjusted logistic regression and put results in ARD format
ard_odds_ratio <- \(data, variable, by, ...) {
cardx::construct_model(
data = data,
formula = reformulate(response = variable, termlabels = c(by, "age")), # adjusting model for age
method = "glm",
method.args = list(family = binomial)
) |>
cardx::ard_regression_basic(exponentiate = TRUE) |>
dplyr::filter(.data$variable == .env$by)
}
trial |>
tbl_summary(by = trt, include = response, missing = "no") |>
add_stat_label() |>
add_difference_row(
reference = "Drug A",
statistic = everything() ~ c("{estimate}", "{conf.low}, {conf.high}", "{p.value}"),
test = everything() ~ ard_odds_ratio,
estimate_fun = everything() ~ label_style_ratio()
) |>
# change the default label for the 'Odds Ratio'
modify_table_body(
~ .x |>
dplyr::mutate(
label = ifelse(label == "Coefficient", "Odds Ratio", label)
)
) |>
# add footnote about logistic regression
modify_footnote_body(
footnote = "Age-adjusted logistic regression model",
column = "label",
rows = variable == "response-row_difference"
)
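The idea of pairwise differences against a reference level can be sketched in base R. The example below uses simulated data with a hypothetical three-level grade variable and runs one Welch t-test per non-reference level; the choice of test and the direction of subtraction are assumptions of the sketch, not a statement of how the package computes its rows.

```r
set.seed(42)
# Hypothetical three-group data
dat <- data.frame(
  grade = factor(rep(c("I", "II", "III"), each = 30)),
  age = c(rnorm(30, 46, 12), rnorm(30, 48, 12), rnorm(30, 50, 12))
)

reference <- "I"
others <- setdiff(levels(dat$grade), reference)

# One comparison per non-reference level, each against the reference level
pairwise <- do.call(rbind, lapply(others, function(lvl) {
  tt <- t.test(
    dat$age[dat$grade == lvl],
    dat$age[dat$grade == reference]
  )
  data.frame(
    comparison = paste(lvl, "vs.", reference),
    estimate = unname(tt$estimate[1] - tt$estimate[2]),
    conf.low = tt$conf.int[1],
    conf.high = tt$conf.int[2],
    p.value = tt$p.value
  )
}))
pairwise
```

With k groups this yields k - 1 comparisons, which is why the results are laid out as additional rows rather than as a single difference column.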
Add model statistics
Description
Add model statistics returned from broom::glance(). Statistics can either be appended to the table (add_glance_table()), or added as a table source note (add_glance_source_note()).
Usage
add_glance_table(
x,
include = everything(),
label = NULL,
fmt_fun = list(everything() ~ label_style_sigfig(digits = 3), any_of("p.value") ~
label_style_pvalue(digits = 1), c(where(is.integer), starts_with("df")) ~
label_style_number()),
glance_fun = glance_fun_s3(x$inputs$x)
)
add_glance_source_note(
x,
include = everything(),
label = NULL,
fmt_fun = list(everything() ~ label_style_sigfig(digits = 3), any_of("p.value") ~
label_style_pvalue(digits = 1), c(where(is.integer), starts_with("df")) ~
label_style_number()),
glance_fun = glance_fun_s3(x$inputs$x),
text_interpret = c("md", "html"),
sep1 = " = ",
sep2 = "; "
)
Arguments
x |
( |
include |
( |
label |
( |
fmt_fun |
( |
glance_fun |
( |
text_interpret |
( |
sep1 |
( |
sep2 |
( |
Value
gtsummary table
Tips
When combining add_glance_table() with tbl_merge(), the ordering of the model terms and the glance statistics may become jumbled. To re-order the rows with glance statistics on bottom, use the script below:
tbl_merge(list(tbl1, tbl2)) |> modify_table_body(~.x |> dplyr::arrange(row_type == "glance_statistic"))
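The re-ordering works because sorting on a logical value places FALSE before TRUE and keeps tied rows in their original order. The sketch below shows this on a small hypothetical data frame standing in for a merged table body; it uses base R order() in place of dplyr::arrange() so that it runs without extra packages.

```r
# Hypothetical stand-in for the table body of a merged table
tb <- data.frame(
  row_type = c("label", "glance_statistic", "label", "level", "glance_statistic"),
  label = c("Age", "R-squared", "Grade", "II", "AIC")
)

# FALSE sorts before TRUE; rows with equal keys keep their original order
tb_sorted <- tb[order(tb$row_type == "glance_statistic"), ]
tb_sorted$label
```

The model terms stay in their existing order and the glance statistics move to the bottom, also in their existing order.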
Examples
mod <- lm(age ~ marker + grade, trial) |> tbl_regression()
# Example 1 ----------------------------------
mod |>
add_glance_table(
label = list(sigma = "\U03C3"),
include = c(r.squared, AIC, sigma)
)
# Example 2 ----------------------------------
mod |>
add_glance_source_note(
label = list(sigma = "\U03C3"),
include = c(r.squared, AIC, sigma)
)
Add the global p-values
Description
This function uses car::Anova() (by default) to calculate global p-values for model covariates. Output from tbl_regression and tbl_uvregression objects is supported.
Usage
add_global_p(x, ...)
## S3 method for class 'tbl_regression'
add_global_p(
x,
include = everything(),
keep = FALSE,
anova_fun = global_pvalue_fun,
type = "III",
quiet,
...
)
## S3 method for class 'tbl_uvregression'
add_global_p(
x,
include = everything(),
keep = FALSE,
anova_fun = global_pvalue_fun,
type = "III",
quiet,
...
)
Arguments
x |
( |
... |
Additional arguments to be passed to |
include |
( |
keep |
(scalar |
anova_fun |
( A custom function must accept a model as its first argument.
Note that anything passed in |
type |
Type argument passed to |
quiet |
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
lm(marker ~ age + grade, trial) |>
tbl_regression() |>
add_global_p()
# Example 2 ----------------------------------
trial[c("response", "age", "trt", "grade")] |>
tbl_uvregression(
method = glm,
y = response,
method.args = list(family = binomial),
exponentiate = TRUE
) |>
add_global_p()
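A global p-value tests all levels of a covariate at once, in contrast to the per-level p-values of a regression summary. The sketch below illustrates the idea on the built-in mtcars data, swapping in base R drop1() and a nested-model anova() for car::Anova() so that it runs without additional packages. For this main-effects-only linear model both calls test the same hypothesis, but this is an illustration rather than the package's code path.

```r
# mtcars ships with R; cyl is treated as a three-level factor
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Per-level p-values: one row per non-reference level of cyl
summary(fit)$coefficients[, "Pr(>|t|)"]

# Global p-value: one F test per model term (2 degrees of freedom for cyl)
global <- drop1(fit, test = "F")
global_p_cyl <- global["factor(cyl)", "Pr(>F)"]

# The same test written as a comparison of nested models
reduced <- update(fit, . ~ . - factor(cyl))
nested_p <- anova(reduced, fit)[2, "Pr(>F)"]

c(drop1 = global_p_cyl, nested = nested_p)
```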
Add column with N
Description
Usage
add_n(x, ...)
Arguments
x |
( |
... |
Passed to other methods. |
Author(s)
Daniel D. Sjoberg
See Also
Add N
Description
For each survfit() object summarized with tbl_survfit(), this function will add the total number of observations in a new column.
Usage
## S3 method for class 'tbl_survfit'
add_n(x, ...)
Arguments
x |
object of class " |
... |
Not used |
Examples
library(survival)
fit1 <- survfit(Surv(ttdeath, death) ~ 1, trial)
fit2 <- survfit(Surv(ttdeath, death) ~ trt, trial)
# Example 1 ----------------------------------
list(fit1, fit2) |>
tbl_survfit(times = c(12, 24)) |>
add_n()
Add N to regression table
Description
Add N to regression table
Usage
## S3 method for class 'tbl_regression'
add_n(x, location = "label", ...)
## S3 method for class 'tbl_uvregression'
add_n(x, location = "label", ...)
Arguments
x |
( |
location |
( When |
... |
These dots are for future extensions and must be empty. |
Examples
# Example 1 ----------------------------------
trial |>
select(response, age, grade) |>
tbl_uvregression(
y = response,
exponentiate = TRUE,
method = glm,
method.args = list(family = binomial),
hide_n = TRUE
) |>
add_n(location = "label")
# Example 2 ----------------------------------
glm(response ~ age + grade, trial, family = binomial) |>
tbl_regression(exponentiate = TRUE) |>
add_n(location = "level")
Add column with N
Description
For each variable in a tbl_summary table, the add_n function adds a column with the total number of non-missing (or missing) observations.
Usage
## S3 method for class 'tbl_summary'
add_n(
x,
statistic = "{N_nonmiss}",
col_label = "**N**",
footnote = FALSE,
last = FALSE,
...
)
## S3 method for class 'tbl_svysummary'
add_n(
x,
statistic = "{N_nonmiss}",
col_label = "**N**",
footnote = FALSE,
last = FALSE,
...
)
## S3 method for class 'tbl_likert'
add_n(
x,
statistic = "{N_nonmiss}",
col_label = "**N**",
footnote = FALSE,
last = FALSE,
...
)
Arguments
x |
( |
statistic |
(
The argument uses |
col_label |
( |
footnote |
(scalar |
last |
(scalar |
... |
These dots are for future extensions and must be empty. |
Value
A table of class c('tbl_summary', 'gtsummary')
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(by = trt, include = c(trt, age, grade, response)) |>
add_n()
# Example 2 ----------------------------------
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) |>
tbl_svysummary(by = Survived, percent = "row", include = c(Class, Age)) |>
add_n()
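The default statistic, "{N_nonmiss}", is a per-variable count of non-missing observations. As a minimal sketch of what that count means, the base R code below computes it on a small hypothetical data frame; the object names are illustrative and not part of the package.

```r
# Hypothetical data with missing values
dat <- data.frame(
  age = c(34, NA, 51, 47, NA, 62),
  grade = c("I", "II", NA, "III", "I", "II")
)

n_total <- nrow(dat)

# Non-missing count per variable
n_nonmiss <- vapply(dat, function(col) sum(!is.na(col)), integer(1))

# Missing count and non-missing proportion follow directly
n_miss <- n_total - n_nonmiss
p_nonmiss <- n_nonmiss / n_total

rbind(n_nonmiss, n_miss, p_nonmiss)
```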
Add event N
Description
For each survfit() object summarized with tbl_survfit(), this function will add the total number of events observed in a new column.
Usage
## S3 method for class 'tbl_survfit'
add_nevent(x, ...)
Arguments
x |
object of class 'tbl_survfit' |
... |
Not used |
See Also
Other tbl_survfit tools:
add_p.tbl_survfit()
Examples
library(survival)
fit1 <- survfit(Surv(ttdeath, death) ~ 1, trial)
fit2 <- survfit(Surv(ttdeath, death) ~ trt, trial)
# Example 1 ----------------------------------
list(fit1, fit2) |>
tbl_survfit(times = c(12, 24)) |>
add_n() |>
add_nevent()
Add event N
Description
Add event N
Usage
add_nevent(x, ...)
## S3 method for class 'tbl_regression'
add_nevent(x, location = "label", ...)
## S3 method for class 'tbl_uvregression'
add_nevent(x, location = "label", ...)
Arguments
x |
( |
... |
These dots are for future extensions and must be empty. |
location |
( When |
Examples
# Example 1 ----------------------------------
trial |>
select(response, trt, grade) |>
tbl_uvregression(
y = response,
exponentiate = TRUE,
method = glm,
method.args = list(family = binomial),
) |>
add_nevent()
# Example 2 ----------------------------------
glm(response ~ age + grade, trial, family = binomial) |>
tbl_regression(exponentiate = TRUE) |>
add_nevent(location = "level")
Add overall column
Description
Adds a column with overall summary statistics to tables created by tbl_summary(), tbl_svysummary(), tbl_continuous() or tbl_custom_summary().
Usage
add_overall(x, ...)
## S3 method for class 'tbl_summary'
add_overall(
x,
last = FALSE,
col_label = "**Overall** \nN = {style_number(N)}",
statistic = NULL,
digits = NULL,
...
)
## S3 method for class 'tbl_continuous'
add_overall(
x,
last = FALSE,
col_label = "**Overall** \nN = {style_number(N)}",
statistic = NULL,
digits = NULL,
...
)
## S3 method for class 'tbl_svysummary'
add_overall(
x,
last = FALSE,
col_label = "**Overall** \nN = {style_number(N)}",
statistic = NULL,
digits = NULL,
...
)
## S3 method for class 'tbl_custom_summary'
add_overall(
x,
last = FALSE,
col_label = "**Overall** \nN = {style_number(N)}",
statistic = NULL,
digits = NULL,
...
)
## S3 method for class 'tbl_hierarchical'
add_overall(
x,
last = FALSE,
col_label = "**Overall** \nN = {style_number(N)}",
statistic = NULL,
digits = NULL,
...
)
## S3 method for class 'tbl_hierarchical_count'
add_overall(
x,
last = FALSE,
col_label = ifelse(rlang::is_empty(x$inputs$denominator), "**Overall**",
"**Overall** \nN = {style_number(N)}"),
statistic = NULL,
digits = NULL,
...
)
Arguments
x |
( |
... |
These dots are for future extensions and must be empty. |
last |
(scalar |
col_label |
( |
statistic |
( |
digits |
( |
Value
A gtsummary table of the same class as x
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(include = c(age, grade), by = trt) |>
add_overall()
# Example 2 ----------------------------------
trial |>
tbl_summary(
include = grade,
by = trt,
percent = "row",
statistic = ~"{p}%",
digits = ~1
) |>
add_overall(
last = TRUE,
statistic = ~"{p}% (n={n})",
digits = ~ c(1, 0)
)
# Example 3 ----------------------------------
trial |>
tbl_continuous(
variable = age,
by = trt,
include = grade
) |>
add_overall(last = TRUE)
ARD add overall column
Description
Adds a column with overall summary statistics to tables created by tbl_ard_summary().
Usage
## S3 method for class 'tbl_ard_summary'
add_overall(
x,
cards,
last = FALSE,
col_label = "**Overall**",
statistic = NULL,
...
)
Arguments
x |
( |
cards |
( |
last |
(scalar |
col_label |
( |
statistic |
( |
... |
These dots are for future extensions and must be empty. |
Value
A gtsummary table of the same class as x
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
# build primary table
tbl <-
cards::ard_stack(
trial,
.by = trt,
cards::ard_continuous(variables = age),
cards::ard_categorical(variables = grade),
.missing = TRUE,
.attributes = TRUE,
.total_n = TRUE
) |>
tbl_ard_summary(by = trt)
# create ARD with overall results
ard_overall <-
cards::ard_stack(
trial,
cards::ard_continuous(variables = age),
cards::ard_categorical(variables = grade),
.missing = TRUE,
.attributes = TRUE,
.total_n = TRUE
)
# add an overall column
tbl |>
add_overall(cards = ard_overall)
Add p-values
Description
Usage
add_p(x, ...)
Arguments
x |
( |
... |
Passed to other methods. |
Author(s)
Daniel D. Sjoberg
Add p-values
Description
Add p-values
Usage
## S3 method for class 'tbl_continuous'
add_p(
x,
test = NULL,
pvalue_fun = label_style_pvalue(digits = 1),
include = everything(),
test.args = NULL,
group = NULL,
...
)
Arguments
x |
( |
test |
List of formulas specifying statistical tests to perform for each
variable.
Default is two-way ANOVA when |
pvalue_fun |
( |
include |
( |
test.args |
( |
group |
( |
... |
These dots are for future extensions and must be empty. |
Value
'tbl_continuous' object
Examples
# Example 1 ----------------------------------
trial |>
tbl_continuous(variable = age, by = trt, include = grade) |>
add_p(pvalue_fun = label_style_pvalue(digits = 2))
# Example 2 ----------------------------------
trial |>
tbl_continuous(variable = age, include = grade) |>
add_p(test = everything() ~ "kruskal.test")
Add p-value
Description
Calculate and add a p-value comparing the two variables in the cross table. If missing levels are included in the tables, they are also included in p-value calculation.
Usage
## S3 method for class 'tbl_cross'
add_p(
x,
test = NULL,
pvalue_fun = ifelse(source_note, label_style_pvalue(digits = 1, prepend_p = TRUE),
label_style_pvalue(digits = 1)),
source_note = FALSE,
test.args = NULL,
...
)
Arguments
x |
( |
test |
( |
pvalue_fun |
( |
source_note |
(scalar |
test.args |
(named |
... |
These dots are for future extensions and must be empty. |
Author(s)
Karissa Whiting, Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial |>
tbl_cross(row = stage, col = trt) |>
add_p()
# Example 2 ----------------------------------
trial |>
tbl_cross(row = stage, col = trt) |>
add_p(source_note = TRUE)
Add p-values
Description
Adds p-values to tables created by tbl_summary() by comparing values across groups.
Usage
## S3 method for class 'tbl_summary'
add_p(
x,
test = NULL,
pvalue_fun = label_style_pvalue(digits = 1),
group = NULL,
include = everything(),
test.args = NULL,
adj.vars = NULL,
...
)
Arguments
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
pvalue_fun |
( |
group |
( |
include |
( |
test.args |
( |
adj.vars |
( |
... |
These dots are for future extensions and must be empty. |
Value
a gtsummary table of class "tbl_summary"
test argument
See the ?tests help file for details on available tests and creating custom tests. The ?tests help file also includes pseudo-code for each test to be clear precisely how the calculation is performed.
The default test used in add_p() primarily depends on these factors:
- whether the variable is categorical/dichotomous vs continuous
- number of levels in the tbl_summary(by) variable
- whether the add_p(group) argument is specified
- whether the add_p(adj.vars) argument is specified
Specified neither add_p(group) nor add_p(adj.vars)
- "wilcox.test" when by variable has two levels and variable is continuous.
- "kruskal.test" when by variable has more than two levels and variable is continuous.
- "chisq.test.no.correct" for categorical variables with all expected cell counts >=5, and "fisher.test" for categorical variables with any expected cell count <5.
Specified add_p(group) and not add_p(adj.vars)
- "lme4" when by variable has two levels for all summary types.
There is no default for grouped data when by variable has more than two levels. Users must create custom tests for this scenario.
Specified add_p(adj.vars) and not add_p(group)
- "ancova" when variable is continuous and by variable has two levels.
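The rules listed in this section can be restated as a small decision function. The helper below, default_test_sketch(), is hypothetical and written only to make the branching explicit; it is not a package function, and it returns NA for combinations that the list does not cover.

```r
# Hypothetical helper restating the documented defaults; not part of gtsummary
default_test_sketch <- function(type, n_by_levels, min_expected = NA,
                                group = FALSE, adj_vars = FALSE) {
  if (group && adj_vars) return(NA_character_) # combination not covered by the list
  if (group) {
    # grouped data: a default exists only for a two-level by variable
    return(if (n_by_levels == 2) "lme4" else NA_character_)
  }
  if (adj_vars) {
    if (type == "continuous" && n_by_levels == 2) return("ancova")
    return(NA_character_) # not covered by the list
  }
  if (type == "continuous") {
    return(if (n_by_levels == 2) "wilcox.test" else "kruskal.test")
  }
  # categorical: decided by the smallest expected cell count
  if (min_expected >= 5) "chisq.test.no.correct" else "fisher.test"
}

# Smallest expected cell count for a cross-tabulation from the built-in mtcars data
tab <- table(mtcars$cyl, mtcars$am)
min_exp <- min(suppressWarnings(chisq.test(tab, correct = FALSE))$expected)

default_test_sketch("categorical", n_by_levels = 2, min_expected = min_exp)
default_test_sketch("continuous", n_by_levels = 3)
default_test_sketch("continuous", n_by_levels = 2, adj_vars = TRUE)
```

In the mtcars cross-tabulation at least one expected count is below 5, so the categorical rule selects "fisher.test".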
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(by = trt, include = c(age, grade)) |>
add_p()
# Example 2 ----------------------------------
trial |>
select(trt, age, marker) |>
tbl_summary(by = trt, missing = "no") |>
add_p(
# perform t-test for all variables
test = everything() ~ "t.test",
# assume equal variance in the t-test
test.args = all_tests("t.test") ~ list(var.equal = TRUE)
)
Add p-value
Description
Calculate and add a p-value to stratified tbl_survfit() tables.
Usage
## S3 method for class 'tbl_survfit'
add_p(
x,
test = "logrank",
test.args = NULL,
pvalue_fun = label_style_pvalue(digits = 1),
include = everything(),
quiet,
...
)
Arguments
x |
( |
test |
( |
test.args |
(named |
pvalue_fun |
( |
include |
( |
quiet |
|
... |
These dots are for future extensions and must be empty. |
test argument
The most common way to specify test= is by using a single string indicating the test name. However, if you need to specify different tests within the same table, the input is flexible using the list notation common throughout the gtsummary package. For example, the following code would call the log-rank test, and a second test of the G-rho family.
... |> add_p(test = list(trt ~ "logrank", grade ~ "survdiff"), test.args = grade ~ list(rho = 0.5))
Note
To calculate the p-values, the formula is re-constructed from the call in the original survfit() object. When the survfit() object is created in a for loop, lapply(), or purrr::map() setting, the call may not reflect the true formula, which may result in an error or an incorrect calculation.
To ensure correct results, the call formula in survfit() must represent the formula that will be used in survival::survdiff(). If you utilize the tbl_survfit.data.frame() S3 method, this is handled for you.
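The issue described in this note can be seen by inspecting the call stored on a survfit() object. The sketch below uses the lung data from the survival package. The bquote() workaround at the end is one possible approach offered as an assumption, not something the package documentation prescribes.

```r
library(survival)

# Formula written directly in the call: the stored call contains the formula
fit_direct <- survfit(Surv(time, status) ~ sex, data = lung)
deparse(fit_direct$call$formula)

# Formula built inside lapply(): the stored call only records the symbol `f`
vars <- c("sex", "ph.ecog")
fits_loop <- lapply(vars, function(v) {
  f <- as.formula(paste("Surv(time, status) ~", v))
  survfit(f, data = lung)
})
deparse(fits_loop[[1]]$call$formula)

# Possible workaround: splice the evaluated formula into the call before running it
fits_fixed <- lapply(vars, function(v) {
  f <- as.formula(paste("Surv(time, status) ~", v))
  eval(bquote(survfit(.(f), data = lung)))
})
deparse(fits_fixed[[1]]$call$formula)
```

In the looped version the recorded formula is just a variable name, so anything that later rebuilds the model from the call cannot recover the intended formula.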
See Also
Other tbl_survfit tools:
add_nevent.tbl_survfit()
Examples
library(survival)
gts_survfit <-
list(
survfit(Surv(ttdeath, death) ~ grade, trial),
survfit(Surv(ttdeath, death) ~ trt, trial)
) |>
tbl_survfit(times = c(12, 24))
# Example 1 ----------------------------------
gts_survfit |>
add_p()
# Example 2 ----------------------------------
# Pass `rho=` argument to `survdiff()`
gts_survfit |>
add_p(test = "survdiff", test.args = list(rho = 0.5))
Add p-values
Description
Adds p-values to tables created by tbl_svysummary() by comparing values across groups.
Usage
## S3 method for class 'tbl_svysummary'
add_p(
x,
test = list(all_continuous() ~ "svy.wilcox.test", all_categorical() ~ "svy.chisq.test"),
pvalue_fun = label_style_pvalue(digits = 1),
include = everything(),
test.args = NULL,
...
)
Arguments
x |
( |
test |
( See below for details on default tests and ?tests for details on available tests and creating custom tests. |
pvalue_fun |
( |
include |
( |
test.args |
( |
... |
These dots are for future extensions and must be empty. |
Value
a gtsummary table of class "tbl_svysummary"
Examples
# Example 1 ----------------------------------
# A simple weighted dataset
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) |>
tbl_svysummary(by = Survived, include = c(Sex, Age)) |>
add_p()
# A dataset with a complex design
data(api, package = "survey")
d_clust <- survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
# Example 2 ----------------------------------
tbl_svysummary(d_clust, by = both, include = c(api00, api99)) |>
add_p()
# Example 3 ----------------------------------
# change tests to svy t-test and Wald test
tbl_svysummary(d_clust, by = both, include = c(api00, api99, stype)) |>
add_p(
test = list(
all_continuous() ~ "svy.t.test",
all_categorical() ~ "svy.wald.test"
)
)
Add multiple comparison adjustment
Description
Adjustments to p-values are performed with stats::p.adjust().
Usage
add_q(x, method = "fdr", pvalue_fun = NULL, quiet = NULL)
Arguments
x |
( |
method |
( |
pvalue_fun |
( |
quiet |
Author(s)
Daniel D. Sjoberg, Esther Drill
Examples
# Example 1 ----------------------------------
add_q_ex1 <-
trial |>
tbl_summary(by = trt, include = c(trt, age, grade, response)) |>
add_p() |>
add_q()
# Example 2 ----------------------------------
trial |>
tbl_uvregression(
y = response,
include = c("trt", "age", "grade"),
method = glm,
method.args = list(family = binomial),
exponentiate = TRUE
) |>
add_global_p() |>
add_q()
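Because the adjustment is delegated to stats::p.adjust(), its effect can be previewed on a plain vector of p-values. The values and names below are made up for illustration.

```r
# Hypothetical unadjusted p-values, one per table row
p <- c(age = 0.012, grade = 0.040, response = 0.300, stage = 0.021)

# False discovery rate adjustment (the default method shown in the usage, "fdr")
q_fdr <- p.adjust(p, method = "fdr")

# A more conservative alternative for comparison
q_bonf <- p.adjust(p, method = "bonferroni")

round(rbind(p, q_fdr, q_bonf), 3)
```

Adjusted values are never smaller than the raw p-values, and the Bonferroni values are never smaller than the FDR values.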
Add significance stars
Description
Add significance stars to estimates with small p-values
Usage
add_significance_stars(
x,
pattern = ifelse(inherits(x, c("tbl_regression", "tbl_uvregression")),
"{estimate}{stars}", "{p.value}{stars}"),
thresholds = c(0.001, 0.01, 0.05),
hide_ci = TRUE,
hide_p = inherits(x, c("tbl_regression", "tbl_uvregression")),
hide_se = FALSE
)
Arguments
x |
( |
pattern |
( |
thresholds |
( |
hide_ci |
(scalar |
hide_p |
(scalar |
hide_se |
(scalar |
Value
a 'gtsummary' table
Examples
tbl <-
lm(time ~ ph.ecog + sex, survival::lung) |>
tbl_regression(label = list(ph.ecog = "ECOG Score", sex = "Sex"))
# Example 1 ----------------------------------
tbl |>
add_significance_stars(hide_ci = FALSE, hide_p = FALSE)
# Example 2 ----------------------------------
tbl |>
add_significance_stars(
pattern = "{estimate} ({conf.low}, {conf.high}){stars}",
hide_ci = TRUE, hide_se = TRUE
) |>
modify_header(estimate = "**Beta (95% CI)**") |>
modify_abbreviation("CI = Confidence Interval")
# Example 3 ----------------------------------
# Use ' \n' to put a line break between beta and SE
tbl |>
add_significance_stars(
hide_se = TRUE,
pattern = "{estimate}{stars} \n({std.error})"
) |>
modify_header(estimate = "**Beta \n(SE)**") |>
modify_abbreviation("SE = Standard Error") |>
as_gt() |>
gt::fmt_markdown(columns = everything()) |>
gt::tab_style(
style = "vertical-align:top",
locations = gt::cells_body(columns = label)
)
# Example 4 ----------------------------------
lm(marker ~ stage + grade, data = trial) |>
tbl_regression() |>
add_global_p() |>
add_significance_stars(
hide_p = FALSE,
pattern = "{p.value}{stars}"
)
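To make the thresholds= argument concrete, here is a minimal base R sketch of how a p-value could be mapped to stars: one star for each threshold the p-value falls below. The helper p_to_stars() is hypothetical and is not part of gtsummary; the use of a strict "<" comparison is an assumption of this sketch rather than a statement about the package internals.

```r
# hypothetical helper, not a gtsummary function:
# count how many thresholds each p-value is strictly below
p_to_stars <- function(p, thresholds = c(0.001, 0.01, 0.05)) {
  vapply(
    p,
    function(p_i) strrep("*", sum(p_i < thresholds)),
    character(1)
  )
}

p_to_stars(c(0.0005, 0.02, 0.2))
# "***" "*" ""
```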
Add a custom statistic
Description
The function allows a user to add a new column (or columns) of statistics to an
existing tbl_summary, tbl_svysummary, or tbl_continuous object.
Usage
add_stat(x, fns, location = everything() ~ "label")
Arguments
x |
( |
fns |
( |
location |
( |
Value
A 'gtsummary' of the same class as the input
Details
The returns from custom functions passed in fns= are required to follow a
specified format. Each of these functions will execute on a single variable.
Each function must return a tibble or a vector. If a vector is returned, it will be converted to a tibble with one column and a number of rows equal to the length of the vector.
When location='label', the returned statistic from the custom function must be a tibble with one row. When location='level', the tibble must have the same number of rows as there are levels in the variable (excluding the row for unknown values).
Each function may take the following arguments:
foo(data, variable, by, tbl, ...)
- data= is the input data frame passed to tbl_summary()
- variable= is a string indicating the variable to perform the calculation on. This is the variable in the label column of the table.
- by= is a string indicating the by variable from tbl_summary(), if present
- tbl= the original tbl_summary()/tbl_svysummary() object is also available to utilize
The user-defined function does not need to utilize each of these inputs. It's
encouraged that the user-defined function accept ... as each of the arguments
will be passed to the function, even if not all inputs are utilized by
the user's function, e.g. foo(data, variable, by, ...)
- Use modify_header() to update the column headers
- Use modify_fmt_fun() to update the functions that format the statistics
- Use modify_footnote_header() to add an explanatory footnote
If you return a tibble with column names p.value or q.value, default
p-value formatting will be applied, and you may take advantage of subsequent
p-value formatting functions, such as bold_p() or add_q().
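The examples for this function all use the default location of "label". As a complement, the following sketch returns one row per level, which is the shape described above for location='level'. The helper level_n() and the column name n_level are hypothetical names chosen for this illustration, and the sketch assumes the grade variable in trial has no missing values so that table() yields one count per displayed level.

```r
library(gtsummary)

# hypothetical helper: one row per level of the variable,
# holding the number of observations observed at that level
level_n <- function(data, variable, ...) {
  counts <- table(data[[variable]])
  dplyr::tibble(n_level = as.integer(counts))
}

tbl_level_stat <-
  trial |>
  tbl_summary(include = grade, missing = "no") |>
  add_stat(fns = grade ~ level_n, location = grade ~ "level")

tbl_level_stat
```

The new column can then be given a header with modify_header(n_level = "**N per level**").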
Examples
# Example 1 ----------------------------------
# fn returns t-test pvalue
my_ttest <- function(data, variable, by, ...) {
t.test(data[[variable]] ~ as.factor(data[[by]]))$p.value
}
trial |>
tbl_summary(
by = trt,
include = c(trt, age, marker),
missing = "no"
) |>
add_stat(fns = everything() ~ my_ttest) |>
modify_header(add_stat_1 = "**p-value**", all_stat_cols() ~ "**{level}**")
# Example 2 ----------------------------------
# fn returns t-test test statistic and pvalue
my_ttest2 <- function(data, variable, by, ...) {
t.test(data[[variable]] ~ as.factor(data[[by]])) |>
broom::tidy() %>%
dplyr::mutate(
stat = glue::glue("t={style_sigfig(statistic)}, {style_pvalue(p.value, prepend_p = TRUE)}")
) %>%
dplyr::pull(stat)
}
trial |>
tbl_summary(
by = trt,
include = c(trt, age, marker),
missing = "no"
) |>
add_stat(fns = everything() ~ my_ttest2) |>
modify_header(add_stat_1 = "**Treatment Comparison**")
# Example 3 ----------------------------------
# return test statistic and p-value in separate columns
my_ttest3 <- function(data, variable, by, ...) {
t.test(data[[variable]] ~ as.factor(data[[by]])) %>%
broom::tidy() %>%
select(statistic, p.value)
}
trial |>
tbl_summary(
by = trt,
include = c(trt, age, marker),
missing = "no"
) |>
add_stat(fns = everything() ~ my_ttest3) |>
modify_header(statistic = "**t-statistic**", p.value = "**p-value**") |>
modify_fmt_fun(statistic = label_style_sigfig(), p.value = label_style_pvalue(digits = 2))
Add statistic labels
Description
Adds or modifies labels describing the summary statistics presented for
each variable in a tbl_summary() table.
Usage
add_stat_label(x, ...)
## S3 method for class 'tbl_summary'
add_stat_label(x, location = c("row", "column"), label = NULL, ...)
## S3 method for class 'tbl_svysummary'
add_stat_label(x, location = c("row", "column"), label = NULL, ...)
## S3 method for class 'tbl_ard_summary'
add_stat_label(x, location = c("row", "column"), label = NULL, ...)
Arguments
x |
( |
... |
These dots are for future extensions and must be empty. |
location |
( |
label |
( |
Value
A tbl_summary or tbl_svysummary object
Tips
When using add_stat_label(location='row') with subsequent tbl_merge(),
it's important to have some understanding of the underlying
structure of the gtsummary table.
add_stat_label(location='row') works by adding a new column called
"stat_label" to x$table_body. The "label" and "stat_label"
columns are merged when the gtsummary table is printed.
The tbl_merge() function merges on the "label" column (among others),
which is typically the first column you see in a gtsummary table.
Therefore, when you want to merge a table that has run add_stat_label(location='row'),
you need to match the "label" column values before the "stat_label" column
is merged with it.
For example, the following two tables merge properly
tbl1 <- trial %>% select(age, grade) |> tbl_summary() |> add_stat_label()
tbl2 <- lm(marker ~ age + grade, trial) |> tbl_regression()
tbl_merge(list(tbl1, tbl2))
The addition of the new "stat_label" column requires a default
label for categorical variables, which is "No. (%)". This
can be changed to either desired text or left blank using NA_character_.
The blank option is useful in the location="row" case to keep the
output for categorical variables identical to what was produced without
an add_stat_label() function call.
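The structure described in these tips can be checked directly. The short sketch below builds a table, runs add_stat_label() with its default row location, and inspects x$table_body for the "stat_label" column; the object name tbl_labelled is arbitrary.

```r
library(gtsummary)

tbl_labelled <-
  trial |>
  tbl_summary(include = c(age, grade)) |>
  add_stat_label()

# the statistic labels are stored in their own column of the table body;
# they are merged with "label" only when the table is printed
"stat_label" %in% names(tbl_labelled$table_body)
tbl_labelled$table_body[c("variable", "label", "stat_label")]
```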
Author(s)
Daniel D. Sjoberg
Examples
tbl <- trial |>
dplyr::select(trt, age, grade, response) |>
tbl_summary(by = trt)
# Example 1 ----------------------------------
# Add statistic presented to the variable label row
tbl |>
add_stat_label(
# update default statistic label for continuous variables
label = all_continuous() ~ "med. (iqr)"
)
# Example 2 ----------------------------------
tbl |>
add_stat_label(
# add a new column with statistic labels
location = "column"
)
# Example 3 ----------------------------------
trial |>
select(age, grade, trt) |>
tbl_summary(
by = trt,
type = all_continuous() ~ "continuous2",
statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min} - {max}"),
) |>
add_stat_label(label = age ~ c("IQR", "Range"))
Variable Group Header
Description
Some data are inherently grouped, and should be reported together. Grouped variables are all indented together. This function indents the variables that should be reported together while adding a header above the group.
Usage
add_variable_group_header(x, header, variables, indent = 4L)
Arguments
x |
( |
header |
( |
variables |
( |
indent |
( |
Details
This function works by inserting a row into the x$table_body and
indenting the group of selected variables.
This function cannot be used in conjunction with all functions in gtsummary;
for example, bold_labels() will bold the incorrect rows after running
this function.
Value
a gtsummary table
Examples
# Example 1 ----------------------------------
set.seed(11234)
data.frame(
exclusion_age = sample(c(TRUE, FALSE), 20, replace = TRUE),
exclusion_mets = sample(c(TRUE, FALSE), 20, replace = TRUE),
exclusion_physician = sample(c(TRUE, FALSE), 20, replace = TRUE)
) |>
tbl_summary(
label = list(exclusion_age = "Age",
exclusion_mets = "Metastatic Disease",
exclusion_physician = "Physician")
) |>
add_variable_group_header(
header = "Exclusion Reason",
variables = starts_with("exclusion_")
) |>
modify_caption("**Study Exclusion Criteria**")
# Example 2 ----------------------------------
lm(marker ~ trt + grade + age, data = trial) |>
tbl_regression() |>
add_global_p(keep = TRUE, include = grade) |>
add_variable_group_header(
header = "Treatment:",
variables = trt
) |>
add_variable_group_header(
header = "Covariate:",
variables = -trt
) |>
# indent levels 8 spaces
modify_indent(
columns = "label",
rows = row_type == "level",
indent = 8L
)
Add Variance Inflation Factor
Description
Add the variance inflation factor (VIF) or
generalized VIF (GVIF) to the regression table.
Function uses car::vif() to calculate the VIF.
Usage
add_vif(x, statistic = NULL, estimate_fun = label_style_sigfig(digits = 2))
Arguments
x |
|
statistic |
|
estimate_fun |
Default is |
See Also
Review list, formula, and selector syntax used throughout gtsummary
Examples
# Example 1 ----------------------------------
lm(age ~ grade + marker, trial) |>
tbl_regression() |>
add_vif()
# Example 2 ----------------------------------
lm(age ~ grade + marker, trial) |>
tbl_regression() |>
add_vif(c("aGVIF", "df"))
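Because the statistics come from car::vif(), it can be helpful to look at the raw output for the same model. The sketch below calls car::vif() directly; for a model that includes a multi-level factor such as grade, car returns generalized VIFs in a matrix rather than a plain vector. The object names are arbitrary, and the car package is assumed to be installed.

```r
library(gtsummary) # provides the trial data set

mod <- lm(age ~ grade + marker, trial)

# raw output of the function that add_vif() relies on
vif_raw <- car::vif(mod)
vif_raw
```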
Convert gtsummary object to a flextable object
Description
Function converts a gtsummary object to a flextable object. A user can use this function if they wish to add customized formatting available via the flextable functions. The flextable output is particularly useful when combined with R markdown with Word output, since the gt package does not support Word.
Usage
as_flex_table(x, include = everything(), return_calls = FALSE, ...)
Arguments
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
... |
Not used |
Details
The as_flex_table() function supports bold and italic markdown syntax in column headers
and spanning headers ('**' and '_' only).
Text wrapped in double stars ('**bold**') will be made bold, and text between single
underscores ('_italic_') will be made italic.
No other markdown syntax is supported and the double-star and underscore cannot be combined.
To further style your table, you may convert the table to flextable with
as_flex_table(), then utilize any of the flextable functions.
Value
A 'flextable' object
Author(s)
Daniel D. Sjoberg
Examples
trial |>
select(trt, age, grade) |>
tbl_summary(by = trt) |>
add_p() |>
as_flex_table()
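As a sketch of the workflow described in the Details, the code below sets a bold markdown header, converts the table, and then applies a flextable function. flextable::autofit() is used here only as one example of a flextable function; any other could be substituted. The flextable package is assumed to be installed.

```r
library(gtsummary)

ft <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  # '**' markdown in the header is converted to bold text
  modify_header(label = "**Characteristic**") |>
  as_flex_table() |>
  # from here on, the object is a flextable
  flextable::autofit()

ft
```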
Convert gtsummary object to gt
Description
Function converts a gtsummary object to a "gt_tbl" object,
that is, a table created with gt::gt().
Function is used in the background when the results are printed or knit.
A user can use this function if they wish to add customized formatting
available via the gt package.
Usage
as_gt(x, include = everything(), return_calls = FALSE, ...)
Arguments
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
... |
Arguments passed on to |
Value
A gt_tbl object
Note
As of 2024-08-15, line breaks (e.g. '\n') do not render properly for PDF output.
For now, these line breaks are stripped when rendering to PDF with Quarto and R markdown.
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(by = trt, include = c(age, grade, response)) |>
as_gt()
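The example above stops at the conversion. A minimal sketch of the customization step mentioned in the Description follows, using gt::tab_source_note() as one example of a gt function; the note text is invented for illustration.

```r
library(gtsummary)

gt_tbl <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade, response)) |>
  as_gt() |>
  # after as_gt(), functions from the gt package apply
  gt::tab_source_note(source_note = "Illustrative source note")

gt_tbl
```

gtsummary functions such as add_p() or modify_header() should be applied before the call to as_gt(), since the result is no longer a gtsummary object.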
Create gtsummary table
Description
This function ingests a data frame and adds the infrastructure around it to make it a gtsummary object.
Usage
as_gtsummary(table_body, ...)
Arguments
table_body |
( |
... |
other objects that will be added to the gtsummary object list |
Details
Function uses table_body to create a gtsummary object
Value
gtsummary object
Examples
mtcars[1:2, 1:2] |>
as_gtsummary()
Convert gtsummary object to a huxtable object
Description
Function converts a gtsummary object to a huxtable object. A user can use this function if they wish to add customized formatting available via the huxtable functions. The huxtable package supports output to PDF via LaTeX, as well as HTML and Word.
Usage
as_hux_table(x, include = everything(), return_calls = FALSE)
as_hux_xlsx(x, file, include = everything(), bold_header_rows = TRUE)
Arguments
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
file |
File path for the output. |
bold_header_rows |
(scalar |
Value
A {huxtable} object
Excel Output
Use the as_hux_xlsx() function to save a copy of the table in an Excel file.
The file is saved using huxtable::quick_xlsx().
Author(s)
David Hugh-Jones, Daniel D. Sjoberg
Examples
trial |>
tbl_summary(by = trt, include = c(age, grade)) |>
add_p() |>
as_hux_table()
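The Excel output described above has no example of its own, so here is a minimal sketch. It writes to a temporary file so nothing is left in the working directory; the object name xlsx_path is arbitrary, and the huxtable and openxlsx packages are assumed to be installed.

```r
library(gtsummary)

# temporary output path used only for this illustration
xlsx_path <- tempfile(fileext = ".xlsx")

trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  as_hux_xlsx(file = xlsx_path)

file.exists(xlsx_path)
```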
Convert gtsummary object to a kable object
Description
Output from knitr::kable() is less full featured compared to
summary tables produced with gt.
For example, kable summary tables do not include indentation, footnotes,
or spanning header rows.
Line breaks (\n) are removed from column headers and table cells.
Usage
as_kable(x, ..., include = everything(), return_calls = FALSE)
Arguments
x |
( |
... |
Additional arguments passed to |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
return_calls |
Logical. Default is |
Details
Tip: To better distinguish variable labels and level labels when
indenting is not supported, try bold_labels() or italicize_levels().
Value
A knitr_kable object
Author(s)
Daniel D. Sjoberg
Examples
trial |>
tbl_summary(by = trt) |>
bold_labels() |>
as_kable()
Convert gtsummary object to a kableExtra object
Description
Function converts a gtsummary object to a knitr_kable + kableExtra object.
This allows the customized formatting available via knitr::kable()
and {kableExtra}; as_kable_extra() supports arguments in knitr::kable().
as_kable_extra() output via gtsummary supports
bold and italic cells for table bodies. Users
are encouraged to leverage as_kable_extra() for enhanced pdf printing; for html
output options there is better support via as_gt().
Usage
as_kable_extra(
x,
escape = FALSE,
format = NULL,
...,
include = everything(),
addtl_fmt = TRUE,
return_calls = FALSE
)
Arguments
x |
( |
format , escape , ... |
arguments passed to |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
addtl_fmt |
logical indicating whether to include additional formatting.
Default is |
return_calls |
Logical. Default is |
Value
A {kableExtra} table
PDF/LaTeX
This section shows options intended for use with output: pdf_document in yaml of .Rmd.
When the default values of as_kable_extra(escape = FALSE, addtl_fmt = TRUE)
are utilized, the following formatting occurs.
- Markdown bold, italic, and underline syntax in the headers, spanning headers, caption, and footnote will be converted to escaped LaTeX code
- Special characters in the table body, headers, spanning headers, caption, and footnote will be escaped with .escape_latex() or .escape_latex2()
- The "\n" symbol will be recognized as a line break in the table headers, spanning headers, caption, and the table body
- The "\n" symbol is removed from the footnotes
To suppress these additional formats, set as_kable_extra(addtl_fmt = FALSE)
Additional styling is available with kableExtra::kable_styling() as shown in Example 2, which implements row
striping and repeated column headers in the presence of page breaks.
HTML
This section discusses options intended for use with output: html_document in yaml of .Rmd.
When the default values of as_kable_extra(escape = FALSE, addtl_fmt = TRUE)
are utilized, the following formatting occurs.
- The default markdown syntax in the headers and spanning headers is removed
- Special characters in the table body, headers, spanning headers, caption, and footnote will be escaped with .escape_html()
- The "\n" symbol is removed from the footnotes
To suppress the additional formatting, set as_kable_extra(addtl_fmt = FALSE)
Author(s)
Daniel D. Sjoberg
Examples
# basic gtsummary tbl to build upon
as_kable_extra_base <-
trial |>
tbl_summary(by = trt, include = c(age, stage)) |>
bold_labels()
# Example 1 (PDF via LaTeX) ---------------------
# add linebreak in table header with '\n'
as_kable_extra_ex1_pdf <-
as_kable_extra_base |>
modify_header(all_stat_cols() ~ "**{level}** \n*N = {n}*") |>
as_kable_extra()
# Example 2 (PDF via LaTeX) ---------------------
# additional styling in `knitr::kable()` and with
# call to `kableExtra::kable_styling()`
as_kable_extra_ex2_pdf <-
as_kable_extra_base |>
as_kable_extra(
booktabs = TRUE,
longtable = TRUE,
linesep = ""
) |>
kableExtra::kable_styling(
position = "left",
latex_options = c("striped", "repeat_header"),
stripe_color = "gray!15"
)
Convert gtsummary object to a tibble
Description
Function converts a gtsummary object to a tibble.
Usage
## S3 method for class 'gtsummary'
as_tibble(
x,
include = everything(),
col_labels = TRUE,
return_calls = FALSE,
fmt_missing = FALSE,
...
)
## S3 method for class 'gtsummary'
as.data.frame(...)
Arguments
x |
( |
include |
Commands to include in output. Input may be a vector of
quoted or unquoted names. tidyselect and gtsummary select helper
functions are also accepted.
Default is |
col_labels |
(scalar |
return_calls |
Logical. Default is |
fmt_missing |
(scalar |
... |
Arguments passed on to |
Value
a tibble
Author(s)
Daniel D. Sjoberg
Examples
tbl <-
trial |>
tbl_summary(by = trt, include = c(age, grade, response))
as_tibble(tbl)
# without column labels
as_tibble(tbl, col_labels = FALSE)
Assign Default Digits
Description
Used to assign the default formatting for variables summarized with
tbl_summary().
Usage
assign_summary_digits(data, statistic, type, digits = NULL)
Arguments
data |
( |
statistic |
( |
type |
( |
digits |
( |
Value
a named list
Examples
assign_summary_digits(
mtcars,
statistic = list(mpg = "{mean}"),
type = list(mpg = "continuous")
)
Assign Default Summary Type
Description
Function inspects data and assigns a summary type when not specified
in the type argument.
Usage
assign_summary_type(data, variables, value, type = NULL, cat_threshold = 10L)
Arguments
data |
( |
variables |
( |
value |
( |
type |
( |
cat_threshold |
( |
Value
named list
Examples
assign_summary_type(
data = trial,
variables = c("age", "grade", "response"),
value = NULL
)
Assign Test
Description
This function is used to assign default tests for add_p()
and add_difference().
Usage
assign_tests(x, ...)
## S3 method for class 'tbl_summary'
assign_tests(
x,
include,
by = x$inputs$by,
test = NULL,
group = NULL,
adj.vars = NULL,
summary_type = x$inputs$type,
calling_fun = c("add_p", "add_difference"),
...
)
## S3 method for class 'tbl_svysummary'
assign_tests(
x,
include,
by = x$inputs$by,
test = NULL,
group = NULL,
adj.vars = NULL,
summary_type = x$inputs$type,
calling_fun = c("add_p", "add_difference"),
...
)
## S3 method for class 'tbl_continuous'
assign_tests(x, include, by, cont_variable, test = NULL, group = NULL, ...)
## S3 method for class 'tbl_survfit'
assign_tests(x, include, test = NULL, ...)
Arguments
x |
( |
... |
Passed to |
include |
( |
by |
( |
test |
(named |
group |
( |
adj.vars |
( |
summary_type |
(named |
calling_fun |
( |
cont_variable |
( |
Value
A table of class 'gtsummary'
Examples
trial |>
tbl_summary(
by = trt,
include = c(age, stage)
) |>
assign_tests(include = c("age", "stage"), calling_fun = "add_p")
Bold or Italicize
Description
Bold or italicize labels or levels in gtsummary tables
Usage
bold_labels(x)
italicize_labels(x)
bold_levels(x)
italicize_levels(x)
## S3 method for class 'gtsummary'
bold_labels(x)
## S3 method for class 'gtsummary'
bold_levels(x)
## S3 method for class 'gtsummary'
italicize_labels(x)
## S3 method for class 'gtsummary'
italicize_levels(x)
## S3 method for class 'tbl_cross'
bold_labels(x)
## S3 method for class 'tbl_cross'
bold_levels(x)
## S3 method for class 'tbl_cross'
italicize_labels(x)
## S3 method for class 'tbl_cross'
italicize_levels(x)
Arguments
x |
( |
Value
Functions return the same class of gtsummary object supplied
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
tbl_summary(trial, include = c("trt", "age", "response")) |>
bold_labels() |>
bold_levels() |>
italicize_labels() |>
italicize_levels()
Bold significant p-values
Description
Bold values below a chosen threshold (e.g. <0.05) in a gtsummary table.
Usage
bold_p(x, t = 0.05, q = FALSE)
Arguments
x |
( |
t |
(scalar |
q |
(scalar |
Author(s)
Daniel D. Sjoberg, Esther Drill
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(by = trt, include = c(response, marker, trt), missing = "no") |>
add_p() |>
bold_p(t = 0.1)
# Example 2 ----------------------------------
glm(response ~ trt + grade, trial, family = binomial(link = "logit")) |>
tbl_regression(exponentiate = TRUE) |>
bold_p(t = 0.65)
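Both examples above bold p-values. The q= argument in the Usage switches the target to q-values, which requires that add_q() has been run first. The following sketch shows that sequence; the threshold of 0.10 is an arbitrary choice for illustration.

```r
library(gtsummary)

tbl_bold_q <-
  trial |>
  tbl_summary(by = trt, include = c(response, marker, age), missing = "no") |>
  add_p() |>
  # q-values must exist before they can be bolded
  add_q() |>
  bold_p(t = 0.10, q = TRUE)

tbl_bold_q
```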
Continuous Summary Table Bridges
Description
Bridge function for converting tbl_continuous() cards to basic gtsummary objects.
This bridge function converts the 'cards' object to a format suitable to
pass to brdg_summary(): no pier_*() functions required.
Usage
brdg_continuous(cards, by = NULL, statistic, include, variable, type)
Arguments
cards |
( |
by |
( |
statistic |
(named |
include |
( |
variable |
( |
type |
(named |
Value
a gtsummary object
Examples
library(cards)
bind_ard(
# the primary ARD with the results
ard_continuous(trial, by = grade, variables = age),
# add missing and attributes ARD
ard_missing(trial, by = grade, variables = age),
ard_attributes(trial, variables = c(grade, age))
) |>
# adding the column name
dplyr::mutate(
gts_column =
ifelse(!context %in% "attributes", "stat_0", NA_character_)
) |>
brdg_continuous(
variable = "age",
include = "grade",
statistic = list(grade = "{median} ({p25}, {p75})"),
type = list(grade = "categorical")
) |>
as_tibble()
Hierarchy table bridge
Description
Bridge function for converting tbl_hierarchical() (and similar) cards to basic gtsummary objects.
All bridge functions begin with prefix brdg_*().
This file also contains helper functions for constructing the bridge,
referred to as the piers (supports for a bridge), which begin with pier_*().
- brdg_hierarchical(): The bridge function ingests an ARD data frame and returns a gtsummary table that includes .$table_body and a basic .$table_styling. The .$table_styling$header data frame includes the header statistics. Based on context, this function adds a column to the ARD data frame named "gts_column". This column is used during the reshaping in the pier_*() functions defining column names.
- pier_*(): these functions accept a cards tibble and return a tibble that is a piece of the .$table_body. Typically these will be stacked to construct the final table body data frame. The ARD object passed here will have two primary parts: the calculated summary statistics and the attributes ARD. The attributes ARD is used for labeling. The ARD data frame passed to this function must include a "gts_column" column, which is added in brdg_hierarchical().
Usage
brdg_hierarchical(
cards,
variables,
by,
include,
statistic,
overall_row,
count,
is_ordered,
label
)
pier_summary_hierarchical(cards, variables, include, statistic)
Arguments
cards |
( |
variables |
( |
by |
( |
include |
( |
statistic |
(named |
overall_row |
(scalar |
count |
(scalar |
is_ordered |
(scalar |
label |
(named |
Value
a gtsummary object
See Also
Review list, formula, and selector syntax used throughout gtsummary
Summary table bridge
Description
Bridge function for converting tbl_summary() (and similar) cards to basic gtsummary objects.
All bridge functions begin with prefix brdg_*().
This file also contains helper functions for constructing the bridge,
referred to as the piers (supports for a bridge), which begin with pier_*().
- brdg_summary(): The bridge function ingests an ARD data frame and returns a gtsummary table that includes .$table_body and a basic .$table_styling. The .$table_styling$header data frame includes the header statistics. Based on context, this function adds a column to the ARD data frame named "gts_column". This column is used during the reshaping in the pier_*() functions defining column names.
- pier_*(): these functions accept a cards tibble and return a tibble that is a piece of the .$table_body. Typically these will be stacked to construct the final table body data frame. The ARD object passed here will have two primary parts: the calculated summary statistics and the attributes ARD. The attributes ARD is used for labeling. The ARD data frame passed to this function must include a "gts_column" column, which is added in brdg_summary().
Usage
brdg_summary(
cards,
variables,
type,
statistic,
by = NULL,
missing = "no",
missing_stat = "{N_miss}",
missing_text = "Unknown"
)
pier_summary_dichotomous(cards, variables, statistic)
pier_summary_categorical(cards, variables, statistic)
pier_summary_continuous2(cards, variables, statistic)
pier_summary_continuous(cards, variables, statistic)
pier_summary_missing_row(
cards,
variables,
missing = "no",
missing_stat = "{N_miss}",
missing_text = "Unknown"
)
Arguments
cards |
( |
variables |
( |
type |
(named |
statistic |
(named |
by |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
Value
a gtsummary object
Examples
library(cards)
# first build ARD data frame
cards <-
ard_stack(
mtcars,
ard_continuous(variables = c("mpg", "hp")),
ard_categorical(variables = "cyl"),
ard_dichotomous(variables = "am"),
.missing = TRUE,
.attributes = TRUE
) |>
# this column is used by the `pier_*()` functions
dplyr::mutate(gts_column = ifelse(context == "attributes", NA, "stat_0"))
brdg_summary(
cards = cards,
variables = c("cyl", "am", "mpg", "hp"),
type =
list(
cyl = "categorical",
am = "dichotomous",
mpg = "continuous",
hp = "continuous2"
),
statistic =
list(
cyl = "{n} / {N}",
am = "{n} / {N}",
mpg = "{mean} ({sd})",
hp = c("{median} ({p25}, {p75})", "{mean} ({sd})")
)
) |>
as_tibble()
pier_summary_dichotomous(
cards = cards,
variables = "am",
statistic = list(am = "{n} ({p})")
)
pier_summary_categorical(
cards = cards,
variables = "cyl",
statistic = list(cyl = "{n} ({p})")
)
pier_summary_continuous2(
cards = cards,
variables = "hp",
statistic = list(hp = c("{median}", "{mean}"))
)
pier_summary_continuous(
cards = cards,
variables = "mpg",
statistic = list(mpg = "{median}")
)
Wide summary table bridge
Description
Bridge function for converting tbl_wide_summary() (and similar) cards to basic gtsummary objects.
All bridge functions begin with prefix brdg_*().
Usage
brdg_wide_summary(cards, variables, statistic, type)
Arguments
cards |
( |
variables |
( |
statistic |
(named |
type |
(named |
Value
a gtsummary object
Examples
library(cards)
bind_ard(
ard_continuous(trial, variables = c(age, marker)),
ard_attributes(trial, variables = c(age, marker))
) |>
brdg_wide_summary(
variables = c("age", "marker"),
statistic = list(age = c("{mean}", "{sd}"), marker = c("{mean}", "{sd}")),
type = list(age = "continuous", marker = "continuous")
)
Combine terms
Description
The function combines terms from a regression model, and replaces the terms
with a single row in the output table. The p-value is calculated using
stats::anova().
Usage
combine_terms(x, formula_update, label = NULL, quiet, ...)
Arguments
x |
( |
formula_update |
( |
label |
( |
quiet |
|
... |
Additional arguments passed to stats::anova |
Value
tbl_regression object
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
# Logistic Regression Example, LRT p-value
glm(response ~ marker + I(marker^2) + grade,
trial[c("response", "marker", "grade")] |> na.omit(), # keep complete cases only!
family = binomial) |>
tbl_regression(label = grade ~ "Grade", exponentiate = TRUE) |>
# collapse non-linear terms to a single row in output using anova
combine_terms(
formula_update = . ~ . - marker - I(marker^2),
label = "Marker (non-linear terms)",
test = "LRT"
)
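To show what the stats::anova() comparison looks like outside the table, the sketch below fits the same model as Example 1, removes the marker terms with the same formula update, and compares the two fits with a likelihood ratio test. This is a reconstruction under the assumption that the reduced model is obtained by applying formula_update to the full model; it is not a copy of the package internals, and the object names are arbitrary.

```r
library(gtsummary) # provides the trial data set

# complete cases only, so both models are fit to identical rows
d <- na.omit(trial[c("response", "marker", "grade")])

full <- glm(response ~ marker + I(marker^2) + grade, data = d, family = binomial)

# same update formula as passed to combine_terms(formula_update = )
reduced <- update(full, . ~ . - marker - I(marker^2))

# likelihood ratio test of the two marker terms jointly
lrt <- anova(reduced, full, test = "LRT")
lrt
```

The comment in Example 1 about keeping complete cases matters here: anova() requires both models to be fit to the same observations.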
Summarize a continuous variable
Description
This helper, to be used with tbl_custom_summary(), creates a function
summarizing a continuous variable.
Usage
continuous_summary(variable)
Arguments
variable |
( |
Details
When using continuous_summary(), you can specify in the statistic= argument
of tbl_custom_summary() the same continuous statistics as in
tbl_summary(). See the statistic argument section of the help file of
tbl_summary().
Author(s)
Joseph Larmarange
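A brief sketch of how this helper is typically combined with tbl_custom_summary() follows. It summarizes age within each level of stage and grade, split by treatment. The argument names stat_fns=, statistic=, include=, and by= belong to tbl_custom_summary(), which is documented elsewhere in this manual; the statistic string is an arbitrary choice for illustration.

```r
library(gtsummary)

tbl_age_by_level <-
  trial |>
  tbl_custom_summary(
    include = c("stage", "grade"),
    by = "trt",
    # summarize the continuous variable age for every row of the table
    stat_fns = ~ continuous_summary("age"),
    statistic = ~ "{median} [{p25}-{p75}]"
  )

tbl_age_by_level
```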
Custom tidiers
Description
Collection of tidiers that can be utilized in gtsummary. See details below.
Usage
tidy_standardize(
x,
exponentiate = FALSE,
conf.level = 0.95,
conf.int = TRUE,
...,
quiet = FALSE
)
tidy_bootstrap(
x,
exponentiate = FALSE,
conf.level = 0.95,
conf.int = TRUE,
...,
quiet = FALSE
)
tidy_robust(
x,
exponentiate = FALSE,
conf.level = 0.95,
conf.int = TRUE,
vcov = NULL,
vcov_args = NULL,
...,
quiet = FALSE
)
pool_and_tidy_mice(x, pool.args = NULL, ..., quiet = FALSE)
tidy_gam(x, conf.int = FALSE, exponentiate = FALSE, conf.level = 0.95, ...)
tidy_wald_test(x, tidy_fun = NULL, vcov = stats::vcov(x), ...)
Arguments
x |
( |
exponentiate |
(scalar |
conf.level |
(scalar |
conf.int |
(scalar |
... |
Arguments passed to method;
|
quiet |
|
vcov , vcov_args |
|
pool.args |
(named |
tidy_fun |
( |
Regression Model Tidiers
These tidiers are passed to tbl_regression() and tbl_uvregression() to
obtain modified results.
- tidy_standardize() tidier to report standardized coefficients. The parameters package includes a wonderful function to estimate standardized coefficients. The tidier uses the output from parameters::standardize_parameters(), and merely takes the result and puts it in broom::tidy() format.
- tidy_bootstrap() tidier to report bootstrapped coefficients. The parameters package includes a wonderful function to estimate bootstrapped coefficients. The tidier uses the output from parameters::bootstrap_parameters(test = "p"), and merely takes the result and puts it in broom::tidy() format.
- tidy_robust() tidier to report robust standard errors, confidence intervals, and p-values. The parameters package includes a wonderful function to calculate robust standard errors, confidence intervals, and p-values. The tidier uses the output from parameters::model_parameters(), and merely takes the result and puts it in broom::tidy() format. To use this function with tbl_regression(), pass a function with the arguments for tidy_robust() populated.
- pool_and_tidy_mice() tidier to report models resulting from multiply imputed data using the mice package. Pass the mice model object before the model results have been pooled. See example.
Other Tidiers
- tidy_wald_test() tidier to report Wald p-values, wrapping the aod::wald.test() function. Use this tidier with add_global_p(anova_fun = tidy_wald_test)
Examples
# Example 1 ----------------------------------
mod <- lm(age ~ marker + grade, trial)
tbl_stnd <- tbl_regression(mod, tidy_fun = tidy_standardize)
tbl <- tbl_regression(mod)
tidy_standardize_ex1 <-
tbl_merge(
list(tbl_stnd, tbl),
tab_spanner = c("**Standardized Model**", "**Original Model**")
)
# Example 2 ----------------------------------
# use "posthoc" method for coef calculation
tbl_regression(mod, tidy_fun = \(x, ...) tidy_standardize(x, method = "posthoc", ...))
# Example 3 ----------------------------------
# Multiple Imputation using the mice package
set.seed(1123)
pool_and_tidy_mice_ex3 <-
suppressWarnings(mice::mice(trial, m = 2)) |>
with(lm(age ~ marker + grade)) |>
tbl_regression()
Default Statistics Labels
Description
Default Statistics Labels
Usage
default_stat_labels()
Value
named list
Deprecated functions
Description
Some functions have been deprecated and are no longer being actively
supported.
Usage
modify_column_indent(...)
tbl_split(x, ...)
## S3 method for class 'gtsummary'
tbl_split(...)
Column "ci" Deprecated
Description
Overview
When the gtsummary package was first written, the gt package was not on CRAN
and the version of the package that was available did not have the ability
to merge columns.
Due to these limitations, the pre-formatted "ci"
column was added to show the combined
"conf.low"
and "conf.high"
columns.
Column merging in both gt and gtsummary packages has matured over the years,
and we are now adopting a more modern approach by using these features.
As a result, the pre-formatted "ci"
column will eventually be dropped from .$table_body
.
By using column merging, the conf.low and conf.high columns remain numeric
and we can continue to update how these columns are formatted, even after printing the table.
The "ci"
column is hidden, meaning that it appears in .$table_body
, but is not printed.
This means that references to the column in your code will not error, but will likely not have the intended effect.
How to update?
In most cases it is a simple change to adapt your code to the updated
structure: simply swap ci
with conf.low
.
See below for examples on how to update your code.
modify_header()
While the "ci"
column is hidden, if a new header is defined for the column it will be unhidden.
Code that changes the header of "ci"
will likely lead to duplicate columns appearing in your table
(that is, the "ci"
column and the merged "conf.low"
and "conf.high"
columns).
Old Code | Updated Code |
modify_header(ci = "Confidence Interval") | modify_header(conf.low = "Confidence Interval") |
modify_spanning_header()
Old Code | Updated Code |
modify_spanning_header(ci = "Confidence Interval") | modify_spanning_header(conf.low = "Confidence Interval") |
modify_column_merge()
Old Code | Updated Code |
modify_column_merge(pattern = "{estimate} ({ci})") | modify_column_merge(pattern = "{estimate} ({conf.low}, {conf.high})") |
modify_column_hide()
Old Code | Updated Code |
modify_column_hide(columns = "ci") | modify_column_hide(columns = "conf.low") |
inline_text()
Old Code | Updated Code |
inline_text(pattern = "{estimate} (95% CI {ci})") | inline_text(pattern = "{estimate} (95% CI {conf.low}, {conf.high})") |
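The updates in the tables above can be sketched on a small regression table. This is a minimal illustration, not a prescribed workflow; the model and the object names tbl_ci and txt_ci are assumptions made for the example.

```r
library(gtsummary)

tbl_ci <-
  lm(age ~ marker + grade, data = trial) |>
  tbl_regression() |>
  # target conf.low rather than ci: the merged interval is shown under conf.low
  modify_header(conf.low = "**Confidence Interval**")

# build the interval from its numeric parts instead of the pre-formatted {ci}
txt_ci <- inline_text(
  tbl_ci,
  variable = marker,
  pattern = "{estimate} (95% CI {conf.low}, {conf.high})"
)
txt_ci
```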
DEPRECATED Footnote
Description
Use modify_footnote_header()
and modify_abbreviation()
instead.
Usage
modify_footnote(
x,
...,
abbreviation = FALSE,
text_interpret = c("md", "html"),
update,
quiet
)
Arguments
x |
( |
... |
|
abbreviation |
(scalar |
text_interpret |
( |
update , quiet |
Value
Updated gtsummary object
Examples
# Use `modify_footnote_header()`, `modify_footnote_body()`, `modify_abbreviation()` instead.
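A possible migration away from modify_footnote() is sketched below, using the signatures of modify_footnote_header() and modify_abbreviation() documented elsewhere in this manual. The footnote and abbreviation wording and the object name tbl_notes are illustrative assumptions.

```r
library(gtsummary)

tbl_notes <-
  trial |>
  tbl_summary(include = age) |>
  # footnote attached to the statistic column header(s)
  modify_footnote_header(
    footnote = "Median (Q1, Q3) of age in years",
    columns = all_stat_cols(),
    replace = TRUE
  ) |>
  # abbreviations are collected into a single source note when printed
  modify_abbreviation("Q1 = First Quartile")

tbl_notes
```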
Filter Hierarchical Tables
Description
This function is used to filter hierarchical table rows. Filters are not applied to summary or overall rows.
Usage
filter_hierarchical(x, ...)
## S3 method for class 'tbl_hierarchical'
filter_hierarchical(x, filter, keep_empty = FALSE, ...)
## S3 method for class 'tbl_hierarchical_count'
filter_hierarchical(x, filter, keep_empty = FALSE, ...)
Arguments
x |
( |
... |
These dots are for future extensions and must be empty. |
filter |
( |
keep_empty |
(scalar |
Details
The filter
argument can be used to filter out rows of a table which do not meet the criteria provided as an
expression. Rows can be filtered on the values of any of the possible statistics (n
, p
, and N
) provided they
are included at least once in the table, as well as the values of any by
variables. Filtering is only applied to
rows that correspond to the innermost variable in the hierarchy - all outer variable (summary) rows are kept
regardless of whether they meet the filtering criteria themselves. In addition to filtering on individual statistic
values, filters can be applied across the row (i.e. across all by
variable values) by using aggregate functions
such as sum()
and mean()
.
If an overall column was added to the table (via add_overall()), this column will not be used in any filters (i.e.
sum(n) will not include the overall n in a given row). To filter on overall statistics, use the sum() function
in your filter instead (i.e. sum(n) is equal to the overall column n across any by variables).
Some examples of possible filters:
- filter = n > 5: keep rows where one of the treatment groups observed more than 5 AEs
- filter = n == 2 & p < 0.05: keep rows where one of the treatment groups observed exactly 2 AEs and one of the treatment groups observed a proportion less than 5%
- filter = sum(n) >= 4: keep rows where there were 4 or more AEs observed across the row
- filter = mean(n) > 4 | n > 3: keep rows where the mean number of AEs is more than 4 across the row or one of the treatment groups observed more than 3 AEs
- filter = any(n > 2 & TRTA == "Xanomeline High Dose"): keep rows where the "Xanomeline High Dose" treatment group observed more than 2 AEs
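The last filter in the list, which combines a statistic with a by variable value, could be applied as in the sketch below. It assumes the cards package data sets ADAE and ADSL are available; the subset and the object names adae_small, tbl_ae, and tbl_high are illustrative only.

```r
library(gtsummary)

# small illustrative subset of the adverse event data
adae_small <- cards::ADAE |>
  dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS",
                                "EAR AND LABYRINTH DISORDERS")) |>
  dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)

tbl_ae <- tbl_hierarchical(
  data = adae_small,
  variables = c(AEBODSYS, AEDECOD),
  by = TRTA,
  denominator = cards::ADSL |> dplyr::mutate(TRTA = ARM),
  id = USUBJID
)

# keep innermost rows where the high-dose arm observed more than 2 AEs;
# any() collapses the per-group comparison to one decision per row
tbl_high <- filter_hierarchical(
  tbl_ae,
  any(n > 2 & TRTA == "Xanomeline High Dose")
)
tbl_high
```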
Value
A gtsummary
of the same class as x
.
See Also
Examples
ADAE_subset <- cards::ADAE |>
dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS",
"EAR AND LABYRINTH DISORDERS")) |>
dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)
tbl <-
tbl_hierarchical(
data = ADAE_subset,
variables = c(AEBODSYS, AEDECOD),
by = TRTA,
denominator = cards::ADSL |> dplyr::mutate(TRTA = ARM),
id = USUBJID,
overall_row = TRUE
)
# Example 1 ----------------------------------
# Keep rows where fewer than 2 AEs are observed across the row
filter_hierarchical(tbl, sum(n) < 2)
# Example 2 ----------------------------------
# Keep rows where at least one treatment group in the row has at least 2 AEs observed
filter_hierarchical(tbl, n >= 2)
# Example 3 ----------------------------------
# Keep rows where AEs across the row have an overall prevalence of greater than 0.5%
filter_hierarchical(tbl, sum(n) / sum(N) > 0.005)
Extract ARDs
Description
Extract the ARDs from a gtsummary table. If needed, results may be combined
with cards::bind_ard()
.
Usage
gather_ard(x)
Arguments
x |
( |
Value
list
Examples
tbl_summary(trial, by = trt, include = age) |>
add_overall() |>
add_p() |>
gather_ard()
glm(response ~ trt, data = trial, family = binomial()) |>
tbl_regression() |>
gather_ard()
Default glance function
Description
This is an S3 generic used as the default function in add_glance*(glance_fun)
.
It's provided so various regression model classes can have their own default
functions for returning statistics.
Usage
glance_fun_s3(x, ...)
## Default S3 method:
glance_fun_s3(x, ...)
## S3 method for class 'mira'
glance_fun_s3(x, ...)
Arguments
x |
(regression model) |
... |
These dots are for future extensions and must be empty. |
Value
a function
Examples
mod <- lm(age ~ trt, trial)
glance_fun_s3(mod)
Global p-value generic
Description
An S3 generic that serves as the default for add_global_p(anova_fun)
.
The default function uses car::Anova()
(via cardx::ard_car_anova()
) to
calculate the p-values.
The method for GEE models (created from geepack::geeglm()
) returns Wald tests calculated
using aod::wald.test()
(via cardx::ard_aod_wald_test()
). For this method,
the type
argument is not used.
Usage
global_pvalue_fun(x, type, ...)
## Default S3 method:
global_pvalue_fun(x, type, ...)
## S3 method for class 'geeglm'
global_pvalue_fun(x, type, ...)
Value
data frame
Examples
lm(age ~ stage + grade, trial) |>
global_pvalue_fun(type = "III")
Report statistics from gtsummary tables inline
Description
Report statistics from gtsummary tables inline
Usage
inline_text(x, ...)
Arguments
x |
( |
... |
Additional arguments passed to other methods. |
Value
A string reporting results from a gtsummary table
Author(s)
Daniel D. Sjoberg
See Also
inline_text.tbl_summary()
, inline_text.tbl_svysummary()
,
inline_text.tbl_regression()
, inline_text.tbl_uvregression()
,
inline_text.tbl_survfit()
, inline_text.tbl_cross()
, inline_text.gtsummary()
Report statistics from summary tables inline
Description
Report statistics from summary tables inline
Usage
## S3 method for class 'gtsummary'
inline_text(x, variable, level = NULL, column = NULL, pattern = NULL, ...)
Arguments
x |
( |
variable |
( |
level |
( |
column |
( |
pattern |
( |
... |
These dots are for future extensions and must be empty. |
Value
A string
column + pattern
Some gtsummary tables report multiple statistics in a single cell,
e.g. "{mean} ({sd})"
in tbl_summary()
or tbl_svysummary()
.
We often need to report just the mean or the SD, and that can be accomplished
by using both the column=
and pattern=
arguments. When both of these
arguments are specified, the column argument selects the column to report
statistics from, and the pattern argument specifies which statistics to report,
e.g. inline_text(x, column = "stat_1", pattern = "{mean}")
reports just the
mean from a tbl_summary()
. This is not supported for all tables.
Report statistics from summary tables inline
Description
Extracts and returns statistics from a tbl_continuous()
object for
inline reporting in an R markdown document. Detailed examples in the
inline_text vignette
Usage
## S3 method for class 'tbl_continuous'
inline_text(
x,
variable,
column = NULL,
level = NULL,
pattern = NULL,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
Arguments
x |
( |
variable |
( |
column |
( |
level |
( |
pattern |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
Value
A string reporting results from a gtsummary table
Author(s)
Daniel D. Sjoberg
Examples
t1 <- trial |>
tbl_summary(by = trt, include = grade) |>
add_p()
inline_text(t1, variable = grade, level = "I", column = "Drug A", pattern = "{n}/{N} ({p}%)")
inline_text(t1, variable = grade, column = "p.value")
Report statistics from cross table inline
Description
Extracts and returns statistics from a tbl_cross
object for
inline reporting in an R markdown document. Detailed examples in the
inline_text vignette
Usage
## S3 method for class 'tbl_cross'
inline_text(
x,
col_level,
row_level = NULL,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
Arguments
x |
( |
col_level |
( |
row_level |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
Value
A string reporting results from a gtsummary table
Examples
tbl_cross <-
tbl_cross(trial, row = trt, col = response) %>%
add_p()
inline_text(tbl_cross, row_level = "Drug A", col_level = "1")
inline_text(tbl_cross, row_level = "Total", col_level = "1")
inline_text(tbl_cross, col_level = "p.value")
Report statistics from regression summary tables inline
Description
Takes an object with class tbl_regression
, and the
location of the statistic to report and returns statistics for reporting
inline in an R markdown document. Detailed examples in the
inline_text vignette
Usage
## S3 method for class 'tbl_regression'
inline_text(
x,
variable,
level = NULL,
pattern = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})",
estimate_fun = x$inputs$estimate_fun,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
Arguments
x |
( |
variable |
( |
level |
( |
pattern |
( |
estimate_fun |
( |
pvalue_fun |
function to style p-values and/or q-values.
Default is |
... |
These dots are for future extensions and must be empty. |
Value
A string reporting results from a gtsummary table
pattern argument
The following items (and more) are available to print. Use print(x$table_body)
to
print the table the estimates are extracted from.
-
{estimate}
coefficient estimate formatted with 'estimate_fun' -
{conf.low}
lower limit of confidence interval formatted with 'estimate_fun' -
{conf.high}
upper limit of confidence interval formatted with 'estimate_fun' -
{p.value}
p-value formatted with 'pvalue_fun' -
{N}
number of observations in model -
{label}
variable/variable level label
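The items listed above can be combined freely in a custom pattern. The sketch below is one illustrative arrangement; the model, the "OR" prefix, and the object names tbl_or and txt_or are assumptions, not defaults of the package.

```r
library(gtsummary)

tbl_or <-
  glm(response ~ age + grade, data = trial, family = binomial) |>
  tbl_regression(exponentiate = TRUE)

# estimate and interval use estimate_fun; the p-value uses pvalue_fun
txt_or <- inline_text(
  tbl_or,
  variable = age,
  pattern = "OR {estimate} (95% CI {conf.low}, {conf.high}); {p.value}"
)
txt_or
```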
Author(s)
Daniel D. Sjoberg
Examples
inline_text_ex1 <-
glm(response ~ age + grade, trial, family = binomial(link = "logit")) %>%
tbl_regression(exponentiate = TRUE)
inline_text(inline_text_ex1, variable = age)
inline_text(inline_text_ex1, variable = grade, level = "III")
Report statistics from summary tables inline
Description
Extracts and returns statistics from a tbl_summary()
object for
inline reporting in an R markdown document. Detailed examples in the
inline_text vignette
Usage
## S3 method for class 'tbl_summary'
inline_text(
x,
variable,
column = NULL,
level = NULL,
pattern = NULL,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
## S3 method for class 'tbl_svysummary'
inline_text(
x,
variable,
column = NULL,
level = NULL,
pattern = NULL,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
Arguments
x |
( |
variable |
( |
column |
( |
level |
( |
pattern |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
Value
A string reporting results from a gtsummary table
Author(s)
Daniel D. Sjoberg
Examples
t1 <- trial |>
tbl_summary(by = trt, include = grade) |>
add_p()
inline_text(t1, variable = grade, level = "I", column = "Drug A", pattern = "{n}/{N} ({p}%)")
inline_text(t1, variable = grade, column = "p.value")
Report statistics from survfit tables inline
Description
Extracts and returns statistics from a tbl_survfit
object for
inline reporting in an R markdown document. Detailed examples in the
inline_text vignette
Usage
## S3 method for class 'tbl_survfit'
inline_text(
x,
variable = NULL,
level = NULL,
pattern = NULL,
time = NULL,
prob = NULL,
column = NULL,
estimate_fun = x$inputs$estimate_fun,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
Arguments
x |
( |
variable |
( |
level |
( |
pattern |
( |
time , prob |
( |
column |
( |
estimate_fun |
( |
pvalue_fun |
( |
... |
These dots are for future extensions and must be empty. |
Value
A string reporting results from a gtsummary table
Author(s)
Daniel D. Sjoberg
Examples
library(survival)
# fit survfit
fit1 <- survfit(Surv(ttdeath, death) ~ trt, trial)
fit2 <- survfit(Surv(ttdeath, death) ~ 1, trial)
# summarize survfit objects
tbl1 <-
tbl_survfit(
fit1,
times = c(12, 24),
label = ~"Treatment",
label_header = "**{time} Month**"
) %>%
add_p()
tbl2 <-
tbl_survfit(
fit2,
probs = 0.5,
label_header = "**Median Survival**"
)
# report results inline
inline_text(tbl1, time = 24, level = "Drug B")
inline_text(tbl1, time = 24, level = "Drug B",
pattern = "{estimate} [95% CI {conf.low}, {conf.high}]")
inline_text(tbl1, column = p.value)
inline_text(tbl2, prob = 0.5)
Report statistics from regression summary tables inline
Description
Extracts and returns statistics from a table created by the tbl_uvregression
function for inline reporting in an R markdown document.
Detailed examples in the
inline_text vignette
Usage
## S3 method for class 'tbl_uvregression'
inline_text(
x,
variable,
level = NULL,
pattern = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})",
estimate_fun = x$inputs$estimate_fun,
pvalue_fun = label_style_pvalue(prepend_p = TRUE),
...
)
Arguments
x |
( |
variable |
( |
level |
( |
pattern |
( |
estimate_fun |
( |
pvalue_fun |
function to style p-values and/or q-values.
Default is |
... |
These dots are for future extensions and must be empty. |
Value
A string reporting results from a gtsummary table
pattern argument
The following items (and more) are available to print. Use print(x$table_body)
to
print the table the estimates are extracted from.
-
{estimate}
coefficient estimate formatted with 'estimate_fun' -
{conf.low}
lower limit of confidence interval formatted with 'estimate_fun' -
{conf.high}
upper limit of confidence interval formatted with 'estimate_fun' -
{p.value}
p-value formatted with 'pvalue_fun' -
{N}
number of observations in model -
{label}
variable/variable level label
Author(s)
Daniel D. Sjoberg
Examples
inline_text_ex1 <-
trial[c("response", "age", "grade")] %>%
tbl_uvregression(
method = glm,
method.args = list(family = binomial),
y = response,
exponentiate = TRUE
)
inline_text(inline_text_ex1, variable = age)
inline_text(inline_text_ex1, variable = grade, level = "III")
Is a date/time
Description
is_date_time()
: Predicate for date, time, or date-time vector identification.
Usage
is_date_time(x)
Arguments
x |
a vector |
Value
a scalar logical
Examples
iris |>
dplyr::mutate(date = as.Date("2000-01-01") + dplyr::row_number()) |>
lapply(gtsummary:::is_date_time)
Special Character Escape
Description
These utility functions were copied from the internals of kableExtra,
and assist in escaping special characters in LaTeX and HTML tables.
These functions assist in the creation of tables via as_kable_extra().
Usage
.escape_html(x)
.escape_latex(x, newlines = TRUE, align = "c")
.escape_latex2(x, newlines = TRUE, align = "c")
Arguments
x |
character vector |
Value
character vector with escaped special characters
See Also
as_kable_extra()
Examples
.escape_latex(c("%", "{test}"))
.escape_html(c(">0.9", "line\nbreak"))
Style Functions
Description
Similar to the style_*()
family of functions, but these functions return
a style_*()
function rather than performing the styling.
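The relationship between the two families can be sketched as follows: a label_style_*() call returns a function, and applying that function should match calling the corresponding style_*() function with the same arguments. The input value and the object names fmt_p, via_label, and via_style are illustrative.

```r
library(gtsummary)

# build a reusable p-value formatter
fmt_p <- label_style_pvalue(digits = 2, prepend_p = TRUE)

# apply the returned function, then call style_pvalue() directly
via_label <- fmt_p(0.0312)
via_style <- style_pvalue(0.0312, digits = 2, prepend_p = TRUE)

identical(via_label, via_style)
```

Returning a function is convenient for arguments such as pvalue_fun and estimate_fun, which expect a function rather than a formatted value.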
Usage
label_style_number(
digits = 0,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
scale = 1,
prefix = "",
suffix = "",
na = NA_character_,
...
)
label_style_sigfig(
digits = 2,
scale = 1,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
prefix = "",
suffix = "",
na = NA_character_,
...
)
label_style_pvalue(
digits = 1,
prepend_p = FALSE,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
na = NA_character_,
...
)
label_style_ratio(
digits = 2,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
prefix = "",
suffix = "",
na = NA_character_,
...
)
label_style_percent(
prefix = "",
suffix = "",
digits = 0,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
na = NA_character_,
...
)
Arguments
digits , big.mark , decimal.mark , scale , prepend_p , prefix , suffix , na , ... |
arguments
passed to the |
Value
a function
See Also
Other style tools:
style_sigfig()
Examples
my_style <- label_style_number(digits = 1)
my_style(3.14)
Modify column headers, footnotes, and spanning headers
Description
These functions assist with modifying the aesthetics/style of a table.
-
modify_header()
update column headers -
modify_spanning_header()
update/add spanning headers
The functions often require users to know the underlying column names.
Run show_header_names()
to print the column names to the console.
Usage
modify_header(x, ..., text_interpret = c("md", "html"), quiet, update)
modify_spanning_header(
x,
...,
text_interpret = c("md", "html"),
level = 1L,
quiet,
update
)
remove_spanning_header(x, columns = everything(), level = 1L)
show_header_names(x, show_hidden = FALSE, include_example, quiet)
Arguments
x |
( |
... |
Use Use the |
text_interpret |
( |
update , quiet |
|
level |
( |
columns |
( |
(scalar | |
include_example |
Value
Updated gtsummary object
tbl_summary()
, tbl_svysummary()
, and tbl_cross()
When assigning column headers and spanning headers,
you may use {N}
to insert the number of observations.
tbl_svysummary
objects additionally have {N_unweighted}
available.
When there is a stratifying by=
argument present, the following fields are
additionally available to stratifying columns: {level}
, {n}
, and {p}
({n_unweighted}
and {p_unweighted}
for tbl_svysummary
objects)
Syntax follows glue::glue()
, e.g. all_stat_cols() ~ "**{level}**, N = {n}"
.
tbl_regression()
When assigning column headers for tbl_regression
tables,
you may use {N}
to insert the number of observations, and {N_event}
for the number of events (when applicable).
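A minimal sketch of the {N} field in a regression table header is shown below; the header wording and the object name tbl_n are assumptions made for the example.

```r
library(gtsummary)

tbl_n <-
  lm(age ~ marker + grade, data = trial) |>
  tbl_regression() |>
  # {N} is filled in with the number of observations in the model
  modify_header(estimate = "**Beta** (N = {N})")

tbl_n
```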
Author(s)
Daniel D. Sjoberg
Examples
# create summary table
tbl <- trial |>
tbl_summary(by = trt, missing = "no", include = c("age", "grade", "trt")) |>
add_p()
# print the column names that can be modified
show_header_names(tbl)
# Example 1 ----------------------------------
# updating column headers
tbl |>
modify_header(label = "**Variable**", p.value = "**P**")
# Example 2 ----------------------------------
# updating headers and adding a spanning header
tbl |>
modify_header(all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)") |>
modify_spanning_header(all_stat_cols() ~ "**Treatment Received**")
Modify Abbreviations
Description
All abbreviations will be coalesced when printing the final table into a single source note.
Usage
modify_abbreviation(x, abbreviation, text_interpret = c("md", "html"))
remove_abbreviation(x, abbreviation = NULL)
Arguments
x |
( |
abbreviation |
( |
text_interpret |
( |
Value
Updated gtsummary object
Examples
# Example 1 ----------------------------------
tbl_summary(
trial,
by = trt,
include = age,
type = age ~ "continuous2"
) |>
modify_table_body(~dplyr::mutate(.x, label = sub("Q1, Q3", "IQR", x = label))) |>
modify_abbreviation("IQR = Interquartile Range")
# Example 2 ----------------------------------
lm(marker ~ trt, trial) |>
tbl_regression() |>
remove_abbreviation("CI = Confidence Interval")
Modify Bold and Italic
Description
Add or remove bold and italic styling to a cell in a table. By default, the remove functions will remove all bold/italic styling.
Usage
modify_bold(x, columns, rows)
remove_bold(x, columns = everything(), rows = TRUE)
modify_italic(x, columns, rows)
remove_italic(x, columns = everything(), rows = TRUE)
Arguments
x |
( |
columns |
( |
rows |
(predicate |
Value
Updated gtsummary object
Examples
# Example 1 ----------------------------------
tbl <- trial |>
tbl_summary(include = grade) |>
modify_bold(columns = label, rows = row_type == "label") |>
modify_italic(columns = label, rows = row_type == "level")
tbl
# Example 2 ----------------------------------
tbl |>
remove_bold(columns = label, rows = row_type == "label") |>
remove_italic(columns = label, rows = row_type == "level")
Modify table caption
Description
Captions are assigned based on output type.
-
gt::gt(caption=)
-
flextable::set_caption(caption=)
-
huxtable::set_caption(value=)
-
knitr::kable(caption=)
Usage
modify_caption(x, caption, text_interpret = c("md", "html"))
Arguments
x |
( |
caption |
( |
text_interpret |
( |
Value
Updated gtsummary object
Examples
trial |>
tbl_summary(by = trt, include = c(marker, stage)) |>
modify_caption(caption = "**Baseline Characteristics** N = {N}")
Modify column alignment
Description
Update column alignment/justification in a gtsummary table.
Usage
modify_column_alignment(x, columns, align = c("left", "right", "center"))
Arguments
x |
( |
columns |
( |
align |
( |
Examples
# Example 1 ----------------------------------
lm(age ~ marker + grade, trial) %>%
tbl_regression() %>%
modify_column_alignment(columns = everything(), align = "left")
Modify hidden columns
Description
Use these functions to hide or unhide columns in a gtsummary table.
Use show_header_names(show_hidden=TRUE)
to print available columns to update.
Usage
modify_column_hide(x, columns)
modify_column_unhide(x, columns)
Arguments
x |
( |
columns |
( |
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
# hide 95% CI, and replace with standard error
lm(age ~ marker + grade, trial) |>
tbl_regression() |>
modify_column_hide(conf.low) |>
modify_column_unhide(columns = std.error)
Modify Column Merging
Description
Merge two or more columns in a gtsummary table.
Use show_header_names()
to print underlying column names.
Usage
modify_column_merge(x, pattern, rows = NULL)
remove_column_merge(x, columns = everything())
Arguments
x |
( |
pattern |
( |
rows |
(predicate |
columns |
( |
Value
gtsummary table
Details
- Calling this function merely records the instructions to merge columns. The actual merging occurs when the gtsummary table is printed or converted with a function like as_gt().
- Because the column merging is delayed, it is recommended to perform major modifications to the table, such as those with tbl_merge() and tbl_stack(), before assigning merging instructions. Otherwise, unexpected formatting may occur in the final table.
- If this functionality is used in conjunction with tbl_stack() (which includes tbl_uvregression()), there may be potential issues with printing. When columns are stacked AND when the column-merging is defined with a quosure, you may run into issues due to the loss of the environment when 2 or more quosures are combined. If the expression version of the quosure is the same as the quosure (i.e. no evaluated objects), there should be no issues.
This function is used internally with care, and it is not recommended for users.
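The delayed merging described above can be observed with a short sketch: after modify_column_merge() the columns in .$table_body should still be numeric, and the merged text appears only once the table is converted. The object names tbl_merged and merged_out are illustrative, and as_tibble() is used here simply as one convenient conversion function.

```r
library(gtsummary)

tbl_merged <-
  lm(marker ~ age + grade, data = trial) |>
  tbl_regression() |>
  modify_column_merge(
    pattern = "{estimate} ({conf.low}, {conf.high})",
    rows = !is.na(estimate)
  )

# merge instructions are only recorded; the underlying column stays numeric
is.numeric(tbl_merged$table_body$estimate)

# the merge is applied on conversion, producing formatted character output
merged_out <- as_tibble(tbl_merged, col_labels = FALSE)
merged_out$estimate
```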
Future Updates
There are planned updates to the implementation of this function
with respect to the pattern=
argument.
Currently, this function replaces a numeric column with a
formatted character column following pattern=
.
Once gt::cols_merge()
gains the rows=
argument the
implementation will be updated to use it, which will keep
numeric columns numeric. For the vast majority of users,
the planned change will go unnoticed.
See Also
Other Advanced modifiers:
modify_indent()
,
modify_table_styling()
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(by = trt, missing = "no", include = c(age, marker, trt)) |>
add_p(all_continuous() ~ "t.test", pvalue_fun = label_style_pvalue(prepend_p = TRUE)) |>
modify_fmt_fun(statistic ~ label_style_sigfig()) |>
modify_column_merge(pattern = "t = {statistic}; {p.value}") |>
modify_header(statistic = "**t-test**")
# Example 2 ----------------------------------
lm(marker ~ age + grade, trial) |>
tbl_regression() |>
modify_column_merge(
pattern = "{estimate} ({conf.low}, {conf.high})",
rows = !is.na(estimate)
)
Modify formatting functions
Description
Use this function to update the way numeric columns and rows of .$table_body
are formatted.
Usage
modify_fmt_fun(x, ..., rows = NULL, update, quiet)
Arguments
x |
( |
... |
Use Use the |
rows |
(predicate |
update , quiet |
rows argument
The rows argument accepts a predicate expression that is used to specify
rows to apply formatting. The expression must evaluate to a logical when
evaluated in x$table_body
. For example, to apply formatting to the age rows
pass rows = variable == "age"
. A vector of row numbers is NOT acceptable.
A couple of things to note when using the rows argument.
- You can use saved objects to create the predicate argument, e.g. rows = variable == letters[1].
- The saved object cannot share a name with a column in x$table_body. The reason for this is that in tbl_merge() the columns are renamed, and the renaming process cannot disambiguate the variable column from an external object named variable in the following expression: rows = .data$variable == .env$variable.
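A minimal sketch of a rows predicate built from a saved object follows. The object name target_var is an assumption chosen so that it does not collide with any column name in x$table_body.

```r
library(gtsummary)

# saved object used inside the predicate; its name must differ from
# every column name in the table body (e.g. do not call it `variable`)
target_var <- "grade"

tbl_fmt <-
  lm(age ~ marker + grade, data = trial) |>
  tbl_regression() |>
  modify_fmt_fun(
    p.value = label_style_pvalue(digits = 3),
    rows = variable == target_var
  )

tbl_fmt
```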
Examples
# Example 1 ----------------------------------
# show 'grade' p-values to 3 decimal places and estimates to 4 sig figs
lm(age ~ marker + grade, trial) |>
tbl_regression() %>%
modify_fmt_fun(
p.value = label_style_pvalue(digits = 3),
c(estimate, conf.low, conf.high) ~ label_style_sigfig(digits = 4),
rows = variable == "grade"
)
Modify Footnotes
Description
Modify Footnotes
Usage
modify_footnote_header(
x,
footnote,
columns,
replace = TRUE,
text_interpret = c("md", "html")
)
modify_footnote_body(
x,
footnote,
columns,
rows,
replace = TRUE,
text_interpret = c("md", "html")
)
modify_footnote_spanning_header(
x,
footnote,
columns,
level = 1L,
replace = TRUE,
text_interpret = c("md", "html")
)
remove_footnote_header(x, columns = everything())
remove_footnote_body(x, columns = everything(), rows = TRUE)
remove_footnote_spanning_header(x, columns = everything(), level = 1L)
Arguments
x |
( |
footnote |
( |
columns |
( For |
replace |
(scalar |
text_interpret |
( |
rows |
(predicate |
level |
( |
Value
Updated gtsummary object
Examples
# Example 1 ----------------------------------
tbl <- trial |>
tbl_summary(by = trt, include = c(age, grade), missing = "no") |>
modify_footnote_header(
footnote = "All but four subjects received both treatments in a crossover design",
columns = all_stat_cols(),
replace = FALSE
) |>
modify_footnote_body(
footnote = "Tumor grade was assessed _before_ treatment began",
columns = "label",
rows = variable == "grade" & row_type == "label"
)
tbl
# Example 2 ----------------------------------
# remove all footnotes
tbl |>
remove_footnote_header(columns = all_stat_cols()) |>
remove_footnote_body(columns = label, rows = variable == "grade" & row_type == "label")
Modify column indentation
Description
Add, increase, or reduce indentation for columns.
Usage
modify_indent(x, columns, rows = NULL, indent = 4L, double_indent, undo)
Arguments
x |
( |
columns |
( |
rows |
(predicate |
indent |
( |
double_indent , undo |
Value
a gtsummary table
See Also
Other Advanced modifiers:
modify_column_merge()
,
modify_table_styling()
Examples
# remove indentation from `tbl_summary()`
trial |>
tbl_summary(include = grade) |>
modify_indent(columns = label, indent = 0L)
# increase indentation in `tbl_summary`
trial |>
tbl_summary(include = grade) |>
modify_indent(columns = label, rows = !row_type %in% 'label', indent = 8L)
Modify Missing Substitution
Description
Specify how missing values will be represented in the printed table.
By default, a blank space is printed for all NA
values.
Usage
modify_missing_symbol(x, symbol, columns, rows)
Arguments
x |
( |
symbol |
( |
columns |
( |
rows |
(predicate |
Value
Updated gtsummary object
Examples
# Use the abbreviation "Ref." for reference rows instead of the em-dash
lm(marker ~ trt, data = trial) |>
tbl_regression() |>
modify_missing_symbol(
symbol = "Ref.",
columns = c(estimate, conf.low, conf.high),
rows = reference_row == TRUE
)
Modify post formatting
Description
Apply a formatting function after the primary formatting functions have been applied.
The function is similar to gt::text_transform()
.
Usage
modify_post_fmt_fun(x, fmt_fun, columns, rows = TRUE)
Arguments
x |
( |
fmt_fun |
( |
columns |
( |
rows |
(predicate |
Value
Updated gtsummary object
Examples
# Example 1 ----------------------------------
data.frame(x = FALSE) |>
tbl_summary(type = x ~ "categorical") |>
modify_post_fmt_fun(
fmt_fun = ~ifelse(. == "0 (0%)", "0", .),
columns = all_stat_cols()
)
Modify source note
Description
Add and remove source notes from a table. Source notes are similar to footnotes, except they are not linked to a cell in the table.
Usage
modify_source_note(x, source_note, text_interpret = c("md", "html"))
remove_source_note(x, source_note_id = NULL)
Arguments
x |
( |
source_note |
( |
text_interpret |
( |
source_note_id |
( |
Details
Source notes are not supported by as_kable_extra()
.
Value
gtsummary object
Examples
# Example 1 ----------------------------------
tbl <- tbl_summary(trial, include = c(marker, grade), missing = "no") |>
modify_source_note("Results as of June 26, 2015")
# Example 2 ----------------------------------
remove_source_note(tbl, source_note_id = 1)
Modify Table Body
Description
Function is for advanced manipulation of gtsummary tables.
It allows users to modify the .$table_body
data frame included
in each gtsummary object.
If a new column is added to the table, default printing instructions will then
be added to .$table_styling
. By default, columns are hidden.
To show a column, add a column header with modify_header()
or call
modify_column_unhide()
.
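The hidden-by-default behavior for new columns can be sketched as below. The column name note, its contents, and the object names tbl_hidden and tbl_shown are illustrative assumptions.

```r
library(gtsummary)

# add a new column to the table body; it is hidden by default
tbl_hidden <-
  trial |>
  tbl_summary(include = age) |>
  modify_table_body(~ dplyr::mutate(.x, note = "illustrative note"))

hdr_hidden <- tbl_hidden$table_styling$header
hdr_hidden$hide[hdr_hidden$column == "note"]

# assigning a header unhides the column
tbl_shown <- tbl_hidden |>
  modify_header(note = "**Note**")

hdr_shown <- tbl_shown$table_styling$header
hdr_shown$hide[hdr_shown$column == "note"]
```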
Usage
modify_table_body(x, fun, ...)
Arguments
x |
( |
fun |
( |
... |
Additional arguments passed on to the function |
Value
A 'gtsummary' object
Examples
# Example 1 --------------------------------
# Add number of cases and controls to regression table
trial |>
tbl_uvregression(
y = response,
include = c(age, marker),
method = glm,
method.args = list(family = binomial),
exponentiate = TRUE,
hide_n = TRUE
) |>
# adding number of non-events to table
modify_table_body(
~ .x %>%
dplyr::mutate(N_nonevent = N_obs - N_event) |>
dplyr::relocate(c(N_event, N_nonevent), .before = estimate)
) |>
# assigning header labels
modify_header(N_nonevent = "**Control N**", N_event = "**Case N**") |>
modify_fmt_fun(c(N_event, N_nonevent) ~ style_number)
Modify Table Styling
Description
This function is for developers. This function has very little checking of the passed arguments, by design.
If you are not a developer, it's recommended that you use the following functions to make modifications to your table:
modify_header(), modify_spanning_header(), modify_column_hide(), modify_column_unhide(),
modify_footnote_header(), modify_footnote_body(), modify_abbreviation(),
modify_column_alignment(), modify_fmt_fun(), modify_indent(),
modify_column_merge(), modify_missing_symbol(), modify_bold(),
modify_italic().
This function provides control over the characteristics of the resulting
gtsummary table by directly modifying .$table_styling.
Review the gtsummary definition vignette for information on .$table_styling objects.
Usage
modify_table_styling(
x,
columns,
rows = NULL,
label = NULL,
spanning_header = NULL,
hide = NULL,
footnote = NULL,
footnote_abbrev = NULL,
align = NULL,
missing_symbol = NULL,
fmt_fun = NULL,
text_format = NULL,
undo_text_format = NULL,
indent = NULL,
text_interpret = "md",
cols_merge_pattern = NULL
)
Arguments
x |
( |
columns |
( |
rows |
(predicate |
label |
( |
spanning_header |
( |
hide |
(scalar |
footnote |
( |
footnote_abbrev |
( |
align |
( |
missing_symbol |
( |
fmt_fun |
( |
text_format , undo_text_format |
( |
indent |
( |
text_interpret |
( |
cols_merge_pattern |
( |
rows argument
The rows argument accepts a predicate expression that is used to specify
rows to apply formatting. The expression must evaluate to a logical when
evaluated in x$table_body. For example, to apply formatting to the age rows
pass rows = variable == "age". A vector of row numbers is NOT acceptable.
A couple of things to note when using the rows argument.
- You can use saved objects to create the predicate argument, e.g. rows = variable == letters[1].
- The saved object cannot share a name with a column in x$table_body. The reason for this is that in tbl_merge() the columns are renamed, and the renaming process cannot disambiguate the variable column from an external object named variable in the following expression: rows = .data$variable == .env$variable.
cols_merge_pattern argument
There are planned updates to the implementation of column merging.
Currently, this function replaces the numeric column with a
formatted character column following cols_merge_pattern=.
Once gt::cols_merge() gains the rows= argument the
implementation will be updated to use it, which will keep
numeric columns numeric. For the vast majority of users,
the planned change will go unnoticed.
If this functionality is used in conjunction with tbl_stack() (which
includes tbl_uvregression()), there is a potential issue with printing.
When columns are stacked AND when the column-merging is
defined with a quosure, you may run into issues due to the loss of the
environment when 2 or more quosures are combined. If the expression
version of the quosure is the same as the quosure (i.e. no evaluated
objects), there should be no issues. Regardless, this argument is used
internally with care, and it is not recommended for users.
See Also
See gtsummary internals vignette
Other Advanced modifiers:
modify_column_merge()
,
modify_indent()
Plot Regression Coefficients
Description
The plot() function extracts x$table_body and passes it to
ggstats::ggcoef_plot() along with formatting options.
Usage
## S3 method for class 'tbl_regression'
plot(x, remove_header_rows = TRUE, remove_reference_rows = FALSE, ...)
## S3 method for class 'tbl_uvregression'
plot(x, remove_header_rows = TRUE, remove_reference_rows = FALSE, ...)
Arguments
x |
( |
remove_header_rows |
(scalar |
remove_reference_rows |
(scalar |
... |
arguments passed to |
Details
Value
a ggplot
Examples
glm(response ~ marker + grade, trial, family = binomial) |>
tbl_regression(
add_estimate_to_reference_rows = TRUE,
exponentiate = TRUE
) |>
plot()
print and knit_print methods for gtsummary objects
Description
print and knit_print methods for gtsummary objects
Usage
## S3 method for class 'gtsummary'
print(
x,
print_engine = c("gt", "flextable", "huxtable", "kable", "kable_extra", "tibble"),
...
)
## S3 method for class 'gtsummary'
knit_print(
x,
print_engine = c("gt", "flextable", "huxtable", "kable", "kable_extra", "tibble"),
...
)
pkgdown_print.gtsummary(x, visible = TRUE)
Arguments
x |
An object created using gtsummary functions |
print_engine |
String indicating the print method. Must be one of
|
... |
Not used |
Author(s)
Daniel D. Sjoberg
Summarize a proportion
Description
This helper, to be used with tbl_custom_summary(), creates a function
computing a proportion and its confidence interval.
Usage
proportion_summary(
variable,
value,
weights = NULL,
na.rm = TRUE,
conf.level = 0.95,
method = c("wilson", "wilson.no.correct", "wald", "wald.no.correct", "exact",
"agresti.coull", "jeffreys")
)
Arguments
variable |
( |
value |
( |
weights |
( |
na.rm |
(scalar |
conf.level |
(scalar |
method |
( |
Details
Computed statistics:
- {n} numerator, number of observations equal to value
- {N} denominator, number of observations
- {prop} proportion, i.e. n/N
- {conf.low} lower confidence interval
- {conf.high} upper confidence interval
Methods c("wilson", "wilson.no.correct") are calculated with
stats::prop.test() (with correct = c(TRUE, FALSE)). The default method,
"wilson", includes the Yates continuity correction.
Methods c("exact", "asymptotic") are calculated with Hmisc::binconf()
and the corresponding method.
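As a minimal sketch of the continuity correction mentioned above, the base R snippet below calls stats::prop.test() directly with and without correct = TRUE. The counts (15 successes out of 50) are made up for illustration, and the snippet does not claim to reproduce how proportion_summary() calls prop.test() internally.

```r
# Illustrative counts only (not taken from any gtsummary data set)
successes <- 15
total <- 50

# Interval with the Yates continuity correction (correct = TRUE)
ci_corrected <- stats::prop.test(
  x = successes, n = total, conf.level = 0.95, correct = TRUE
)$conf.int

# Interval without the continuity correction (correct = FALSE)
ci_uncorrected <- stats::prop.test(
  x = successes, n = total, conf.level = 0.95, correct = FALSE
)$conf.int

# The corrected interval is the wider of the two
rbind(corrected = ci_corrected, uncorrected = ci_uncorrected)
```

The continuity correction widens the interval on both sides, which is why the two methods can give visibly different limits for small samples.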
Author(s)
Joseph Larmarange
Examples
# Example 1 ----------------------------------
Titanic |>
as.data.frame() |>
tbl_custom_summary(
include = c("Age", "Class"),
by = "Sex",
stat_fns = ~ proportion_summary("Survived", "Yes", weights = "Freq"),
statistic = ~ "{prop}% ({n}/{N}) [{conf.low}-{conf.high}]",
digits = ~ list(
prop = label_style_percent(digits = 1),
n = 0,
N = 0,
conf.low = label_style_percent(),
conf.high = label_style_percent()
),
overall_row = TRUE,
overall_row_last = TRUE
) |>
bold_labels() |>
modify_footnote_header("Proportion (%) of survivors (n/N) [95% CI]", columns = all_stat_cols())
Summarize the ratio of two variables
Description
This helper, to be used with tbl_custom_summary(), creates a function
computing the ratio of two continuous variables and its confidence interval.
Usage
ratio_summary(numerator, denominator, na.rm = TRUE, conf.level = 0.95)
Arguments
numerator |
( |
denominator |
( |
na.rm |
(scalar |
conf.level |
(scalar |
Details
Computed statistics:
- {num} sum of the variable defined by numerator
- {denom} sum of the variable defined by denominator
- {ratio} ratio of num by denom
- {conf.low} lower confidence interval
- {conf.high} upper confidence interval
Confidence interval is computed with stats::poisson.test(), if and only if
num is an integer.
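The integer requirement comes from stats::poisson.test() itself, which only accepts a nonnegative integer count. The sketch below calls it directly with made-up totals (12 events over 340.5 units of exposure); passing the denominator as the T argument is an assumption about how such a ratio would be set up, not a statement about the internals of ratio_summary().

```r
# Illustrative totals only: an integer event count and a continuous exposure
num <- 12
denom <- 340.5

# Exact Poisson rate and confidence interval for num events over denom units
pt <- stats::poisson.test(x = num, T = denom, conf.level = 0.95)

ratio <- unname(pt$estimate)  # equals num / denom
conf_low <- pt$conf.int[1]
conf_high <- pt$conf.int[2]

c(ratio = ratio, conf.low = conf_low, conf.high = conf_high)
```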
Author(s)
Joseph Larmarange
Examples
# Example 1 ----------------------------------
trial |>
tbl_custom_summary(
include = c("stage", "grade"),
by = "trt",
stat_fns = ~ ratio_summary("response", "ttdeath"),
statistic = ~"{ratio} [{conf.low}; {conf.high}] ({num}/{denom})",
digits = ~ c(ratio = 3, conf.low = 2, conf.high = 2),
overall_row = TRUE,
overall_row_label = "All stages & grades"
) |>
bold_labels() |>
modify_footnote_header("Ratio [95% CI] (n/N)", columns = all_stat_cols())
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- dplyr: %>%, all_of, any_of, as_tibble, contains, ends_with, everything, last_col, matches, mutate, num_range, one_of, select, starts_with, vars, where
Remove rows
Description
Removes either the header, reference, or missing rows from a gtsummary table.
Usage
remove_row_type(
x,
variables = everything(),
type = c("header", "reference", "missing", "level", "all"),
level_value = NULL
)
Arguments
x |
( |
variables |
( |
type |
( |
level_value |
( |
Value
Modified gtsummary table
Examples
# Example 1 ----------------------------------
trial |>
dplyr::mutate(
age60 = ifelse(age < 60, "<60", "60+")
) |>
tbl_summary(by = trt, missing = "no", include = c(trt, age, age60)) |>
remove_row_type(age60, type = "header")
rows argument
Description
The rows argument accepts a predicate expression that is used to specify
rows to apply formatting. The expression must evaluate to a logical when
evaluated in x$table_body. For example, to apply formatting to the age rows
pass rows = variable == "age". A vector of row numbers is NOT acceptable.
The x$table_body contains columns that are hidden in the final print of
a table that are often useful for defining these expressions; print the table
to view all columns available.
A couple of things to note when using the rows argument.
- You can use saved objects to create the predicate argument, e.g. rows = variable == letters[1].
- The saved object cannot share a name with a column in x$table_body. The reason for this is that in tbl_merge() the columns are renamed, and the renaming process cannot disambiguate the variable column from an external object named variable in the following expression: rows = .data$variable == .env$variable.
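The following is a minimal sketch of how a rows predicate behaves, using modify_table_styling() as documented earlier in this manual. The object name target_var is hypothetical and was chosen so that it does not collide with any column of x$table_body.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade))

# The predicate is evaluated against the columns of tbl$table_body,
# so it yields one logical value per row of the table body
is_age_row <- with(tbl$table_body, variable == "age")

# The same predicate passed unevaluated to the rows argument
tbl_bold_age <- modify_table_styling(
  tbl,
  columns = label,
  rows = variable == "age",
  text_format = "bold"
)

# A saved object may be used, provided its name is not a table_body column
target_var <- "grade"
tbl_bold_grade <- modify_table_styling(
  tbl,
  columns = label,
  rows = variable == target_var,
  text_format = "bold"
)
```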
Scoping for Table Body and Header
Description
scope_table_body()
This function uses the information in .$table_body and adds it
as attributes to data (if passed). Once they've been assigned as
proper gtsummary attributes, gtsummary selectors like all_continuous()
will work properly.
Columns c("var_type", "test_name", "contrasts_type") and columns that
begin with "selector_*" are scoped. The values of these columns are
added as attributes to a data frame. For example, if var_type='continuous'
for variable "age", then the attribute
attr(.$age, 'gtsummary.var_type') <- 'continuous' is set.
That attribute is then used in a selector like all_continuous().
scope_header()
This function takes information from .$table_styling$header and adds it
to table_body. The columns that begin with 'modify_selector_' and the hide
column are added.
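The attribute mechanism described for scope_table_body() can be illustrated with base R alone. This is a sketch under stated assumptions: the data frame df and the lookup at the end are hypothetical and only mimic what a selector such as all_continuous() might do with the gtsummary.var_type attribute; they are not the package's implementation.

```r
# Hypothetical data frame standing in for the `data` argument
df <- data.frame(age = c(40, 52, 61), grade = c("I", "II", "III"))

# Attach the variable type to each column, in the form shown above
attr(df$age, "gtsummary.var_type") <- "continuous"
attr(df$grade, "gtsummary.var_type") <- "categorical"

# A selector-like lookup: keep the columns whose attribute is "continuous"
is_continuous <- vapply(
  df,
  function(col) identical(attr(col, "gtsummary.var_type"), "continuous"),
  logical(1)
)
continuous_vars <- names(df)[is_continuous]
continuous_vars
```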
Usage
scope_table_body(table_body, data = NULL)
scope_header(table_body, header = NULL)
Arguments
table_body |
a data frame from |
data |
an optional data frame the attributes will be added to |
header |
the header data frame from |
Value
a data frame
Examples
tbl <- tbl_summary(trial, include = c(age, grade))
scope_table_body(tbl$table_body) |> select(all_continuous()) |> names()
Select helper functions
Description
Set of functions to supplement the {tidyselect} set of functions for selecting columns of data frames (and other items as well).
- all_continuous() selects continuous variables
- all_continuous2() selects only type "continuous2"
- all_categorical() selects categorical (including "dichotomous") variables
- all_dichotomous() selects only type "dichotomous"
- all_tests() selects variables by the name of the test performed
- all_stat_cols() selects columns from tbl_summary/tbl_svysummary object with summary statistics (i.e. "stat_0", "stat_1", "stat_2", etc.)
- all_interaction() selects interaction terms from a regression model
- all_intercepts() selects intercept terms from a regression model
- all_contrasts() selects variables in regression model based on their type of contrast
Usage
all_continuous(continuous2 = TRUE)
all_continuous2()
all_categorical(dichotomous = TRUE)
all_dichotomous()
all_tests(tests)
all_intercepts()
all_interaction()
all_contrasts(
contrasts_type = c("treatment", "sum", "poly", "helmert", "sdif", "other")
)
all_stat_cols(stat_0 = TRUE)
Arguments
continuous2 |
(scalar |
dichotomous |
(scalar |
tests |
( |
contrasts_type |
( |
stat_0 |
(scalar |
Value
A character vector of column names selected
See Also
Review list, formula, and selector syntax used throughout gtsummary
Examples
select_ex1 <-
trial |>
select(age, response, grade) |>
tbl_summary(
statistic = all_continuous() ~ "{mean} ({sd})",
type = all_dichotomous() ~ "categorical"
)
Create footnotes for individual p-values
Description
The usual presentation of footnotes for p-values on a gtsummary table is
to have a single footnote that lists all statistical tests that were used to
compute p-values on a given table. The separate_p_footnotes() function
separates aggregated p-value footnotes to individual footnotes that denote
the specific test used for each of the p-values.
Usage
separate_p_footnotes(x)
Arguments
x |
( |
Examples
# Example 1 ----------------------------------
trial |>
tbl_summary(by = trt, include = c(age, grade)) |>
add_p() |>
separate_p_footnotes()
Set gtsummary theme
Description
Functions to set, reset, get, and evaluate with gtsummary themes.
- set_gtsummary_theme() set a theme
- reset_gtsummary_theme() reset themes
- get_gtsummary_theme() get a named list with all active theme elements
- with_gtsummary_theme() evaluate an expression with a theme temporarily set
- check_gtsummary_theme() checks if passed theme is valid
Usage
set_gtsummary_theme(x, quiet)
reset_gtsummary_theme()
get_gtsummary_theme()
with_gtsummary_theme(
x,
expr,
env = rlang::caller_env(),
msg_ignored_elements = NULL
)
check_gtsummary_theme(x)
Arguments
x |
(named |
quiet |
|
expr |
( |
env |
( |
msg_ignored_elements |
( |
Details
The default formatting and styling throughout the gtsummary package are taken from the published reporting guidelines of the top four urology journals: European Urology, The Journal of Urology, Urology and the British Journal of Urology International. Use this function to change the default reporting style to match another journal, or your own personal style.
See Also
Available gtsummary themes
Examples
# Setting JAMA theme for gtsummary
set_gtsummary_theme(theme_gtsummary_journal("jama"))
# Themes can be combined by including more than one
set_gtsummary_theme(theme_gtsummary_compact())
set_gtsummary_theme_ex1 <-
trial |>
tbl_summary(by = trt, include = c(age, grade, trt)) |>
add_stat_label() |>
as_gt()
# reset gtsummary theme
reset_gtsummary_theme()
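As a sketch of the temporary alternative listed in the Description, with_gtsummary_theme() evaluates a single expression under a theme without changing the session-wide setting. The example assumes theme_gtsummary_compact(set_theme = FALSE) is used to obtain the theme elements as a list without activating them.

```r
library(gtsummary)

# Build one table under the compact theme; the theme applies only inside expr
tbl_compact <- with_gtsummary_theme(
  x = theme_gtsummary_compact(set_theme = FALSE),
  expr = trial |>
    tbl_summary(by = trt, include = c(age, grade)) |>
    as_gt()
)
```

Converting with as_gt() inside expr keeps the conversion under the temporary theme as well.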
Sort/filter by p-values
Description
Sort/filter by p-values
Usage
sort_p(x, q = FALSE)
filter_p(x, q = FALSE, t = 0.05)
Arguments
x |
( |
q |
(scalar |
t |
(scalar |
Author(s)
Karissa Whiting, Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial %>%
select(age, grade, response, trt) %>%
tbl_summary(by = trt) %>%
add_p() %>%
filter_p(t = 0.8) %>%
sort_p()
# Example 2 ----------------------------------
glm(response ~ trt + grade, trial, family = binomial(link = "logit")) %>%
tbl_regression(exponentiate = TRUE) %>%
sort_p()
Sort Hierarchical Tables
Description
This function is used to sort hierarchical tables. Options for sorting criteria are:
Descending - within each section of the hierarchy table, event rate sums are calculated for each row and rows are sorted in descending order by sum (default).
Alphanumeric - rows are ordered alphanumerically (i.e. A to Z) by label text. By default, tbl_hierarchical() sorts tables in alphanumeric order.
Usage
sort_hierarchical(x, ...)
## S3 method for class 'tbl_hierarchical'
sort_hierarchical(x, sort = c("descending", "alphanumeric"), ...)
## S3 method for class 'tbl_hierarchical_count'
sort_hierarchical(x, sort = c("descending", "alphanumeric"), ...)
Arguments
x |
( |
... |
These dots are for future extensions and must be empty. |
sort |
(
Defaults to |
Value
A gtsummary table of the same class as x.
See Also
Examples
theme_gtsummary_compact()
ADAE_subset <- cards::ADAE |>
dplyr::filter(AEBODSYS %in% c("SKIN AND SUBCUTANEOUS TISSUE DISORDERS",
"EAR AND LABYRINTH DISORDERS")) |>
dplyr::filter(.by = AEBODSYS, dplyr::row_number() < 20)
tbl <-
tbl_hierarchical(
data = ADAE_subset,
variables = c(AEBODSYS, AEDECOD),
by = TRTA,
denominator = cards::ADSL |> mutate(TRTA = ARM),
id = USUBJID,
overall_row = TRUE
) |>
add_overall()
# Example 1 - Descending Frequency Sort ------------------
sort_hierarchical(tbl)
# Example 2 - Alphanumeric Sort --------------------------
sort_hierarchical(tbl, sort = "alphanumeric")
reset_gtsummary_theme()
Style numbers
Description
Style numbers
Usage
style_number(
x,
digits = 0,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
scale = 1,
prefix = "",
suffix = "",
na = NA_character_,
...
)
Arguments
x |
( |
digits |
(non-negative |
big.mark |
( |
decimal.mark |
( |
scale |
(scalar |
prefix |
( |
suffix |
( |
na |
( |
... |
Arguments passed on to |
Value
formatted character vector
Examples
c(0.111, 12.3) |> style_number(digits = 1)
c(0.111, 12.3) |> style_number(digits = c(1, 0))
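A further sketch of the scale, suffix, big.mark, and decimal.mark arguments from the Usage above. The marks are passed explicitly so the result does not depend on the session's OutDec option; the input values are arbitrary.

```r
library(gtsummary)

# Thousands separator and one decimal place
with_marks <- style_number(1234.567, digits = 1, big.mark = ",", decimal.mark = ".")

# Rescale a proportion to a percentage and append a suffix
as_pct <- style_number(0.256, scale = 100, digits = 1, suffix = "%", decimal.mark = ".")

c(with_marks, as_pct)
```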
Style percentages
Description
Style percentages
Usage
style_percent(
x,
digits = 0,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
prefix = "",
suffix = "",
symbol,
na = NA_character_,
...
)
Arguments
x |
numeric vector of percentages |
digits |
number of digits to round large percentages (i.e. greater than 10%).
Smaller percentages are rounded to |
big.mark |
( |
decimal.mark |
( |
prefix |
( |
suffix |
( |
symbol |
Logical indicator to include percent symbol in output.
Default is |
na |
( |
... |
Arguments passed on to |
Value
A character vector of styled percentages
Author(s)
Daniel D. Sjoberg
Examples
percent_vals <- c(-1, 0, 0.0001, 0.005, 0.01, 0.10, 0.45356, 0.99, 1.45)
style_percent(percent_vals)
style_percent(percent_vals, suffix = "%", digits = 1)
Style p-values
Description
Style p-values
Usage
style_pvalue(
x,
digits = 1,
prepend_p = FALSE,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
na = NA_character_,
...
)
Arguments
x |
( |
digits |
( |
prepend_p |
(scalar |
big.mark |
( |
decimal.mark |
( |
na |
( |
... |
Arguments passed on to |
Value
A character vector of styled p-values
Author(s)
Daniel D. Sjoberg
Examples
pvals <- c(
1.5, 1, 0.999, 0.5, 0.25, 0.2, 0.197, 0.12, 0.10, 0.0999, 0.06,
0.03, 0.002, 0.001, 0.00099, 0.0002, 0.00002, -1
)
style_pvalue(pvals)
style_pvalue(pvals, digits = 2, prepend_p = TRUE)
Style ratios
Description
When reporting ratios, such as relative risk or an odds ratio, we'll often
want the rounding to be similar on each side of the number 1. For example,
if we report an odds ratio of 0.95 with a confidence interval of 0.70 to 1.24,
we would want to round to two decimal places for all values. In other words,
2 significant figures for numbers less than 1 and 3 significant figures for numbers 1 and
larger. style_ratio() performs significant figure-like rounding in this manner.
Usage
style_ratio(
x,
digits = 2,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
prefix = "",
suffix = "",
na = NA_character_,
...
)
Arguments
x |
( |
digits |
( |
big.mark |
( |
decimal.mark |
( |
prefix |
( |
suffix |
( |
na |
( |
... |
Arguments passed on to |
Value
A character vector of styled ratios
Author(s)
Daniel D. Sjoberg
Examples
c(0.123, 0.9, 1.1234, 12.345, 101.234, -0.123, -0.9, -1.1234, -12.345, -101.234) |>
style_ratio()
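The odds ratio scenario from the Description can be checked directly. This short sketch formats the estimate and confidence limits quoted there, plus one larger value to show that the number of decimals drops as the magnitude grows; it assumes the default decimal mark ".".

```r
library(gtsummary)

# Odds ratio 0.95 with confidence limits 0.70 and 1.24, plus a larger ratio
ratios <- c(0.95, 0.70, 1.24, 12.345)
styled <- style_ratio(ratios)
styled
```

With the default digits = 2, the first three values keep two decimal places and the last one keeps a single decimal place.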
Style significant figure-like rounding
Description
Converts a numeric argument into a string that has been rounded to a significant figure-like number. Scientific notation output is avoided, however, and additional significant figures may be displayed for large numbers. For example, if the number of significant digits requested is 2, 123 will be displayed (rather than 120 or 1.2x10^2).
Usage
style_sigfig(
x,
digits = 2,
scale = 1,
big.mark = ifelse(decimal.mark == ",", " ", ","),
decimal.mark = getOption("OutDec"),
prefix = "",
suffix = "",
na = NA_character_,
...
)
Arguments
x |
Numeric vector |
digits |
Integer specifying the minimum number of significant digits to display |
scale |
(scalar |
big.mark |
( |
decimal.mark |
( |
prefix |
( |
suffix |
( |
na |
( |
... |
Arguments passed on to |
Value
A character vector of styled numbers
Details
- Scientific notation output is avoided.
- If 2 significant figures are requested, the number is rounded to no more than 2 decimal places. For example, a number will be rounded to 2 decimal places when abs(x) < 1, 1 decimal place when abs(x) >= 1 & abs(x) < 10, and to the nearest integer when abs(x) >= 10.
- Additional significant figures may be displayed for large numbers. For example, if the number of significant digits requested is 2, 123 will be displayed (rather than 120 or 1.2x10^2).
Author(s)
Daniel D. Sjoberg
See Also
Other style tools:
label_style
Examples
c(0.123, 0.9, 1.1234, 12.345, -0.123, -0.9, -1.1234, -132.345, NA, -0.001) %>%
style_sigfig()
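The rounding rules given in Details can be seen one range at a time. The sketch below uses one arbitrary value per range with the default digits = 2, and assumes the default decimal mark ".".

```r
library(gtsummary)

# One value per range: below 1, between 1 and 10, between 10 and 100, above 100
x <- c(0.123, 1.1234, 12.345, 123)
styled <- style_sigfig(x, digits = 2)
styled
```

The last value shows the deliberate departure from strict significant figures: 123 is returned in full rather than as 120.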
Syntax and Notation
Description
Syntax and Notation
Selectors
The gtsummary package also utilizes selectors: selectors from the tidyselect package and custom selectors. Review their help files for details.
- tidy selectors: everything(), all_of(), any_of(), starts_with(), ends_with(), contains(), matches(), num_range(), last_col()
- gtsummary selectors: all_continuous(), all_categorical(), all_dichotomous(), all_continuous2(), all_tests(), all_stat_cols(), all_interaction(), all_intercepts(), all_contrasts()
Formula and List Selectors
Many arguments throughout the gtsummary package accept list and
formula notation, e.g. tbl_summary(statistic=). Below are a few
tips and shortcuts for using lists and formulas.
- List of Formulas
  Typical usage includes a list of formulas, where the LHS is a variable name or a selector.
  tbl_summary(statistic = list(age ~ "{mean}", all_categorical() ~ "{n}"))
- Named List
  You may also pass a named list; however, the tidyselect and gtsummary selectors are not supported with this syntax.
  tbl_summary(statistic = list(age = "{mean}", response = "{n}"))
- Hybrid Named List/List of Formulas
  Pass a combination of formulas and named elements.
  tbl_summary(statistic = list(age = "{mean}", all_categorical() ~ "{n}"))
- Shortcuts
  You can pass a single formula, which is equivalent to passing the formula in a list.
  tbl_summary(statistic = all_categorical() ~ "{n}")
  As a shortcut to select all variables, you can omit the LHS of the formula. The two calls below are equivalent.
  tbl_summary(statistic = ~"{n}")
  tbl_summary(statistic = everything() ~ "{n}")
- Combination Selectors
  Selectors can be combined using the c() function.
  tbl_summary(statistic = c(everything(), -grade) ~ "{n}")
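The calls above omit the data argument for brevity. As a runnable sketch on the trial data set, the snippet below builds the same table once with a list of formulas and once with a named list, then compares the formatted statistics. The choice of the variables age and response is only for illustration.

```r
library(gtsummary)

# List of formulas: the LHS is a variable name (or a selector)
tbl_formula <- tbl_summary(
  trial,
  include = c(age, response),
  statistic = list(age ~ "{mean}", response ~ "{n}")
)

# Named list: names are variable names, selectors are not supported
tbl_named <- tbl_summary(
  trial,
  include = c(age, response),
  statistic = list(age = "{mean}", response = "{n}")
)

# Both notations produce the same formatted statistics
same_stats <- identical(tbl_formula$table_body$stat_0, tbl_named$table_body$stat_0)
same_stats
```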
Summarize continuous variable
Description
Summarize a continuous variable by one or more categorical variables
Usage
tbl_ard_continuous(
cards,
variable,
include,
by = NULL,
label = NULL,
statistic = everything() ~ "{median} ({p25}, {p75})",
value = NULL
)
Arguments
cards |
( |
variable |
( |
include |
( |
by |
( |
label |
( |
statistic |
( |
value |
( |
Value
a gtsummary table of class "tbl_ard_summary"
Examples
library(cards)
# Example 1 ----------------------------------
# the primary ARD with the results
ard_continuous(
# the order variables are passed is important for the `by` variable.
# 'trt' is the column stratifying variable and needs to be listed first.
trial, by = c(trt, grade), variables = age
) |>
# adding OPTIONAL information about the summary variables
bind_ard(
# add univariate trt tabulation
ard_categorical(trial, variables = trt),
# add missing and attributes ARD
ard_missing(trial, by = c(trt, grade), variables = age),
ard_attributes(trial, variables = c(trt, grade, age))
) |>
tbl_ard_continuous(by = "trt", variable = "age", include = "grade")
# Example 2 ----------------------------------
# the primary ARD with the results
ard_continuous(trial, by = grade, variables = age) |>
# adding OPTIONAL information about the summary variables
bind_ard(
# add missing and attributes ARD
ard_missing(trial, by = grade, variables = age),
ard_attributes(trial, variables = c(grade, age))
) |>
tbl_ard_continuous(variable = "age", include = "grade")
ARD Hierarchical Table
Description
This is a preview of this function. There will be changes in the coming releases, and changes will not undergo a formal deprecation cycle.
Constructs tables from nested or hierarchical data structures (e.g. adverse events).
Usage
tbl_ard_hierarchical(
cards,
variables,
by = NULL,
include = everything(),
statistic = ~"{n} ({p}%)",
label = NULL
)
Arguments
cards |
( |
variables |
( |
by |
( |
include |
( |
statistic |
( |
label |
( |
Value
a gtsummary table of class "tbl_ard_hierarchical"
Examples
ADAE_subset <- cards::ADAE |>
dplyr::filter(
AESOC %in% unique(cards::ADAE$AESOC)[1:5],
AETERM %in% unique(cards::ADAE$AETERM)[1:5]
)
# Example 1: Event Rates --------------------
# First, build the ARD
ard <-
cards::ard_stack_hierarchical(
data = ADAE_subset,
variables = c(AESOC, AETERM),
by = TRTA,
denominator = cards::ADSL |> mutate(TRTA = ARM),
id = USUBJID
)
# Second, build table from the ARD
tbl_ard_hierarchical(
cards = ard,
variables = c(AESOC, AETERM),
by = TRTA
)
# Example 2: Event Counts -------------------
ard <-
cards::ard_stack_hierarchical_count(
data = ADAE_subset,
variables = c(AESOC, AETERM),
by = TRTA,
denominator = cards::ADSL |> mutate(TRTA = ARM)
)
tbl_ard_hierarchical(
cards = ard,
variables = c(AESOC, AETERM),
by = TRTA,
statistic = ~"{n}"
)
ARD summary table
Description
The tbl_ard_summary() function tables descriptive statistics for
continuous, categorical, and dichotomous variables.
The function accepts an ARD object.
Usage
tbl_ard_summary(
cards,
by = NULL,
statistic = list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~
"{n} ({p}%)"),
type = NULL,
label = NULL,
missing = c("no", "ifany", "always"),
missing_text = "Unknown",
missing_stat = "{N_miss}",
include = everything(),
overall = FALSE
)
Arguments
cards |
( |
by |
( |
statistic |
( |
type |
( |
label |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
include |
( |
overall |
(scalar |
Details
There are three types of additional data that can be included in the ARD to improve the default appearance of the table.
- Attributes: When attributes are included, the default labels will be the variable labels, when available. Attributes can be included in an ARD with cards::ard_attributes() or ard_stack(.attributes = TRUE).
- Missing: When missing results are included, users can include missing counts or rates for variables with tbl_ard_summary(missing = c("ifany", "always")). The missing statistics can be included in an ARD with cards::ard_missing() or ard_stack(.missing = TRUE).
- Total N: The total N is saved internally when available, and it can be calculated with cards::ard_total_n() or ard_stack(.total_n = TRUE).
Value
a gtsummary table of class "tbl_ard_summary"
Examples
library(cards)
ard_stack(
data = ADSL,
ard_categorical(variables = "AGEGR1"),
ard_continuous(variables = "AGE"),
.attributes = TRUE,
.missing = TRUE,
.total_n = TRUE
) |>
tbl_ard_summary()
ard_stack(
data = ADSL,
.by = ARM,
ard_categorical(variables = "AGEGR1"),
ard_continuous(variables = "AGE"),
.attributes = TRUE,
.missing = TRUE,
.total_n = TRUE
) |>
tbl_ard_summary(by = ARM)
ard_stack(
data = ADSL,
.by = ARM,
ard_categorical(variables = "AGEGR1"),
ard_continuous(variables = "AGE"),
.attributes = TRUE,
.missing = TRUE,
.total_n = TRUE,
.overall = TRUE
) |>
tbl_ard_summary(by = ARM, overall = TRUE)
Wide ARD summary table
Description
This function is similar to tbl_ard_summary()
, but places summary statistics
wide, in separate columns.
All included variables must be of the same summary type, e.g. all continuous
summaries or all categorical summaries (which encompasses dichotomous variables).
Usage
tbl_ard_wide_summary(
cards,
statistic = switch(type[[1]], continuous = c("{median}", "{p25}, {p75}"), c("{n}",
"{p}%")),
type = NULL,
label = NULL,
value = NULL,
include = everything()
)
Arguments
cards |
( |
statistic |
( |
type |
( |
label |
( |
value |
( |
include |
( |
Value
a gtsummary table of class 'tbl_wide_summary'
Examples
library(cards)
ard_stack(
trial,
ard_continuous(variables = age),
.missing = TRUE,
.attributes = TRUE,
.total_n = TRUE
) |>
tbl_ard_wide_summary()
ard_stack(
trial,
ard_dichotomous(variables = response),
ard_categorical(variables = grade),
.missing = TRUE,
.attributes = TRUE,
.total_n = TRUE
) |>
tbl_ard_wide_summary()
Butcher table
Description
Some gtsummary objects can become large, and their size makes them cumbersome
to work with.
The function removes all elements from a gtsummary object, except those
required to print the table. As a result, gtsummary functions
that add information or modify the table, such as add_global_p(),
may no longer execute
after the excess elements have been removed (aka butchered). Of note,
the majority of inline_text() calls will continue to execute
properly.
Usage
tbl_butcher(x, include = c("table_body", "table_styling"))
Arguments
x |
( |
include |
( |
Value
a gtsummary object
Examples
tbl_large <-
trial |>
tbl_uvregression(
y = age,
method = lm
)
tbl_butchered <-
tbl_large |>
tbl_butcher()
# size comparison
object.size(tbl_large) |> format(units = "Mb")
object.size(tbl_butchered) |> format(units = "Mb")
Summarize continuous variable
Description
Summarize a continuous variable by one or more categorical variables
Usage
tbl_continuous(
data,
variable,
include = everything(),
digits = NULL,
by = NULL,
statistic = everything() ~ "{median} ({p25}, {p75})",
label = NULL,
value = NULL
)
Arguments
data |
( |
variable |
( |
include |
( |
digits |
( |
by |
( |
statistic |
( |
label |
( |
value |
( |
Value
a gtsummary table
Examples
# Example 1 ----------------------------------
tbl_continuous(
data = trial,
variable = age,
by = trt,
include = grade
)
# Example 2 ----------------------------------
trial |>
dplyr::mutate(all_subjects = 1) |>
tbl_continuous(
variable = age,
statistic = ~"{mean} ({sd})",
by = trt,
include = c(all_subjects, stage, grade),
value = all_subjects ~ 1,
label = list(all_subjects = "All Subjects")
)
Cross table
Description
The function creates a cross table of categorical variables.
Usage
tbl_cross(
data,
row = 1L,
col = 2L,
label = NULL,
statistic = ifelse(percent == "none", "{n}", "{n} ({p}%)"),
digits = NULL,
percent = c("none", "column", "row", "cell"),
margin = c("column", "row"),
missing = c("ifany", "always", "no"),
missing_text = "Unknown",
margin_text = "Total"
)
Arguments
data |
( |
row |
( |
col |
( |
label |
( |
statistic |
( |
digits |
( |
percent |
( |
margin |
( |
missing |
( |
missing_text |
( |
margin_text |
( |
Value
A tbl_cross
object
Author(s)
Karissa Whiting, Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial |>
tbl_cross(row = trt, col = response) |>
bold_labels()
# Example 2 ----------------------------------
trial |>
tbl_cross(row = stage, col = trt, percent = "cell") |>
add_p() |>
bold_labels()
Create a table of summary statistics using a custom summary function
Description
The tbl_custom_summary() function calculates descriptive statistics for
continuous, categorical, and dichotomous variables.
This function is similar to tbl_summary() but allows you to provide
a custom function in charge of computing the statistics (see Details).
Usage
tbl_custom_summary(
data,
by = NULL,
label = NULL,
stat_fns,
statistic,
digits = NULL,
type = NULL,
value = NULL,
missing = c("ifany", "no", "always"),
missing_text = "Unknown",
missing_stat = "{N_miss}",
include = everything(),
overall_row = FALSE,
overall_row_last = FALSE,
overall_row_label = "Overall"
)
Arguments
data |
( |
by |
( |
label |
( |
stat_fns |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
include |
( |
overall_row |
(scalar |
overall_row_last |
(scalar |
overall_row_label |
( |
Value
A tbl_custom_summary
object
Similarities with tbl_summary()
Please refer to the help file of tbl_summary() regarding the use of select
helpers, and arguments include, by, type, value, digits, missing and
missing_text.
stat_fns argument
The stat_fns argument specifies the custom function(s) used to compute
the summary statistics. For example, stat_fns = everything() ~ foo.
Each function may take the following arguments:
foo(data, full_data, variable, by, type, ...)
- data= is the input data frame passed to tbl_custom_summary(), subset according to the level of by or variable, if any, excluding NA values of the current variable
- full_data= is the full input data frame passed to tbl_custom_summary()
- variable= is a string indicating the variable to perform the calculation on
- by= is a string indicating the by variable from tbl_custom_summary(), if present
- type= is a string indicating the type of variable (continuous, categorical, ...)
- stat_display= is a string indicating the statistic to display (for the statistic argument, for that variable)
The user-defined function does not need to use each of these inputs. It is
recommended that the user-defined function accept ..., because every argument
is passed to the function even if the function does not use it,
e.g. foo(data, ...) (see examples).
The user-defined function should return a one-row dplyr::tibble() with
one column per summary statistic (see examples).
statistic argument
The statistic argument specifies the statistics presented in the table. The
input is a list of formulas that specify the statistics to report. For example,
statistic = list(age ~ "{mean} ({sd})").
A statistic name that appears between curly brackets
will be replaced with the numeric statistic (see glue::glue()).
All the statistics indicated in the statistic argument should be returned
by the functions defined in the stat_fns argument.
When the summary type is "continuous2", pass a vector of statistics. Each element
of the vector will result in a separate row in the summary table.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are also available to display.
- {N_obs} total number of observations
- {N_miss} number of missing observations
- {N_nonmiss} number of non-missing observations
- {p_miss} percentage of observations missing
- {p_nonmiss} percentage of observations not missing
Note that for categorical variables, {N_obs}, {N_miss}, and {N_nonmiss} refer
to the total number, number missing, and number of non-missing observations
in the denominator, not at each level of the categorical variable.
It is recommended to use modify_footnote_header() to properly describe the
displayed statistics (see examples).
Caution
The returned table is compatible with all gtsummary features applicable
to a tbl_summary object, like add_overall(), modify_footnote_header(), or
bold_labels().
However, some of them could be inappropriate in this case. In particular,
add_p() does not take into account the type of displayed statistics and
always returns the p-value of a comparison test of the current variable
according to the by groups, which may be incorrect if the displayed
statistics refer to a third variable.
Author(s)
Joseph Larmarange
Examples
# Example 1 ----------------------------------
my_stats <- function(data, ...) {
marker_sum <- sum(data$marker, na.rm = TRUE)
mean_age <- mean(data$age, na.rm = TRUE)
dplyr::tibble(
marker_sum = marker_sum,
mean_age = mean_age
)
}
my_stats(trial)
trial |>
tbl_custom_summary(
include = c("stage", "grade"),
by = "trt",
stat_fns = everything() ~ my_stats,
statistic = everything() ~ "A: {mean_age} - S: {marker_sum}",
digits = everything() ~ c(1, 0),
overall_row = TRUE,
overall_row_label = "All stages & grades"
) |>
add_overall(last = TRUE) |>
modify_footnote_header(
footnote = "A: mean age - S: sum of marker",
columns = all_stat_cols()
) |>
bold_labels()
# Example 2 ----------------------------------
# Use `data[[variable]]` to access the current variable
mean_ci <- function(data, variable, ...) {
test <- t.test(data[[variable]])
dplyr::tibble(
mean = test$estimate,
conf.low = test$conf.int[1],
conf.high = test$conf.int[2]
)
}
trial |>
tbl_custom_summary(
include = c("marker", "ttdeath"),
by = "trt",
stat_fns = ~ mean_ci,
statistic = ~ "{mean} [{conf.low}; {conf.high}]"
) |>
add_overall(last = TRUE) |>
modify_footnote_header(
footnote = "mean [95% CI]",
columns = all_stat_cols()
)
# Example 3 ----------------------------------
# Use `full_data` to access the full datasets
# Returned statistic can also be a character
diff_to_great_mean <- function(data, full_data, ...) {
mean <- mean(data$marker, na.rm = TRUE)
great_mean <- mean(full_data$marker, na.rm = TRUE)
diff <- mean - great_mean
dplyr::tibble(
mean = mean,
great_mean = great_mean,
diff = diff,
level = ifelse(diff > 0, "high", "low")
)
}
trial |>
tbl_custom_summary(
include = c("grade", "stage"),
by = "trt",
stat_fns = ~ diff_to_great_mean,
statistic = ~ "{mean} ({level}, diff: {diff})",
overall_row = TRUE
) |>
bold_labels()
Hierarchical Table
Description
Use these functions to generate hierarchical tables.
- tbl_hierarchical(): Calculates rates of events (e.g. adverse events) utilizing the denominator and id arguments to identify the rows in data to include in each rate calculation. If variables contains more than one variable and the last variable in variables is an ordered factor, then rates of events by highest level will be calculated.
- tbl_hierarchical_count(): Calculates counts of events utilizing all rows for each tabulation.
Usage
tbl_hierarchical(
data,
variables,
id,
denominator,
by = NULL,
include = everything(),
statistic = everything() ~ "{n} ({p}%)",
overall_row = FALSE,
label = NULL,
digits = NULL
)
tbl_hierarchical_count(
data,
variables,
denominator = NULL,
by = NULL,
include = everything(),
overall_row = FALSE,
statistic = everything() ~ "{n}",
label = NULL,
digits = NULL
)
Arguments
data |
( |
variables |
( |
id |
( |
denominator |
( |
by |
( |
include |
( |
statistic |
( |
overall_row |
(scalar |
label |
( |
digits |
( |
Value
A gtsummary table of class "tbl_hierarchical" (for tbl_hierarchical()) or
"tbl_hierarchical_count" (for tbl_hierarchical_count()).
Overall Row
An overall row can be added to the table as the first row by specifying overall_row = TRUE.
Assuming that each row in data corresponds to one event record, this row will count the
overall number of events recorded when used in tbl_hierarchical_count(), or the overall
number of patients recorded with any event when used in tbl_hierarchical().
A label for this overall row can be specified by passing an '..ard_hierarchical_overall..'
element in label.
Similarly, the rounding for statistics in the overall row can be modified using the digits
argument, again referencing the '..ard_hierarchical_overall..' name.
Examples
ADAE_subset <- cards::ADAE |>
dplyr::filter(
AESOC %in% unique(cards::ADAE$AESOC)[1:5],
AETERM %in% unique(cards::ADAE$AETERM)[1:5]
)
# Example 1 - Event Rates --------------------
tbl_hierarchical(
data = ADAE_subset,
variables = c(AESOC, AETERM),
by = TRTA,
denominator = cards::ADSL |> mutate(TRTA = ARM),
id = USUBJID,
digits = everything() ~ list(p = 1),
overall_row = TRUE,
label = list(..ard_hierarchical_overall.. = "Any Adverse Event")
)
# Example 2 - Rates by Highest Severity ------
tbl_hierarchical(
data = ADAE_subset |> mutate(AESEV = factor(AESEV, ordered = TRUE)),
variables = c(AESOC, AESEV),
by = TRTA,
id = USUBJID,
denominator = cards::ADSL |> mutate(TRTA = ARM),
include = AESEV,
label = list(AESEV = "Highest Severity")
)
# Example 3 - Event Counts -------------------
tbl_hierarchical_count(
data = ADAE_subset,
variables = c(AESOC, AETERM, AESEV),
by = TRTA,
overall_row = TRUE,
label = list(..ard_hierarchical_overall.. = "Total Number of AEs")
)
Likert Summary
Description
Create a table of ordered categorical variables in a wide format.
Usage
tbl_likert(
data,
statistic = ~"{n} ({p}%)",
label = NULL,
digits = NULL,
include = everything(),
sort = c("ascending", "descending")
)
Arguments
data |
( |
statistic |
( |
label |
( |
digits |
( |
include |
( |
sort |
( |
Value
a 'tbl_likert' gtsummary table
Examples
levels <- c("Strongly Disagree", "Disagree", "Agree", "Strongly Agree")
df_likert <- data.frame(
recommend_friend = sample(levels, size = 20, replace = TRUE) |> factor(levels = levels),
regret_purchase = sample(levels, size = 20, replace = TRUE) |> factor(levels = levels)
)
# Example 1 ----------------------------------
tbl_likert_ex1 <-
df_likert |>
tbl_likert(include = c(recommend_friend, regret_purchase)) |>
add_n()
tbl_likert_ex1
# Example 2 ----------------------------------
# Add continuous summary of the likert scores
list(
tbl_likert_ex1,
tbl_wide_summary(
df_likert |> dplyr::mutate(dplyr::across(everything(), as.numeric)),
statistic = c("{mean}", "{sd}"),
type = ~"continuous",
include = c(recommend_friend, regret_purchase)
)
) |>
tbl_merge(tab_spanner = FALSE)
Merge tables
Description
Merge gtsummary tables, e.g. tbl_regression, tbl_uvregression, tbl_stack,
tbl_summary, tbl_svysummary, etc.
This function merges like tables.
Generally, this means each of the tables being merged
should have the same structure.
When merging tables with different structures, rows may appear
out of order.
The ordering of rows can be updated with modify_table_body(~dplyr::arrange(.x, ...)).
Usage
tbl_merge(tbls, tab_spanner = NULL, merge_vars = NULL, tbl_ids = NULL)
Arguments
tbls |
( |
tab_spanner |
( |
merge_vars |
( |
tbl_ids |
( |
Value
A 'tbl_merge' object
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
# Side-by-side Regression Models
library(survival)
t1 <-
glm(response ~ trt + grade + age, trial, family = binomial) %>%
tbl_regression(exponentiate = TRUE)
t2 <-
coxph(Surv(ttdeath, death) ~ trt + grade + age, trial) %>%
tbl_regression(exponentiate = TRUE)
tbl_merge(
tbls = list(t1, t2),
tab_spanner = c("**Tumor Response**", "**Time to Death**")
)
# Example 2 ----------------------------------
# Descriptive statistics alongside univariate regression, with no spanning header
t3 <-
trial[c("age", "grade", "response")] %>%
tbl_summary(missing = "no") %>%
add_n() %>%
modify_header(stat_0 ~ "**Summary Statistics**")
t4 <-
tbl_uvregression(
trial[c("ttdeath", "death", "age", "grade", "response")],
method = coxph,
y = Surv(ttdeath, death),
exponentiate = TRUE,
hide_n = TRUE
)
tbl_merge(tbls = list(t3, t4)) %>%
modify_spanning_header(everything() ~ NA_character_)
Regression model summary
Description
This function takes a regression model object and returns a formatted table
that is publication-ready. The function is customizable
allowing the user to create bespoke regression model summary tables.
Review the tbl_regression() vignette for detailed examples.
Usage
tbl_regression(x, ...)
## Default S3 method:
tbl_regression(
x,
label = NULL,
exponentiate = FALSE,
include = everything(),
show_single_row = NULL,
conf.level = 0.95,
intercept = FALSE,
estimate_fun = ifelse(exponentiate, label_style_ratio(), label_style_sigfig()),
pvalue_fun = label_style_pvalue(digits = 1),
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
add_estimate_to_reference_rows = FALSE,
conf.int = TRUE,
...
)
Arguments
x |
(regression model) |
... |
Additional arguments passed to |
label |
( |
exponentiate |
(scalar |
include |
( |
show_single_row |
( |
conf.level |
(scalar |
intercept |
(scalar |
estimate_fun |
( |
pvalue_fun |
( |
tidy_fun |
( |
add_estimate_to_reference_rows |
(scalar |
conf.int |
(scalar |
Value
A tbl_regression object
Methods
The default method for tbl_regression() model summary uses broom::tidy(x)
to perform the initial tidying of the model object. There are, however,
a few models that use modifications.
- "parsnip/workflows": If the model was prepared using parsnip/workflows, the original model fit is extracted and the original x= argument is replaced with the model fit. This will typically go unnoticed; however, if you've provided a custom tidier in tidy_fun=, the tidier will be applied to the model fit object and not the parsnip/workflows object.
- "survreg": The scale parameter is removed, broom::tidy(x) %>% dplyr::filter(term != "Log(scale)")
- "multinom": This multinomial outcome is complex, with one line per covariate per outcome (less the reference group)
- "gam": Uses the internal tidier tidy_gam() to print both parametric and smooth terms.
- "lmerMod", "glmerMod", "glmmTMB", "glmmadmb", "stanreg", "brmsfit": These mixed effects models use broom.mixed::tidy(x, effects = "fixed"). Specify tidy_fun = broom.mixed::tidy to print the random components.
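As a hedged illustration of the "survreg" item above, the sketch below fits a parametric survival model on the trial data and checks that the Log(scale) term reported by broom::tidy() does not appear in the gtsummary table. It assumes the survival, broom, and broom.helpers packages are installed; the model formula is illustrative only.

```r
library(gtsummary)
library(survival)

# Illustrative model; any survreg() fit would do
mod <- survreg(Surv(ttdeath, death) ~ age + trt, data = trial)

# broom::tidy() reports the scale parameter as its own term
scale_in_tidy <- "Log(scale)" %in% broom::tidy(mod)$term

# The survreg method of tbl_regression() filters that term out
tbl <- tbl_regression(mod)
scale_in_table <- "Log(scale)" %in% tbl$table_body$term

scale_in_tidy
scale_in_table
```

To keep the scale parameter in the table, a tidier that does not filter it (for example tidy_fun = broom::tidy) could be supplied instead.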
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
glm(response ~ age + grade, trial, family = binomial()) |>
tbl_regression(exponentiate = TRUE)
Methods for tbl_regression
Description
Most regression models are handled by tbl_regression(),
which uses broom::tidy() to perform initial tidying of results. There are,
however, some model types that have modified default printing behavior.
Those methods are listed below.
Usage
## S3 method for class 'model_fit'
tbl_regression(x, ...)
## S3 method for class 'workflow'
tbl_regression(x, ...)
## S3 method for class 'survreg'
tbl_regression(
x,
tidy_fun = function(x, ...) dplyr::filter(broom::tidy(x, ...), .data$term !=
"Log(scale)"),
...
)
## S3 method for class 'mira'
tbl_regression(x, tidy_fun = pool_and_tidy_mice, ...)
## S3 method for class 'mipo'
tbl_regression(x, ...)
## S3 method for class 'lmerMod'
tbl_regression(
x,
tidy_fun = function(x, ...) broom.mixed::tidy(x, ..., effects = "fixed"),
...
)
## S3 method for class 'glmerMod'
tbl_regression(
x,
tidy_fun = function(x, ...) broom.mixed::tidy(x, ..., effects = "fixed"),
...
)
## S3 method for class 'glmmTMB'
tbl_regression(
x,
tidy_fun = function(x, ...) broom.mixed::tidy(x, ..., effects = "fixed"),
...
)
## S3 method for class 'glmmadmb'
tbl_regression(
x,
tidy_fun = function(x, ...) broom.mixed::tidy(x, ..., effects = "fixed"),
...
)
## S3 method for class 'stanreg'
tbl_regression(
x,
tidy_fun = function(x, ...) broom.mixed::tidy(x, ..., effects = "fixed"),
...
)
## S3 method for class 'brmsfit'
tbl_regression(
x,
tidy_fun = function(x, ...) broom.mixed::tidy(x, ..., effects = "fixed"),
...
)
## S3 method for class 'gam'
tbl_regression(x, tidy_fun = tidy_gam, ...)
## S3 method for class 'crr'
tbl_regression(x, ...)
Arguments
x |
(regression model) |
... |
arguments passed to |
tidy_fun |
( |
Methods
The default method for tbl_regression() model summary uses broom::tidy(x)
to perform the initial tidying of the model object. There are, however,
a few models that use modifications.
- "parsnip/workflows": If the model was prepared using parsnip/workflows, the original model fit is extracted and the original x= argument is replaced with the model fit. This will typically go unnoticed; however, if you've provided a custom tidier in tidy_fun=, the tidier will be applied to the model fit object and not the parsnip/workflows object.
- "survreg": The scale parameter is removed, broom::tidy(x) %>% dplyr::filter(term != "Log(scale)")
- "multinom": This multinomial outcome is complex, with one line per covariate per outcome (less the reference group)
- "gam": Uses the internal tidier tidy_gam() to print both parametric and smooth terms.
- "lmerMod", "glmerMod", "glmmTMB", "glmmadmb", "stanreg", "brmsfit": These mixed effects models use broom.mixed::tidy(x, effects = "fixed"). Specify tidy_fun = broom.mixed::tidy to print the random components.
Split gtsummary table by rows and/or columns
Description
The tbl_split_by_rows() and tbl_split_by_columns() functions split a single
gtsummary table into multiple tables.
Both column-wise splitting (that is, splits by columns in
x$table_body) and row-wise splitting are possible.
Usage
tbl_split_by_rows(
x,
variables = NULL,
row_numbers = NULL,
footnotes = c("all", "first", "last"),
caption = c("all", "first", "last")
)
tbl_split_by_columns(
x,
keys,
groups,
footnotes = c("all", "first", "last"),
caption = c("all", "first", "last")
)
## S3 method for class 'tbl_split'
print(x, ...)
Arguments
x |
( |
variables , row_numbers |
( |
footnotes , caption |
( |
keys |
( |
groups |
(list of |
... |
These dots are for future extensions and must be empty. |
Details
Run show_header_names() to print all column names to split by.
Footnotes and caption handling are experimental and may change in the future.
row_numbers indicates the row numbers at which to split the table: the table
is split after each of these row numbers. If the last row is selected, no
split happens there, because the split would fall after the last row.
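The row_numbers behaviour described above can be sketched as follows. This is a minimal illustration on the trial data, assuming the default tbl_summary() table has more than seven rows; splitting after rows 5 and 7 should then give three tables.

```r
library(gtsummary)

tbl <- tbl_summary(trial, by = trt)

# Split after row 5 and after row 7:
# rows 1-5, rows 6-7, and the remaining rows
splits <- tbl_split_by_rows(tbl, row_numbers = c(5, 7))

length(splits)
```

Each element of the result is itself a gtsummary table and can be printed or modified individually, e.g. splits[[1]].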
Value
A tbl_split object. If multiple splits are performed (e.g., both by
rows and by columns), the output is returned as a single-level list.
Examples
# Example 1 ----------------------------------
# Split by rows
trial |>
tbl_summary(by = trt) |>
tbl_split_by_rows(variables = c(marker, grade)) |>
tail(n = 1) # Print only last table for simplicity
# Example 2 ----------------------------------
# Split by rows with row numbers
trial |>
tbl_summary(by = trt) |>
tbl_split_by_rows(row_numbers = c(5, 7)) |>
tail(n = 1) # Print only last table for simplicity
# Example 3 ----------------------------------
# Split by columns
trial |>
tbl_summary(by = trt, include = c(death, ttdeath)) |>
tbl_split_by_columns(groups = list("stat_1", "stat_2")) |>
tail(n = 1) # Print only last table for simplicity
# Example 4 ----------------------------------
# Both row and column splitting
trial |>
tbl_summary(by = trt) |>
tbl_split_by_rows(variables = c(marker, grade)) |>
tbl_split_by_columns(groups = list("stat_1", "stat_2")) |>
tail(n = 1) # Print only last table for simplicity
# Example 5 ------------------------------
# Split by rows with footnotes and caption
trial |>
tbl_summary(by = trt, missing = "no") |>
modify_footnote_header(
footnote = "All but four subjects received both treatments in a crossover design",
columns = all_stat_cols(),
replace = FALSE
) |>
modify_footnote_body(
footnote = "Tumor grade was assessed _before_ treatment began",
columns = "label",
rows = variable == "grade" & row_type == "label"
) |>
modify_spanning_header(
c(stat_1, stat_2) ~ "**TRT**"
) |>
modify_abbreviation("I = 1, II = 2, III = 3") |>
modify_caption("_Some caption_") |>
modify_footnote_spanning_header(
footnote = "Treatment",
columns = c(stat_1)
) |>
modify_source_note("Some source note!") |>
tbl_split_by_rows(variables = c(marker, stage, grade), footnotes = "last", caption = "first") |>
tail(n = 2) |>
head(n = 1) # Print only one but not last table for simplicity
Stack tables
Description
Assists in patching together more complex tables. tbl_stack() appends two
or more gtsummary tables.
Usage
tbl_stack(
tbls,
group_header = NULL,
quiet = FALSE,
attr_order = seq_along(tbls),
tbl_ids = NULL
)
Arguments
Value
A tbl_stack object
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
# stacking two tbl_regression objects
t1 <-
glm(response ~ trt, trial, family = binomial) %>%
tbl_regression(
exponentiate = TRUE,
label = list(trt ~ "Treatment (unadjusted)")
)
t2 <-
glm(response ~ trt + grade + stage + marker, trial, family = binomial) %>%
tbl_regression(
include = "trt",
exponentiate = TRUE,
label = list(trt ~ "Treatment (adjusted)")
)
tbl_stack(list(t1, t2))
# Example 2 ----------------------------------
# stacking two tbl_merge objects
library(survival)
t3 <-
coxph(Surv(ttdeath, death) ~ trt, trial) %>%
tbl_regression(
exponentiate = TRUE,
label = list(trt ~ "Treatment (unadjusted)")
)
t4 <-
coxph(Surv(ttdeath, death) ~ trt + grade + stage + marker, trial) %>%
tbl_regression(
include = "trt",
exponentiate = TRUE,
label = list(trt ~ "Treatment (adjusted)")
)
# first merging, then stacking
row1 <- tbl_merge(list(t1, t3), tab_spanner = c("Tumor Response", "Death"))
row2 <- tbl_merge(list(t2, t4))
tbl_stack(list(row1, row2), group_header = c("Unadjusted Analysis", "Adjusted Analysis"))
Stratified gtsummary tables
Description
Build a stratified gtsummary table. Any gtsummary table that accepts a data frame as its first argument can be stratified.
- In tbl_strata(), the stratified or subset data frame is passed to the function in .tbl_fun=, e.g. purrr::map(data, .tbl_fun).
- In tbl_strata2(), both the stratified data frame and the strata level are passed to .tbl_fun=, e.g. purrr::map2(data, strata, .tbl_fun).
When merging, keep in mind that merging works best with like tables.
See tbl_merge() for details.
Usage
tbl_strata(
data,
strata,
.tbl_fun,
...,
.sep = ", ",
.combine_with = c("tbl_merge", "tbl_stack"),
.combine_args = NULL,
.header = ifelse(.combine_with == "tbl_merge", "**{strata}**", "{strata}"),
.quiet = NULL
)
tbl_strata2(
data,
strata,
.tbl_fun,
...,
.sep = ", ",
.combine_with = c("tbl_merge", "tbl_stack"),
.combine_args = NULL,
.header = ifelse(.combine_with == "tbl_merge", "**{strata}**", "{strata}"),
.quiet = TRUE
)
Arguments
data |
( |
strata |
( |
.tbl_fun |
( |
... |
Additional arguments passed on to the |
.sep |
( |
.combine_with |
( |
.combine_args |
(named |
.header |
( The evaluated value of |
.quiet |
Tips
- tbl_summary()
  - The number of digits continuous variables are rounded to is determined separately within each stratum of the data frame. Set the digits= argument to ensure continuous variables are rounded to the same number of decimal places.
  - If some levels of a categorical variable are unobserved within a stratum, convert the variable to a factor to ensure all levels appear in each stratum's summary table.
  - The summary type for variables (e.g. continuous vs categorical vs dichotomous) is determined separately within each stratum. Use the tbl_summary(type) argument to assign a summary type consistent across all tables being combined.
  - By default, a "missing" row appears only when there are missing values. Use the tbl_summary(missing) argument to ensure there is always/never a missing row in the tables being combined.
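The tips above can be combined in a single call. The sketch below is one possible arrangement on the trial data, not a prescribed recipe: it fixes the summary types, the rounding for age, and the presence of the missing row so that every stratum produces a table with the same structure before merging. The choice of variables is illustrative.

```r
library(gtsummary)

tbl <- trial |>
  dplyr::select(age, response, stage, grade) |>
  dplyr::mutate(grade = paste("Grade", grade)) |>
  tbl_strata(
    strata = grade,
    .tbl_fun = ~ .x |>
      tbl_summary(
        # same summary type in every stratum
        type = list(age ~ "continuous", response ~ "dichotomous"),
        # same rounding in every stratum
        digits = list(age ~ 0),
        # always show the missing row, even if a stratum has no missing values
        missing = "always"
      )
  )

tbl
```

Here stage is already a factor in trial, so all of its levels appear in each stratum without further conversion.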
Author(s)
Daniel D. Sjoberg
Examples
# Example 1 ----------------------------------
trial |>
select(age, grade, stage, trt) |>
mutate(grade = paste("Grade", grade)) |>
tbl_strata(
strata = grade,
.tbl_fun =
~ .x |>
tbl_summary(by = trt, missing = "no") |>
add_n(),
.header = "**{strata}**, N = {n}"
)
# Example 2 ----------------------------------
trial |>
select(grade, response) |>
mutate(grade = paste("Grade", grade)) |>
tbl_strata2(
strata = grade,
.tbl_fun =
~ .x %>%
tbl_summary(
label = list(response = .y),
missing = "no",
statistic = response ~ "{p}%"
) |>
add_ci(pattern = "{stat} ({ci})") |>
modify_header(stat_0 = "**Rate (95% CI)**") |>
remove_footnote_header(stat_0),
.combine_with = "tbl_stack",
.combine_args = list(group_header = NULL)
) |>
modify_caption("**Response Rate by Grade**")
Stratified Nested Stacking
Description
This function stratifies your data frame, builds gtsummary tables, and
stacks the resulting tables in a nested style. The underlying functionality
is similar to tbl_strata(), except the resulting tables are nested or indented
within each group.
NOTE: The header from the first table is used for the final table. Oftentimes, this header will include incorrect Ns and must be updated.
Usage
tbl_strata_nested_stack(
data,
strata,
.tbl_fun,
...,
row_header = "{strata}",
quiet = FALSE
)
Arguments
data |
( |
strata |
( |
.tbl_fun |
( |
... |
Additional arguments passed on to the |
row_header |
( |
quiet |
(scalar |
Value
a stacked 'gtsummary' table
Examples
# Example 1 ----------------------------------
tbl_strata_nested_stack(
trial,
strata = trt,
.tbl_fun = ~ .x |>
tbl_summary(include = c(age, grade), missing = "no") |>
modify_header(all_stat_cols() ~ "**Summary Statistics**")
)
# Example 2 ----------------------------------
tbl_strata_nested_stack(
trial,
strata = trt,
.tbl_fun = ~ .x |>
tbl_summary(include = c(age, grade), missing = "no") |>
modify_header(all_stat_cols() ~ "**Summary Statistics**"),
row_header = "{strata}, n={n}"
) |>
# bold the row headers; print `x$table_body` to see hidden columns
modify_bold(columns = "label", rows = tbl_indent_id1 > 0)
Summary table
Description
The tbl_summary() function calculates descriptive statistics for
continuous, categorical, and dichotomous variables.
Review the tbl_summary vignette for detailed examples.
Usage
tbl_summary(
data,
by = NULL,
label = NULL,
statistic = list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~
"{n} ({p}%)"),
digits = NULL,
type = NULL,
value = NULL,
missing = c("ifany", "no", "always"),
missing_text = "Unknown",
missing_stat = "{N_miss}",
sort = all_categorical(FALSE) ~ "alphanumeric",
percent = c("column", "row", "cell"),
include = everything()
)
Arguments
data |
( |
by |
( |
label |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
sort |
( |
percent |
( In rarer cases, you may need to define/override the typical denominators.
In these cases, pass an integer or a data frame. Refer to the
|
include |
( |
Value
A gtsummary table of class c('tbl_summary', 'gtsummary')
statistic argument
The statistic argument specifies the summary statistics presented in the table. For example,
statistic = list(age ~ "{mean} ({sd})") would report the mean and
standard deviation for age; statistic = list(all_continuous() ~ "{mean} ({sd})")
would report the mean and standard deviation for all continuous variables.
The values are interpreted using glue::glue() syntax:
a name that appears between curly brackets will be interpreted as a function
name and the formatted result of that function will be placed in the table.
For categorical variables, the following statistics are available to display:
{n} (frequency), {N} (denominator), {p} (percent).
For continuous variables, any univariate function may be used.
The most commonly used functions are {median}, {mean}, {sd}, {min},
and {max}.
Additionally, {p##} is available for percentiles, where ## is an integer from 0 to 100.
For example, p25: quantile(probs=0.25, type=2).
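Because the percentile statistics are described in terms of quantile(type = 2), they can differ from base R's default quantile (type = 7) on small samples. The base R sketch below, using a made-up vector x, shows the difference; it does not call gtsummary.

```r
# Toy data (illustrative only)
x <- c(1, 2, 3, 4)

# type = 2: inverse empirical CDF, averaging at discontinuities
q_type2 <- unname(quantile(x, probs = 0.25, type = 2))

# base R default (type = 7): linear interpolation between order statistics
q_type7 <- unname(quantile(x, probs = 0.25))

q_type2  # 1.5
q_type7  # 1.75
```

If a different definition is needed, any univariate function can be named in the statistic string instead, as noted above.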
When the summary type is "continuous2"
, pass a vector of statistics.
Each element of the vector will result in a separate row in the summary table.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.
- {N_obs} total number of observations
- {N_miss} number of missing observations
- {N_nonmiss} number of non-missing observations
- {p_miss} percentage of observations missing
- {p_nonmiss} percentage of observations not missing
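As a small sketch of how the missing-data statistics listed above can be combined with the "continuous2" type, the following reports the non-missing count, the mean and standard deviation, and the missing count with its percentage on separate rows for age in the trial data. The particular choice of statistics is illustrative.

```r
library(gtsummary)

tbl <- trial |>
  tbl_summary(
    include = age,
    type = age ~ "continuous2",
    statistic = age ~ c(
      "{N_nonmiss}",
      "{mean} ({sd})",
      "{N_miss} ({p_miss}%)"
    ),
    # the missing counts are already reported as statistics,
    # so the default "Unknown" row is turned off
    missing = "no"
  )

tbl
```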
digits argument
The digits argument specifies the number of digits (or formatting function) statistics are rounded to.
The values passed can either be a single integer, a vector of integers, a
function, or a list of functions. If a single integer or function is passed,
it is recycled to the length of the number of statistics presented.
For example, if the statistic is "{mean} ({sd})", it is equivalent to
pass 1, c(1, 1), label_style_number(digits=1), and
list(label_style_number(digits=1), label_style_number(digits=1)).
Named lists are also accepted to change the default formatting for a single
statistic, e.g. list(sd = label_style_number(digits=1)).
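The equivalence described above can be checked directly. In this sketch, age_stat() is a hypothetical helper (not part of gtsummary) that builds the same one-row table with a given digits specification and returns the formatted cell.

```r
library(gtsummary)

# Hypothetical helper: summarize age with a given `digits` specification
# and return the formatted statistic from the table body
age_stat <- function(digits) {
  tbl <- tbl_summary(
    trial,
    include = age,
    statistic = age ~ "{mean} ({sd})",
    digits = digits,
    missing = "no"
  )
  tbl$table_body$stat_0
}

s1 <- age_stat(age ~ 1)
s2 <- age_stat(age ~ c(1, 1))
s3 <- age_stat(age ~ label_style_number(digits = 1))
s4 <- age_stat(
  age ~ list(label_style_number(digits = 1), label_style_number(digits = 1))
)

# Named list: only the formatting of sd is changed, mean keeps its default
s5 <- age_stat(age ~ list(sd = label_style_number(digits = 2)))

c(s1, s2, s3, s4, s5)
```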
type and value arguments
There are four summary types. Use the type argument to change the default summary types.
- "continuous" summaries are shown on a single row. Most numeric variables default to summary type continuous.
- "continuous2" summaries are shown on 2 or more rows
- "categorical" multi-line summaries of nominal data. Character variables, factor variables, and numeric variables with fewer than 10 unique levels default to type categorical. To change a numeric variable to continuous that defaulted to categorical, use type = list(varname ~ "continuous")
- "dichotomous" categorical variables that are displayed on a single row, rather than one row per level of the variable. Variables coded as TRUE/FALSE, 0/1, or yes/no are assumed to be dichotomous, and the TRUE, 1, and yes rows are displayed. Otherwise, the value to display must be specified in the value argument, e.g. value = list(varname ~ "level to show")
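The sketch below illustrates the type and value arguments on a small made-up data frame. The names df, visits, and smoker are hypothetical and chosen only for this example: visits is numeric with four unique values, so it defaults to categorical, and smoker is a character variable whose displayed level must be named explicitly.

```r
library(gtsummary)

# Made-up data for illustration
df <- data.frame(
  visits = rep(1:4, times = 5),
  smoker = rep(c("current", "never"), times = 10)
)

# Default: `visits` has fewer than 10 unique values, so one row per level
tbl_default <- tbl_summary(df, include = visits)

# Override the default and summarize `visits` on a single continuous row
tbl_cont <- tbl_summary(
  df,
  include = visits,
  type = list(visits ~ "continuous")
)

# Show only the "current" level of `smoker` on a single row
tbl_dich <- tbl_summary(
  df,
  include = smoker,
  type = list(smoker ~ "dichotomous"),
  value = list(smoker ~ "current")
)

nrow(tbl_default$table_body)
nrow(tbl_cont$table_body)
nrow(tbl_dich$table_body)
```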
Author(s)
Daniel D. Sjoberg
See Also
See tbl_summary vignette for detailed tutorial
See table gallery for additional examples
Review list, formula, and selector syntax used throughout gtsummary
Examples
# Example 1 ----------------------------------
trial |>
select(age, grade, response) |>
tbl_summary()
# Example 2 ----------------------------------
trial |>
tbl_summary(
by = trt,
include = c(age, grade, response, trt),
label = list(age = "Patient Age"),
statistic = list(all_continuous() ~ "{mean} ({sd})"),
digits = list(age = c(0, 1))
)
# Example 3 ----------------------------------
trial |>
tbl_summary(
include = c(age, marker),
type = all_continuous() ~ "continuous2",
statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min}, {max}"),
missing = "no"
)
Survival table
Description
Function takes a survfit object as an argument, and provides a
formatted summary table of the results.
No more than one stratifying variable is allowed in each model.
If you're experiencing unexpected errors using tbl_survfit(),
please review ?tbl_survfit_errors for a possible explanation.
Usage
tbl_survfit(x, ...)
## S3 method for class 'survfit'
tbl_survfit(x, ...)
## S3 method for class 'data.frame'
tbl_survfit(x, y, include = everything(), conf.level = 0.95, ...)
## S3 method for class 'list'
tbl_survfit(
x,
times = NULL,
probs = NULL,
statistic = "{estimate} ({conf.low}, {conf.high})",
label = NULL,
label_header = ifelse(!is.null(times), "**Time {time}**",
"**{style_sigfig(prob, scale=100)}% Percentile**"),
estimate_fun = ifelse(!is.null(times), label_style_percent(suffix = "%"),
label_style_sigfig()),
missing = "--",
type = NULL,
reverse = FALSE,
quiet = TRUE,
...
)
Arguments
x |
( | |||||||||
... |
For | |||||||||
y |
outcome call, e.g. | |||||||||
include |
Variable to include as stratifying variables. | |||||||||
conf.level |
(scalar | |||||||||
times |
( | |||||||||
probs |
( | |||||||||
statistic |
( | |||||||||
label |
( | |||||||||
label_header |
( | |||||||||
estimate_fun |
( | |||||||||
missing |
( | |||||||||
type |
(
| |||||||||
reverse |
||||||||||
quiet |
Formula Specification
When passing a survival::survfit() object to tbl_survfit(),
the survfit() call must use an evaluated formula and not a stored formula.
Including a proper formula in the call allows the function to accurately
identify all variables included in the estimation. See below for examples:
library(gtsummary)
library(survival)
# include formula in `survfit()` call
survfit(Surv(time, status) ~ sex, lung) |> tbl_survfit(times = 500)
# you can also pass a data frame to `tbl_survfit()`
lung |> tbl_survfit(y = Surv(time, status), include = "sex", times = 500)
You cannot, however, pass a stored formula, e.g. survfit(my_formula, lung),
but you can use stored formulas with rlang::inject(survfit(!!my_formula, lung)).
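A minimal sketch of the stored-formula approach mentioned above, using the lung data from the survival package. The variable name my_formula follows the text; the choice of times = 500 is illustrative.

```r
library(gtsummary)
library(survival)

# A formula stored in a variable
my_formula <- Surv(time, status) ~ sex

# rlang::inject() inlines the formula into the survfit() call
# before the call is evaluated, so the call records the formula itself
fit <- rlang::inject(survfit(!!my_formula, lung))

tbl <- tbl_survfit(fit, times = 500)
tbl
```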
Author(s)
Daniel D. Sjoberg
Examples
library(survival)
# Example 1 ----------------------------------
# Pass single survfit() object
tbl_survfit(
survfit(Surv(ttdeath, death) ~ trt, trial),
times = c(12, 24),
label_header = "**{time} Month**"
)
# Example 2 ----------------------------------
# Pass a data frame
tbl_survfit(
trial,
y = "Surv(ttdeath, death)",
include = c(trt, grade),
probs = 0.5,
label_header = "**Median Survival**"
)
# Example 3 ----------------------------------
# Pass a list of survfit() objects
list(survfit(Surv(ttdeath, death) ~ 1, trial),
survfit(Surv(ttdeath, death) ~ trt, trial)) |>
tbl_survfit(times = c(12, 24))
# Example 4 Competing Events Example ---------
# adding a competing event for death (cancer vs other causes)
set.seed(1123)
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
trial2 <- trial |>
dplyr::mutate(
death_cr =
dplyr::case_when(
death == 0 ~ "censor",
runif(n()) < 0.5 ~ "death from cancer",
TRUE ~ "death other causes"
) |>
factor()
)
survfit(Surv(ttdeath, death_cr) ~ grade, data = trial2) |>
tbl_survfit(times = c(12, 24), label = "Tumor Grade")
Common Sources of Error with tbl_survfit()
Description
When functions add_n() and add_p() are run after tbl_survfit(), the original call to survival::survfit() is extracted and the formula= and data= arguments are used to calculate the N or p-value.
When the values of the formula= and data= arguments are unavailable, the functions cannot execute. Below are some tips to modify your code to ensure all functions run without issue.
Let tbl_survfit() construct the survival::survfit() for you by passing a data frame to tbl_survfit(). The survfit model will be constructed in a manner ensuring the formula and data are available. This only works if you have a stratified model.
Instead of the following line
survfit(Surv(ttdeath, death) ~ trt, trial) %>% tbl_survfit(times = c(12, 24))
Use this code
trial %>% select(ttdeath, death, trt) %>% tbl_survfit(y = Surv(ttdeath, death), times = c(12, 24))
Construct an expression of the survival::survfit() before evaluating it. Ensure the formula and data are available in the call by using the tidyverse bang-bang operator, !!.
Use this code
formula_arg <- Surv(ttdeath, death) ~ 1
data_arg <- trial
rlang::expr(survfit(!!formula_arg, !!data_arg)) %>%
  eval() %>%
  tbl_survfit(times = c(12, 24))
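As a sketch of the first tip in context, the pipeline below lets tbl_survfit() build the stratified model from a data frame and then adds the N and p-value columns. It assumes the suggested packages needed by add_n() and add_p() for survfit tables are installed; the choice of trt as the stratifying variable is only for illustration.

```r
library(gtsummary)
library(survival)

# tbl_survfit() constructs survfit() internally, so the formula and data
# are available when add_n() and add_p() later need them
tbl <- trial |>
  tbl_survfit(
    y = Surv(ttdeath, death),
    include = trt,
    times = c(12, 24)
  ) |>
  add_n() |>
  add_p()
```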
Create a table of summary statistics from a survey object
Description
The tbl_svysummary() function calculates descriptive statistics for continuous, categorical, and dichotomous variables taking into account survey weights and design.
Usage
tbl_svysummary(
data,
by = NULL,
label = NULL,
statistic = list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~
"{n} ({p}%)"),
digits = NULL,
type = NULL,
value = NULL,
missing = c("ifany", "no", "always"),
missing_text = "Unknown",
missing_stat = "{N_miss}",
sort = all_categorical(FALSE) ~ "alphanumeric",
percent = c("column", "row", "cell"),
include = everything()
)
Arguments
data |
( |
by |
( |
label |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
sort |
( |
percent |
( |
include |
( |
Value
A 'tbl_svysummary' object
statistic argument
The statistic argument specifies the statistics presented in the table. The input is a list of formulas that specify the statistics to report. For example, statistic = list(age ~ "{mean} ({sd})") would report the mean and standard deviation for age; statistic = list(all_continuous() ~ "{mean} ({sd})") would report the mean and standard deviation for all continuous variables. A statistic name that appears between curly brackets will be replaced with the numeric statistic (see glue::glue()).
For categorical variables the following statistics are available to display.
- {n} frequency
- {N} denominator, or cohort size
- {p} proportion
- {p.std.error} standard error of the sample proportion (on the 0 to 1 scale) computed with survey::svymean()
- {deff} design effect of the sample proportion computed with survey::svymean()
- {n_unweighted} unweighted frequency
- {N_unweighted} unweighted denominator
- {p_unweighted} unweighted formatted percentage
For continuous variables the following statistics are available to display.
- {median} median
- {mean} mean
- {mean.std.error} standard error of the sample mean computed with survey::svymean()
- {deff} design effect of the sample mean computed with survey::svymean()
- {sd} standard deviation
- {var} variance
- {min} minimum
- {max} maximum
- {p##} any integer percentile, where ## is an integer from 0 to 100
- {sum} sum
Unlike tbl_summary(), it is not possible to pass a custom function.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.
- {N_obs} total number of observations
- {N_miss} number of missing observations
- {N_nonmiss} number of non-missing observations
- {p_miss} percentage of observations missing
- {p_nonmiss} percentage of observations not missing
- {N_obs_unweighted} unweighted total number of observations
- {N_miss_unweighted} unweighted number of missing observations
- {N_nonmiss_unweighted} unweighted number of non-missing observations
- {p_miss_unweighted} unweighted percentage of observations missing
- {p_nonmiss_unweighted} unweighted percentage of observations not missing
Note that for categorical variables, {N_obs}, {N_miss} and {N_nonmiss} refer to the total number, number missing, and number of non-missing observations in the denominator, not at each level of the categorical variable.
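A short sketch of the statistic argument, using only statistic names listed above. It assumes the survey package is installed and reuses the apiclus1 design from the Examples; the particular combination of statistics is an arbitrary illustration, not a recommended default.

```r
library(gtsummary)

# clustered sample shipped with the survey package
data(api, package = "survey")
design <- survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

tbl_stat <- design |>
  tbl_svysummary(
    include = c(api00, stype),
    statistic = list(
      # weighted mean with its design-based standard error
      all_continuous() ~ "{mean} ({mean.std.error})",
      # unweighted counts alongside the weighted percentage
      all_categorical() ~ "{n_unweighted} / {N_unweighted} ({p}%)"
    ),
    missing = "no"
  )
```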
type and value arguments
There are four summary types. Use the type argument to change the default summary types.
- "continuous" summaries are shown on a single row. Most numeric variables default to summary type continuous.
- "continuous2" summaries are shown on 2 or more rows
- "categorical" multi-line summaries of nominal data. Character variables, factor variables, and numeric variables with fewer than 10 unique levels default to type categorical. To change a numeric variable to continuous that defaulted to categorical, use type = list(varname ~ "continuous")
- "dichotomous" categorical variables that are displayed on a single row, rather than one row per level of the variable. Variables coded as TRUE/FALSE, 0/1, or yes/no are assumed to be dichotomous, and the TRUE, 1, and yes rows are displayed. Otherwise, the value to display must be specified in the value argument, e.g. value = list(varname ~ "level to show")
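The type and value arguments can be combined as in the sketch below. It assumes the survey package is installed and again uses the apiclus1 data; showing only the "E" level of stype is an arbitrary choice made for illustration.

```r
library(gtsummary)

data(api, package = "survey")
design <- survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

tbl_types <- design |>
  tbl_svysummary(
    include = c(api00, stype),
    # api00 is summarized on two rows; stype is collapsed to a single row
    type = list(api00 ~ "continuous2", stype ~ "dichotomous"),
    # the level of stype to display on that single row
    value = list(stype ~ "E"),
    statistic = list(api00 ~ c("{median} ({p25}, {p75})", "{min}, {max}"))
  )
```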
Author(s)
Joseph Larmarange
Examples
# Example 1 ----------------------------------
survey::svydesign(~1, data = as.data.frame(Titanic), weights = ~Freq) |>
tbl_svysummary(by = Survived, percent = "row", include = c(Class, Age))
# Example 2 ----------------------------------
# A dataset with a complex design
data(api, package = "survey")
survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) |>
tbl_svysummary(by = "both", include = c(api00, stype)) |>
modify_spanning_header(all_stat_cols() ~ "**Survived**")
Univariable regression model summary
Description
This function estimates univariable regression models and returns them in a publication-ready table. It can create regression models holding either a covariate or an outcome constant.
Usage
tbl_uvregression(data, ...)
## S3 method for class 'data.frame'
tbl_uvregression(
data,
y = NULL,
x = NULL,
method,
method.args = list(),
exponentiate = FALSE,
label = NULL,
include = everything(),
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
hide_n = FALSE,
show_single_row = NULL,
conf.level = 0.95,
estimate_fun = ifelse(exponentiate, label_style_ratio(), label_style_sigfig()),
pvalue_fun = label_style_pvalue(digits = 1),
formula = "{y} ~ {x}",
add_estimate_to_reference_rows = FALSE,
conf.int = TRUE,
...
)
## S3 method for class 'survey.design'
tbl_uvregression(
data,
y = NULL,
x = NULL,
method,
method.args = list(),
exponentiate = FALSE,
label = NULL,
include = everything(),
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
hide_n = FALSE,
show_single_row = NULL,
conf.level = 0.95,
estimate_fun = ifelse(exponentiate, label_style_ratio(), label_style_sigfig()),
pvalue_fun = label_style_pvalue(digits = 1),
formula = "{y} ~ {x}",
add_estimate_to_reference_rows = FALSE,
conf.int = TRUE,
...
)
Arguments
data |
( |
... |
Additional arguments passed to |
y , x |
( |
method |
( |
method.args |
(named |
exponentiate |
(scalar |
label |
( |
include |
( |
tidy_fun |
( |
hide_n |
(scalar |
show_single_row |
( |
conf.level |
(scalar |
estimate_fun |
( |
pvalue_fun |
( |
formula |
( |
add_estimate_to_reference_rows |
(scalar |
conf.int |
(scalar |
Value
A tbl_uvregression object
x and y arguments
For models holding the outcome constant, the function takes as arguments a data frame, the type of regression model, and the outcome variable y=. The outcome is regressed on each column in the data frame in turn. The tbl_uvregression() function arguments are similar to the tbl_regression() arguments. Review the tbl_uvregression vignette for detailed examples.
You may alternatively hold a single covariate constant. For this, pass a data frame, the type of regression model, and a single covariate in the x= argument. Each column of the data frame will serve as the outcome in a univariable regression model. When using the x argument, take care that each column in the data frame is appropriate for the same type of model, e.g. they are all continuous variables appropriate for lm, or dichotomous variables appropriate for logistic regression with glm.
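The two modes can be sketched side by side. The variable choices below are illustrative only, and the code assumes the suggested tidying packages (such as broom and broom.helpers) are installed.

```r
library(gtsummary)

# outcome held constant: response is modeled against age and grade in turn
tbl_y <- tbl_uvregression(
  trial,
  method = glm,
  y = response,
  method.args = list(family = binomial),
  exponentiate = TRUE,
  include = c(age, grade)
)

# covariate held constant: age and marker are each modeled against trt;
# both outcomes are continuous, so lm is appropriate for each model
tbl_x <- tbl_uvregression(
  trial,
  method = lm,
  x = trt,
  include = c(age, marker)
)
```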
Methods
The default method for tbl_regression() model summary uses broom::tidy(x) to perform the initial tidying of the model object. There are, however, a few models that use modifications.
- "parsnip/workflows": If the model was prepared using parsnip/workflows, the original model fit is extracted and the original x= argument is replaced with the model fit. This will typically go unnoticed; however, if you've provided a custom tidier in tidy_fun= the tidier will be applied to the model fit object and not the parsnip/workflows object.
- "survreg": The scale parameter is removed, broom::tidy(x) %>% dplyr::filter(term != "Log(scale)")
- "multinom": This multinomial outcome is complex, with one line per covariate per outcome (less the reference group)
- "gam": Uses the internal tidier tidy_gam() to print both parametric and smooth terms.
- "lmerMod", "glmerMod", "glmmTMB", "glmmadmb", "stanreg", "brmsfit": These mixed effects models use broom.mixed::tidy(x, effects = "fixed"). Specify tidy_fun = broom.mixed::tidy to print the random components.
Author(s)
Daniel D. Sjoberg
See Also
See tbl_regression vignette for detailed examples
Examples
# Example 1 ----------------------------------
tbl_uvregression(
trial,
method = glm,
y = response,
method.args = list(family = binomial),
exponentiate = TRUE,
include = c("age", "grade")
)
# Example 2 ----------------------------------
# rounding pvalues to 2 decimal places
library(survival)
tbl_uvregression(
trial,
method = coxph,
y = Surv(ttdeath, death),
exponentiate = TRUE,
include = c("age", "grade", "response"),
pvalue_fun = label_style_pvalue(digits = 2)
)
Wide summary table
Description
This function is similar to tbl_summary(), but places summary statistics wide, in separate columns.
All included variables must be of the same summary type, e.g. all continuous summaries or all categorical summaries (which encompasses dichotomous variables).
Usage
tbl_wide_summary(
data,
label = NULL,
statistic = switch(type[[1]], continuous = c("{median}", "{p25}, {p75}"), c("{n}",
"{p}%")),
digits = NULL,
type = NULL,
value = NULL,
sort = all_categorical(FALSE) ~ "alphanumeric",
include = everything()
)
Arguments
data |
( |
label |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
sort |
( |
include |
( |
Value
a gtsummary table of class 'tbl_wide_summary'
Examples
trial |>
tbl_wide_summary(include = c(response, grade))
trial |>
tbl_strata(
strata = trt,
~tbl_wide_summary(.x, include = c(age, marker))
)
Comparison tests/methods available
Description
Below is a listing of tests available internally within gtsummary.
These methods are available to be called in add_p(), add_difference(), and add_difference_row().
Tests listed with ... may have additional arguments passed to them using add_p(test.args=). For example, to calculate a p-value from t.test() assuming equal variance, use
tbl_summary(trial, by = trt) %>% add_p(age ~ "t.test", test.args = age ~ list(var.equal = TRUE))
tbl_summary() %>% add_p()
alias | description | pseudo-code | details |
't.test' | t-test | t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...) | |
'mood.test' | Mood two-sample test of scale | mood.test(variable ~ as.factor(by), data = data, ...) | Not to be confused with the Brown-Mood test of medians |
'oneway.test' | One-way ANOVA | oneway.test(variable ~ as.factor(by), data = data, ...) | |
'kruskal.test' | Kruskal-Wallis test | kruskal.test(data[[variable]], as.factor(data[[by]])) | |
'wilcox.test' | Wilcoxon rank-sum test | wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...) | |
'chisq.test' | chi-square test of independence | chisq.test(x = data[[variable]], y = as.factor(data[[by]]), ...) | |
'chisq.test.no.correct' | chi-square test of independence | chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE) | |
'fisher.test' | Fisher's exact test | fisher.test(data[[variable]], as.factor(data[[by]]), conf.level = 0.95, ...) | |
'mcnemar.test' | McNemar's test | tidyr::pivot_wider(id_cols = group, ...); mcnemar.test(by_1, by_2, conf.level = 0.95, ...) | |
'mcnemar.test.wide' | McNemar's test | mcnemar.test(data[[variable]], data[[by]], conf.level = 0.95, ...) | |
'lme4' | random intercept logistic regression | lme4::glmer(by ~ (1 ｜ group), data, family = binomial) %>% anova(lme4::glmer(by ~ variable + (1 ｜ group), data, family = binomial)) | |
'paired.t.test' | Paired t-test | tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...) | |
'paired.wilcox.test' | Paired Wilcoxon signed-rank test | tidyr::pivot_wider(id_cols = group, ...); wilcox.test(by_1, by_2, paired = TRUE, conf.int = TRUE, conf.level = 0.95, ...) | |
'prop.test' | Test for equality of proportions | prop.test(x, n, conf.level = 0.95, ...) | For dichotomous comparisons, the 'variable' is first converted to a logical. |
'ancova' | ANCOVA | lm(variable ~ by + adj.vars) | |
'emmeans' | Estimated Marginal Means or LS-means | lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept. |
tbl_svysummary() %>% add_p()
alias | description | pseudo-code | details |
'svy.t.test' | t-test adapted to complex survey samples | survey::svyttest(~variable + by, data) | |
'svy.wilcox.test' | Wilcoxon rank-sum test for complex survey samples | survey::svyranktest(~variable + by, data, test = 'wilcoxon') | |
'svy.kruskal.test' | Kruskal-Wallis rank-sum test for complex survey samples | survey::svyranktest(~variable + by, data, test = 'KruskalWallis') | |
'svy.vanderwaerden.test' | van der Waerden's normal-scores test for complex survey samples | survey::svyranktest(~variable + by, data, test = 'vanderWaerden') | |
'svy.median.test' | Mood's test for the median for complex survey samples | survey::svyranktest(~variable + by, data, test = 'median') | |
'svy.chisq.test' | chi-squared test with Rao & Scott's second-order correction | survey::svychisq(~variable + by, data, statistic = 'F') | |
'svy.adj.chisq.test' | chi-squared test adjusted by a design effect estimate | survey::svychisq(~variable + by, data, statistic = 'Chisq') | |
'svy.wald.test' | Wald test of independence for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'Wald') | |
'svy.adj.wald.test' | adjusted Wald test of independence for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'adjWald') | |
'svy.lincom.test' | test of independence using the exact asymptotic distribution for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'lincom') | |
'svy.saddlepoint.test' | test of independence using a saddlepoint approximation for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'saddlepoint') | |
'emmeans' | Estimated Marginal Means or LS-means | survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used. |
tbl_survfit() %>% add_p()
alias | description | pseudo-code |
'logrank' | Log-rank test | survival::survdiff(Surv(.) ~ variable, data, rho = 0) |
'tarone' | Tarone-Ware test | survival::survdiff(Surv(.) ~ variable, data, rho = 1.5) |
'petopeto_gehanwilcoxon' | Peto & Peto modification of Gehan-Wilcoxon test | survival::survdiff(Surv(.) ~ variable, data, rho = 1) |
'survdiff' | G-rho family test | survival::survdiff(Surv(.) ~ variable, data, ...) |
'coxph_lrt' | Cox regression (LRT) | survival::coxph(Surv(.) ~ variable, data, ...) |
'coxph_wald' | Cox regression (Wald) | survival::coxph(Surv(.) ~ variable, data, ...) |
'coxph_score' | Cox regression (Score) | survival::coxph(Surv(.) ~ variable, data, ...) |
tbl_continuous() %>% add_p()
alias | description | pseudo-code |
'anova_2way' | Two-way ANOVA | lm(continuous_variable ~ by + variable) %>% broom::glance() |
't.test' | t-test | t.test(continuous_variable ~ variable, data = data, conf.level = 0.95, ...) |
'oneway.test' | One-way ANOVA | oneway.test(continuous_variable ~ variable, data = data) |
'kruskal.test' | Kruskal-Wallis test | kruskal.test(x = data[[continuous_variable]], g = data[[variable]]) |
'wilcox.test' | Wilcoxon rank-sum test | wilcox.test(continuous_variable ~ variable, data = data, ...) |
'lme4' | random intercept logistic regression | lme4::glmer(by ~ (1 ｜ group), data, family = binomial) %>% anova(lme4::glmer(variable ~ continuous_variable + (1 ｜ group), data, family = binomial)) |
'ancova' | ANCOVA | lm(continuous_variable ~ variable + adj.vars) |
tbl_summary() %>% add_difference()/add_difference_row()
alias | description | difference statistic | pseudo-code | details |
't.test' | t-test | mean difference | t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...) | |
'wilcox.test' | Wilcoxon rank-sum test | | wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...) | |
'paired.t.test' | Paired t-test | mean difference | tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...) | |
'prop.test' | Test for equality of proportions | rate difference | prop.test(x, n, conf.level = 0.95, ...) | For dichotomous comparisons, the 'variable' is first converted to a logical. |
'ancova' | ANCOVA | mean difference | lm(variable ~ by + adj.vars) | |
'ancova_lme4' | ANCOVA with random intercept | mean difference | lme4::lmer(variable ~ by + adj.vars + (1 ｜ group), data) | |
'cohens_d' | Cohen's D | standardized mean difference | effectsize::cohens_d(variable ~ by, data, ci = conf.level, verbose = FALSE, ...) | |
'hedges_g' | Hedges' g | standardized mean difference | effectsize::hedges_g(variable ~ by, data, ci = conf.level, verbose = FALSE, ...) | |
'paired_cohens_d' | Paired Cohen's D | standardized mean difference | tidyr::pivot_wider(id_cols = group, ...); effectsize::cohens_d(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...) | |
'paired_hedges_g' | Paired Hedges' g | standardized mean difference | tidyr::pivot_wider(id_cols = group, ...); effectsize::hedges_g(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...) | |
'smd' | Standardized Mean Difference | standardized mean difference | smd::smd(x = data[[variable]], g = data[[by]], std.error = TRUE) | |
'emmeans' | Estimated Marginal Means or LS-means | adjusted mean difference | lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept. |
tbl_svysummary() %>% add_difference()
alias | description | difference statistic | pseudo-code | details |
'smd' | Standardized Mean Difference | standardized mean difference | smd::smd(x = variable, g = by, w = weights(data), std.error = TRUE) | |
'svy.t.test' | t-test adapted to complex survey samples | | survey::svyttest(~variable + by, data) | |
'emmeans' | Estimated Marginal Means or LS-means | adjusted mean difference | survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used. |
Custom Functions
To report a p-value (or difference) for a test not available in gtsummary, you can create a custom function. The output is a data frame that is one line long. The structure is similar to the output of broom::tidy() of a typical statistical test. The add_p() and add_difference() functions will look for columns called "p.value", "estimate", "statistic", "std.error", "parameter", "conf.low", "conf.high", and "method".
You can also pass an Analysis Results Dataset (ARD) object containing the results of your custom test. These objects follow the structures outlined by the {cards} and {cardx} packages.
Example calculating a p-value from a t-test assuming a common variance between groups.
ttest_common_variance <- function(data, variable, by, ...) {
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), var.equal = TRUE) %>%
    broom::tidy()
}

trial[c("age", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = age ~ "ttest_common_variance")
A custom add_difference() is similar, and accepts arguments conf.level= and adj.vars= as well.
ttest_common_variance <- function(data, variable, by, conf.level, ...) {
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), conf.level = conf.level, var.equal = TRUE) %>%
    broom::tidy()
}
Function Arguments
For tbl_summary() objects, the custom function will be passed the following arguments: custom_pvalue_fun(data=, variable=, by=, group=, type=, conf.level=, adj.vars=). While your function may not utilize each of these arguments, these arguments are passed and the function must accept them. We recommend including ... to future-proof against updates where additional arguments are added.
The following table describes the argument inputs for each gtsummary table type.
argument | tbl_summary | tbl_svysummary | tbl_survfit | tbl_continuous |
data= | A data frame | A survey object | A survfit() object | A data frame |
variable= | String variable name | String variable name | NA | String variable name |
by= | String variable name | String variable name | NA | String variable name |
group= | String variable name | NA | NA | String variable name |
type= | Summary type | Summary type | NA | NA |
conf.level= | Confidence interval level | NA | NA | NA |
adj.vars= | Character vector of adjustment variable names (e.g. used in ANCOVA) | NA | NA | Character vector of adjustment variable names (e.g. used in ANCOVA) |
continuous_variable= | NA | NA | NA | String of the continuous variable name |
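The interface described above can also be met without broom, by building the one-row data frame by hand. The sketch below is an assumption-laden illustration: the function name fligner_killeen is hypothetical, and the Fligner-Killeen test from base R's stats package was chosen only because it does not appear in the built-in listing. The ... in the signature absorbs the arguments the function does not use (group=, type=, conf.level=, adj.vars=).

```r
library(gtsummary)

# hypothetical custom test built on stats::fligner.test()
fligner_killeen <- function(data, variable, by, ...) {
  fit <- stats::fligner.test(x = data[[variable]], g = factor(data[[by]]))
  # return a one-row data frame using column names add_p() looks for
  data.frame(
    statistic = unname(fit$statistic),
    parameter = unname(fit$parameter),
    p.value = fit$p.value,
    method = fit$method
  )
}

# the function can be checked on its own before it is used in a table
one_row <- fligner_killeen(trial, variable = "age", by = "trt")

tbl_custom <- trial |>
  tbl_summary(by = trt, include = age) |>
  add_p(test = age ~ "fligner_killeen")
```

Checking the function directly first makes it easier to confirm that it returns exactly one row with a p.value column before any table code is involved.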
Available gtsummary themes
Description
The following themes are available to use within the gtsummary package. Print theme elements with theme_gtsummary_journal(set_theme = FALSE) |> print(). Review the themes vignette for details.
Usage
theme_gtsummary_journal(
journal = c("jama", "lancet", "nejm", "qjecon"),
set_theme = TRUE
)
theme_gtsummary_compact(set_theme = TRUE, font_size = NULL)
theme_gtsummary_printer(
print_engine = c("gt", "kable", "kable_extra", "flextable", "huxtable", "tibble"),
set_theme = TRUE
)
theme_gtsummary_language(
language = c("de", "en", "es", "fr", "gu", "hi", "is", "ja", "kr", "mr", "nl", "no",
"pt", "se", "zh-cn", "zh-tw"),
decimal.mark = NULL,
big.mark = NULL,
iqr.sep = NULL,
ci.sep = NULL,
set_theme = TRUE
)
theme_gtsummary_continuous2(
statistic = "{median} ({p25}, {p75})",
set_theme = TRUE
)
theme_gtsummary_mean_sd(set_theme = TRUE)
theme_gtsummary_eda(set_theme = TRUE)
Arguments
journal |
String indicating the journal theme to follow. One of
|
set_theme |
(scalar |
font_size |
(scalar |
print_engine |
String indicating the print method. Must be one of
|
language |
( If a language is missing a translation for a word or phrase, please feel free to reach out on GitHub with the translated text. |
decimal.mark |
( |
big.mark |
( |
iqr.sep |
( |
ci.sep |
( |
statistic |
Default statistic for continuous variables |
Themes
- theme_gtsummary_journal(journal)
  - "jama" The Journal of the American Medical Association
    - Round large p-values to 2 decimal places; separate confidence intervals with "ll to ul".
    - tbl_summary() Doesn't show percent symbol; use em-dash to separate IQR; run add_stat_label()
    - tbl_regression()/tbl_uvregression() show coefficient and CI in same column
  - "lancet" The Lancet
    - Use mid-point as decimal separator; round large p-values to 2 decimal places; separate confidence intervals with "ll to ul".
    - tbl_summary() Doesn't show percent symbol; use em-dash to separate IQR
  - "nejm" The New England Journal of Medicine
    - Round large p-values to 2 decimal places; separate confidence intervals with "ll to ul".
    - tbl_summary() Doesn't show percent symbol; use em-dash to separate IQR
  - "qjecon" The Quarterly Journal of Economics
    - tbl_summary() all percentages rounded to one decimal place
    - tbl_regression(), tbl_uvregression() add significance stars with add_significance_stars(); hides CI and p-value from output
      - For flextable and huxtable output, the coefficients' standard error is placed below. For gt, it is placed to the right.
- theme_gtsummary_compact()
  - tables printed with gt, flextable, kableExtra, or huxtable will be compact with smaller font size and reduced cell padding
- theme_gtsummary_printer(print_engine)
  - Use this theme to permanently change the default printer.
- theme_gtsummary_continuous2()
  - Set all continuous variables to summary type "continuous2" by default
- theme_gtsummary_mean_sd()
  - Set default summary statistics to mean and standard deviation in tbl_summary()
  - Set default continuous tests in add_p() to t-test and ANOVA
- theme_gtsummary_eda()
  - Set all continuous variables to summary type "continuous2" by default
  - In tbl_summary() show the median, mean, IQR, SD, and Range by default
Use reset_gtsummary_theme() to restore the default settings.
Review the themes vignette to create your own themes.
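A minimal sketch of a typical theme workflow, using only functions documented on this page: inspect a theme's elements without activating it, activate one or more themes, build a table, then restore the defaults. The variables chosen for the table are arbitrary.

```r
library(gtsummary)

# inspect the theme elements without activating the theme
jama_elements <- theme_gtsummary_journal("jama", set_theme = FALSE)
names(jama_elements)

# activate the JAMA theme and combine it with the compact theme
theme_gtsummary_journal("jama")
theme_gtsummary_compact()

tbl_themed <- trial |>
  tbl_summary(by = trt, include = c(age, grade))

# restore the package defaults so later tables are unaffected
reset_gtsummary_theme()
```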
See Also
set_gtsummary_theme(), reset_gtsummary_theme()
Examples
# Setting JAMA theme for gtsummary
theme_gtsummary_journal("jama")
# Themes can be combined by including more than one
theme_gtsummary_compact()
trial |>
select(age, grade, trt) |>
tbl_summary(by = trt) |>
as_gt()
# reset gtsummary themes
reset_gtsummary_theme()
Print tibble with cli
Description
Print a tibble or data frame using cli styling and formatting.
Usage
tibble_as_cli(x, na_value = "", label = list(), padding = 3L)
Arguments
x |
( |
na_value |
( |
label |
(named |
padding |
( |
Examples
trial[1:3, ] |>
dplyr::mutate_all(as.character) |>
gtsummary:::tibble_as_cli()
Results from a simulated study of two chemotherapy agents
Description
A dataset containing the baseline characteristics of 200 patients who received Drug A or Drug B. Dataset also contains the outcome of tumor response to the treatment.
Usage
trial
Format
A data frame with 200 rows, one row per patient
- trt
Chemotherapy Treatment
- age
Age
- marker
Marker Level (ng/mL)
- stage
T Stage
- grade
Grade
- response
Tumor Response
- death
Patient Died
- ttdeath
Months to Death/Censor
Convert character vector to data frame
Description
This function is used to support some of the tidyselect-style selection the package allows. For example, in as_gt(include=) you can use tidyselect to select among the call names to include.
Usage
vec_to_df(x)
Arguments
x |
character vector |
Value
data frame