| Type: | Package | 
| Title: | Heatmap-Integrated Decision Tree Visualizations | 
| Version: | 0.2.1 | 
| Maintainer: | Trang Le <grixor@gmail.com> | 
| Description: | Creates interpretable decision tree visualizations with the data represented as a heatmap at the tree's leaf nodes. 'treeheatr' utilizes the customizable 'ggparty' package for drawing decision trees. | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| RoxygenNote: | 7.1.1 | 
| Depends: | R (≥ 3.5.0) | 
| Imports: | ggparty, ggplot2, partykit, dplyr, ggnewscale, gtable, stats, tidyr, cluster, grid, yardstick, seriation | 
| Suggests: | forcats, knitr, rmarkdown, rpart, testthat | 
| URL: | https://trang1618.github.io/treeheatr/index.html, https://trang1618.github.io/treeheatr-manuscript/ | 
| BugReports: | https://github.com/trang1618/treeheatr/issues | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2020-11-19 20:45:18 UTC; ttle | 
| Author: | Trang Le [aut, cre] (https://trang.page/), Jason Moore [aut] (http://www.epistasisblog.org/), University of Pennsylvania [cph] | 
| Repository: | CRAN | 
| Date/Publication: | 2020-11-19 21:00:03 UTC | 
Align decision tree and heatmap.
Description
Align decision tree and heatmap.
Usage
align_plots(
  dheat,
  dtree,
  heat_rel_height,
  show = c("heat-tree", "heat-only", "tree-only")
)
Arguments
dheat | 
 ggplot2 grob object of the heatmap.  | 
dtree | 
 ggplot2 grob object of the decision tree.  | 
heat_rel_height | 
 Relative height of heatmap compared to whole figure (with tree).  | 
show | 
 Character string indicating which components of the decision tree-heatmap should be drawn. Can be 'heat-tree', 'heat-only' or 'tree-only'.  | 
Value
A gtable/grob object of the decision tree (top) and heatmap (bottom).
Performs clustering of features.
Description
Performs clustering of features.
Usage
clust_feat_func(dat, clust_vec, clust_feats = TRUE)
Arguments
dat | 
 Dataframe of the original dataset. Samples may be reordered.  | 
clust_vec | 
 Character vector of variable names to be applied clustering on. Can include class labels.  | 
clust_feats | 
 Logical. If TRUE, clusters the displayed features (passed through 'clust_vec') using the Gower metric based on the values of all samples and returns the ordered features. When 'clust_samps = FALSE' and 'clust_feats = FALSE', no clustering is performed.  | 
Value
Character vector of reordered features when 'clust_feats == TRUE'.
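The following is a minimal sketch of what ordering features by Gower dissimilarity could look like. It is not the package's implementation: the helper name `order_feats_gower`, the restriction to numeric columns, and the use of `stats::hclust()` with average linkage for the ordering step are all assumptions made for illustration.

```r
# Hypothetical helper (not part of treeheatr): order numeric features
# by hierarchical clustering on Gower dissimilarities.
order_feats_gower <- function(dat, clust_vec) {
  # Transpose so that each row is a feature and each column is a sample.
  feat_mat <- t(as.matrix(dat[, clust_vec, drop = FALSE]))
  # Gower dissimilarity between features, based on the values of all samples.
  d <- cluster::daisy(feat_mat, metric = "gower")
  # Average-linkage clustering; $order gives the left-to-right leaf ordering.
  hc <- stats::hclust(stats::as.dist(d), method = "average")
  clust_vec[hc$order]
}

toy <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(1.1, 2.1, 2.9, 4.2, 5.1),
  c = c(5, 1, 4, 2, 3)
)
res <- order_feats_gower(toy, c("a", "b", "c"))
res
```

In this toy data, `a` and `b` are nearly identical, so they end up next to each other in the returned ordering.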
Performs clustering of samples.
Description
Performs clustering of samples.
Usage
clust_samp_func(leaf_node = NULL, dat, clust_vec, clust_samps = TRUE)
Arguments
leaf_node | 
 Integer value indicating terminal node id.  | 
dat | 
 Dataframe of the original dataset. Samples may be reordered.  | 
clust_vec | 
 Character vector of variable names to be applied clustering on. Can include class labels.  | 
clust_samps | 
 Logical. If TRUE, hierarchical clustering is performed among samples within each leaf node.  | 
Value
Dataframe of reordered original dataset when clust_samps == TRUE.
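As a rough illustration of clustering samples within each leaf node, the sketch below splits a toy data frame by a node id column, runs hierarchical clustering inside each group, and stacks the reordered groups. The helper name `cluster_within_leaves`, the `node_id` column, and the Euclidean distance on numeric columns are assumptions, not the package's actual code.

```r
# Hypothetical helper (not part of treeheatr): reorder samples by
# hierarchical clustering separately within each leaf node.
cluster_within_leaves <- function(dat, clust_vec, node_col = "node_id") {
  pieces <- lapply(split(dat, dat[[node_col]]), function(leaf) {
    # hclust() needs at least two observations; leave tiny leaves untouched.
    if (nrow(leaf) < 2) return(leaf)
    d <- stats::dist(leaf[, clust_vec, drop = FALSE])
    leaf[stats::hclust(d)$order, , drop = FALSE]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

toy <- data.frame(
  node_id = c(2, 3, 2, 3, 2),
  x = c(0.1, 5.0, 0.9, 5.2, 0.2),
  y = c(1, 7, 3, 8, 1)
)
reordered <- cluster_within_leaves(toy, c("x", "y"))
reordered
```

Samples never move between leaf nodes; only their order inside a node changes.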
Compute decision tree from data set
Description
Compute decision tree from data set
Usage
compute_tree(
  x,
  data_test = NULL,
  target_lab = NULL,
  task = c("classification", "regression"),
  feat_types = NULL,
  label_map = NULL,
  clust_samps = TRUE,
  clust_target = TRUE,
  custom_layout = NULL,
  lev_fac = 1.3,
  panel_space = 0.001
)
Arguments
x | 
 Dataframe or a 'party' or 'partynode' object representing a custom tree. If a dataframe is supplied, a conditional inference tree is computed. If a custom tree is supplied, it must follow the partykit syntax: https://cran.r-project.org/web/packages/partykit/vignettes/partykit.pdf  | 
data_test | 
 Tidy test dataset. Required if 'x' is a 'partynode' object. If NULL, heatmap displays (training) data 'x'.  | 
target_lab | 
 Name of the column in data that contains target/label information.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
feat_types | 
 Named vector indicating the type of each feature, e.g., c(sex = 'factor', age = 'numeric'). If feature types are not supplied, they are inferred from the column types.  | 
label_map | 
 Named vector of the meaning of the target values, e.g., c('0' = 'Edible', '1' = 'Poisonous').  | 
clust_samps | 
 Logical. If TRUE, hierarchical clustering is performed among samples within each leaf node.  | 
clust_target | 
 Logical. If TRUE, target/label is included in hierarchical clustering of samples within each leaf node and might yield a more interpretable heatmap.  | 
custom_layout | 
 Dataframe with 3 columns: id, x and y for manually input custom layout.  | 
lev_fac | 
 Relative weight of child node positions according to their levels, commonly ranging from 1 to 1.5. A value of 1 places the parent node exactly in the middle of its child nodes.  | 
panel_space | 
 Spacing between facets relative to viewport, recommended to range from 0.001 to 0.01.  | 
Value
A list of results from 'partykit::ctree' or provided custom tree, including fit, estimates, smart layout and terminal data.
Examples
fit_tree <- compute_tree(penguins, target_lab = 'species')
fit_tree$fit
fit_tree$layout
dplyr::select(fit_tree$term_dat, - contains('nodedata'))
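The argument shapes described above can be sketched in base R as follows. The column names `sex` and `age`, the target values, and the coordinate values are taken from the argument descriptions or chosen for illustration; the final call is left as a comment because it requires 'treeheatr' to be installed.

```r
# Feature types: names are column names, values are 'factor' or 'numeric'.
feat_types <- c(sex = "factor", age = "numeric")

# Meaning of the raw target values, keyed by the value as a string.
label_map <- c(`0` = "Edible", `1` = "Poisonous")

# Manual position for one node (here the node with id 1).
# The x and y values are illustrative assumptions.
custom_layout <- data.frame(id = 1, x = 0.1, y = 1)

# These objects would then be passed along, e.g. (not run):
# compute_tree(penguins, target_lab = "species", custom_layout = custom_layout)
str(custom_layout)
```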
Diabetes patient records.
Description
http://archive.ics.uci.edu/ml/datasets/diabetes https://www.kaggle.com/uciml/pima-indians-diabetes-database
Usage
diabetes
Format
A data frame with 768 observations and 9 variables:
Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin,
BMI, DiabetesPedigreeFunction, Age and Outcome.
Draws the heatmap.
Description
Draws the heatmap to be placed below the decision tree.
Usage
draw_heat(
  dat,
  fit,
  feat_types = NULL,
  target_cols = NULL,
  target_lab_disp = fit$target_lab,
  trans_type = c("percentize", "normalize", "scale", "none"),
  clust_feats = TRUE,
  feats = NULL,
  show_all_feats = FALSE,
  p_thres = 0.05,
  cont_legend = FALSE,
  cate_legend = FALSE,
  cont_cols = ggplot2::scale_fill_viridis_c,
  cate_cols = ggplot2::scale_fill_viridis_d,
  panel_space = 0.001,
  target_space = 0.05,
  target_pos = "top"
)
Arguments
dat | 
 Dataframe with samples from original dataset ordered according to the clustering within each leaf node.  | 
fit | 
 party object, e.g., as output from partykit::ctree()  | 
feat_types | 
 Named vector indicating the type of each feature, e.g., c(sex = 'factor', age = 'numeric'). If feature types are not supplied, they are inferred from the column types.  | 
target_cols | 
 Character vectors representing the hex values of different level colors for targets, defaults to viridis option B.  | 
target_lab_disp | 
 Character string used to display the target label. If not provided, 'target_lab' is used.  | 
trans_type | 
 Character string of 'percentize', 'normalize', 'scale' or 'none'. If 'percentize' (the default), replace each value with its empirical percentile. If 'normalize', i.e., min-max normalize, subtract the min and divide by the range (max minus min). If 'scale', subtract the mean and divide by the standard deviation. If 'none', no transformation is applied. More information on which transformation to choose is available here: https://cran.rstudio.com/package=heatmaply/vignettes/heatmaply.html#data-transformation-scaling-normalize-and-percentize  | 
clust_feats | 
 Logical. If TRUE, performs clustering on the features.  | 
feats | 
 Character vector of feature names to be displayed in the heatmap. If NULL, display features whose p-values are less than 'p_thres'.  | 
show_all_feats | 
 Logical. If TRUE, show all features regardless of 'p_thres'.  | 
p_thres | 
 Numeric value indicating the p-value threshold of feature importance. Features whose p-values (computed from the decision tree) are below this value are displayed on the heatmap.  | 
cont_legend | 
 Function determining the options for legend of continuous variables, defaults to FALSE. If TRUE, use 'guide_colorbar(barwidth = 10, barheight = 0.5, title = NULL)'. Any other ['guides()'](https://ggplot2.tidyverse.org/reference/guides.html) functions would also work.  | 
cate_legend | 
 Function determining the options for legend of categorical variables, defaults to FALSE. If TRUE, use 'guide_legend(title = NULL)'. Any other ['guides()'](https://ggplot2.tidyverse.org/reference/guides.html) functions would also work.  | 
cont_cols | 
 Function determining color scale for continuous variable, defaults to 'scale_fill_viridis_c(guide = cont_legend)'.  | 
cate_cols | 
 Function determining color scale for nominal categorical variable, defaults to 'scale_fill_viridis_d(begin = 0.3, end = 0.9)'.  | 
panel_space | 
 Spacing between facets relative to viewport, recommended to range from 0.001 to 0.01.  | 
target_space | 
 Numeric value indicating spacing between the target label and the rest of the features.  | 
target_pos | 
 Character string specifying the position of the target label on heatmap, can be 'top', 'bottom' or 'none'.  | 
Value
A ggplot2 grob object of the heatmap.
Examples
x <- compute_tree(penguins, target_lab = 'species')
draw_heat(x$dat, x$fit)
Draws the conditional decision tree.
Description
Draws the conditional decision tree output from partykit::ctree(), utilizing ggparty geoms: geom_edge, geom_edge_label, geom_node_label.
Usage
draw_tree(
  dat,
  fit,
  term_dat,
  layout,
  target_cols = NULL,
  title = NULL,
  tree_space_top = 0.05,
  tree_space_bottom = 0.05,
  print_eval = FALSE,
  metrics = NULL,
  x_eval = 0,
  y_eval = 0.9,
  task = c("classification", "regression"),
  par_node_vars = list(label.size = 0, label.padding = unit(0.15, "lines"), line_list =
    list(aes(label = splitvar)), line_gpar = list(list(size = 9)), ids = "inner"),
  terminal_vars = list(label.padding = unit(0.25, "lines"), size = 3, col = "white"),
  edge_vars = list(color = "grey70", size = 0.5),
  edge_text_vars = list(color = "grey30", size = 3, mapping = aes(label =
    paste(breaks_label, "*NA")))
)
Arguments
dat | 
 Dataframe with samples from original dataset ordered according to the clustering within each leaf node.  | 
fit | 
 party object, e.g., as output from partykit::ctree()  | 
term_dat | 
 Dataframe for terminal nodes, must include these columns: id, x, y and y_hat.  | 
layout | 
 Dataframe of layout of all nodes, must include these columns: id, x, y and y_hat.  | 
target_cols | 
 Character vectors representing the hex values of different level colors for targets, defaults to viridis option B.  | 
title | 
 Character string for plot title.  | 
tree_space_top | 
 Numeric value to pass to expand for top margin of tree.  | 
tree_space_bottom | 
 Numeric value to pass to expand for bottom margin of tree.  | 
print_eval | 
 Logical. If TRUE, print evaluation of the tree performance.  | 
metrics | 
 A set of metric functions to evaluate decision tree, defaults to common metrics for classification/regression problems. Can be defined with 'yardstick::metric_set'.  | 
x_eval | 
 Numeric value indicating x position to print performance statistics.  | 
y_eval | 
 Numeric value indicating y position to print performance statistics.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
par_node_vars | 
 Named list containing arguments to be passed to the 'geom_node_label()' call for non-terminal nodes.  | 
terminal_vars | 
 Named list containing arguments to be passed to the 'geom_node_label()' call for terminal nodes.  | 
edge_vars | 
 Named list containing arguments to be passed to the 'geom_edge()' call for tree edges.  | 
edge_text_vars | 
 Named list containing arguments to be passed to the 'geom_edge_label()' call for tree edge annotations.  | 
Value
A ggplot2 grob object of the decision tree.
Examples
x <- compute_tree(penguins, target_lab = 'species')
draw_tree(x$dat, x$fit, x$term_dat, x$layout)
Print decision tree performance according to different metrics.
Description
Print decision tree performance according to different metrics.
Usage
eval_tree(
  dat,
  target_lab = colnames(dat)[1],
  task = c("classification", "regression"),
  metrics = NULL
)
Arguments
dat | 
 Dataframe with truths (column 'target_lab') and estimates (column 'y_hat') of samples from original dataset.  | 
target_lab | 
 Name of the column in data that contains target/label information.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
metrics | 
 A set of metric functions to evaluate decision tree, defaults to common metrics for classification/regression problems. Can be defined with 'yardstick::metric_set'.  | 
Value
Character string of the decision tree evaluation.
Examples
eval_tree(compute_tree(penguins, target_lab = 'species')$dat)
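To make the expected input concrete, here is a small sketch with a toy data frame holding truths and estimates (column 'y_hat'). Accuracy is computed by hand in base R, and a custom metric set is built with 'yardstick::metric_set' only if 'yardstick' is installed. The column name `species` and the choice of metrics are assumptions for illustration.

```r
# Toy truths (target column) and estimates (y_hat).
dat <- data.frame(
  species = factor(c("a", "b", "a", "b"), levels = c("a", "b")),
  y_hat   = factor(c("a", "b", "b", "b"), levels = c("a", "b"))
)

# Accuracy by hand: share of samples whose estimate equals the truth.
acc <- mean(dat$species == dat$y_hat)
acc

# A custom metric set (only runs if 'yardstick' is installed).
if (requireNamespace("yardstick", quietly = TRUE)) {
  my_metrics <- yardstick::metric_set(yardstick::accuracy, yardstick::kap)
  print(my_metrics(dat, truth = species, estimate = y_hat))
}
```

A metric set built this way is the kind of object the 'metrics' argument describes.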
Galaxy dataset for regression.
Description
Fetched from PMLB.
Usage
galaxy
Format
An object of class data.frame with 323 rows and 5 columns.
Details
A data frame with 323 observations and 5 variables:
eastwest, northsouth, angle, radialposition
and target (velocity).
https://www.openml.org/d/690
Get color functions from character vectors
Description
Get color functions from character vectors
Usage
get_cols(my_cols, task, guide = FALSE)
Arguments
my_cols | 
 Character vectors of different hex values  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
guide | 
 A function used to create a guide or its name. Inherit from ['ggplot2::guides()'](https://ggplot2.tidyverse.org/reference/guides.html).  | 
Select the important features to be displayed.
Description
Select features with p-value (computed from decision tree) < 'p_thres' or all features if 'show_all_feats == TRUE'.
Usage
get_disp_feats(fit, feat_names, show_all_feats, p_thres)
Arguments
fit | 
 constparty object of the decision tree.  | 
feat_names | 
 Character vector specifying the feature names in dat.  | 
show_all_feats | 
 Logical. If TRUE, show all features regardless of 'p_thres'.  | 
p_thres | 
 Numeric value indicating the p-value threshold of feature importance. Features whose p-values (computed from the decision tree) are below this value are displayed on the heatmap.  | 
Value
A character vector of feature names.
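The selection rule in the description can be sketched on a plain named vector of p-values. The helper name `select_feats`, the toy p-values, and the decision to drop missing p-values are assumptions; the real function extracts p-values from the fitted tree.

```r
# Hypothetical helper (not part of treeheatr) mirroring the documented rule.
select_feats <- function(p_vals, show_all_feats = FALSE, p_thres = 0.05) {
  # Show everything when requested, regardless of the threshold.
  if (show_all_feats) return(names(p_vals))
  # Otherwise keep features with a non-missing p-value below the threshold.
  names(p_vals)[!is.na(p_vals) & p_vals < p_thres]
}

p_vals <- c(glucose = 0.001, bmi = 0.03, age = 0.2, insulin = NA)
select_feats(p_vals)                         # "glucose" "bmi"
select_feats(p_vals, show_all_feats = TRUE)  # all four feature names
```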
Get the fitted tree depending on the input 'x'.
Description
If 'x' is a data.frame object, computes a conditional inference tree with partykit::ctree(). If 'x' is a partynode object specifying the customized tree, fits 'x' on 'data_test'. If 'x' is a party (or constparty) object specifying the precomputed tree, simply coerces 'x' to have class constparty.
Usage
get_fit(x, ...)
## Default S3 method:
get_fit(x, ...)
## S3 method for class 'partynode'
get_fit(x, data_test, target_lab, ...)
## S3 method for class 'party'
get_fit(x, data_test, target_lab, task, ...)
## S3 method for class 'data.frame'
get_fit(x, data_test, target_lab, ...)
Arguments
x | 
 Dataframe or a 'party' or 'partynode' object representing a custom tree. If a dataframe is supplied, a conditional inference tree is computed. If a custom tree is supplied, it must follow the partykit syntax: https://cran.r-project.org/web/packages/partykit/vignettes/partykit.pdf  | 
... | 
 Further arguments passed to each method.  | 
data_test | 
 Tidy test dataset. Required if 'x' is a 'partynode' object. If NULL, heatmap displays (training) data 'x'.  | 
target_lab | 
 Name of the column in data that contains target/label information.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
Value
Fitted object as a list with prepped 'data_test' if available.
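The usage above relies on S3 dispatch: the class of 'x' decides which method runs. The sketch below shows the mechanism with a hypothetical generic, `describe_input`, deliberately named differently from get_fit(). The method bodies only return a description, and the behaviour of the default method is an assumption.

```r
# Hypothetical generic illustrating S3 dispatch on the class of 'x'.
describe_input <- function(x, ...) UseMethod("describe_input")

# Fallback for classes without a dedicated method (assumed behaviour).
describe_input.default <- function(x, ...) {
  stop("Unsupported input class: ", class(x)[1])
}
describe_input.data.frame <- function(x, ...) {
  "data frame: a tree would be computed"
}
describe_input.party <- function(x, ...) {
  "party: a precomputed tree would be reused"
}
describe_input.partynode <- function(x, ...) {
  "partynode: a custom tree would be fit on data_test"
}

describe_input(data.frame(a = 1))
describe_input(structure(list(), class = "partynode"))
```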
Draws and aligns decision tree and heatmap.
Description
Draws and aligns the decision tree and heatmap. treeheatr() is an alias of heat_tree().
Usage
heat_tree(
  x,
  target_lab = NULL,
  data_test = NULL,
  task = c("classification", "regression"),
  feat_types = NULL,
  label_map = NULL,
  target_cols = NULL,
  target_legend = FALSE,
  clust_samps = TRUE,
  clust_target = TRUE,
  custom_layout = NULL,
  show = "heat-tree",
  heat_rel_height = 0.2,
  lev_fac = 1.3,
  panel_space = 0.001,
  print_eval = (!is.null(data_test)),
  ...
)
treeheatr(
  x,
  target_lab = NULL,
  data_test = NULL,
  task = c("classification", "regression"),
  feat_types = NULL,
  label_map = NULL,
  target_cols = NULL,
  target_legend = FALSE,
  clust_samps = TRUE,
  clust_target = TRUE,
  custom_layout = NULL,
  show = "heat-tree",
  heat_rel_height = 0.2,
  lev_fac = 1.3,
  panel_space = 0.001,
  print_eval = (!is.null(data_test)),
  ...
)
Arguments
x | 
 Dataframe or a 'party' or 'partynode' object representing a custom tree. If a dataframe is supplied, a conditional inference tree is computed. If a custom tree is supplied, it must follow the partykit syntax: https://cran.r-project.org/web/packages/partykit/vignettes/partykit.pdf  | 
target_lab | 
 Name of the column in data that contains target/label information.  | 
data_test | 
 Tidy test dataset. Required if 'x' is a 'partynode' object. If NULL, heatmap displays (training) data 'x'.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
feat_types | 
 Named vector indicating the type of each feature, e.g., c(sex = 'factor', age = 'numeric'). If feature types are not supplied, they are inferred from the column types.  | 
label_map | 
 Named vector of the meaning of the target values, e.g., c('0' = 'Edible', '1' = 'Poisonous').  | 
target_cols | 
 Character vectors representing the hex values of different level colors for targets, defaults to viridis option B.  | 
target_legend | 
 Logical. If TRUE, target legend is drawn.  | 
clust_samps | 
 Logical. If TRUE, hierarchical clustering is performed among samples within each leaf node.  | 
clust_target | 
 Logical. If TRUE, target/label is included in hierarchical clustering of samples within each leaf node and might yield a more interpretable heatmap.  | 
custom_layout | 
 Dataframe with 3 columns: id, x and y for manually input custom layout.  | 
show | 
 Character string indicating which components of the decision tree-heatmap should be drawn. Can be 'heat-tree', 'heat-only' or 'tree-only'.  | 
heat_rel_height | 
 Relative height of heatmap compared to whole figure (with tree).  | 
lev_fac | 
 Relative weight of child node positions according to their levels, commonly ranging from 1 to 1.5. A value of 1 places the parent node exactly in the middle of its child nodes.  | 
panel_space | 
 Spacing between facets relative to viewport, recommended to range from 0.001 to 0.01.  | 
print_eval | 
 Logical. If TRUE, print evaluation of the tree performance. Defaults to TRUE when 'data_test' is supplied.  | 
... | 
 Further arguments passed to 'draw_tree()' and/or 'draw_heat()'.  | 
Value
A gtable/grob object of the decision tree (top) and heatmap (bottom).
Examples
heat_tree(penguins, target_lab = 'species')
heat_tree(
  x = galaxy[1:100, ],
  target_lab = 'target',
  task = 'regression',
  terminal_vars = NULL,
  tree_space_bottom = 0)
treeheatr(penguins, target_lab = 'species')
treeheatr(
  x = galaxy[1:100, ],
  target_lab = 'target',
  task = 'regression',
  terminal_vars = NULL,
  tree_space_bottom = 0)
Data of three different species of penguins.
Description
Collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network.
Usage
penguins
Format
A data frame with 344 observations and 7 variables:
species, island, culmen_length_mm, culmen_depth_mm,
flipper_length_mm, body_mass_g and sex.
Gorman KB, Williams TD, Fraser WR (2014). Ecological Sexual Dimorphism and Environmental Variability within a Community of Antarctic Penguins (Genus Pygoscelis). PLoS ONE 9(3): e90081. doi:10.1371/journal.pone.0090081
Details
Fetched from https://github.com/allisonhorst/penguins.
Creates smart node layout.
Description
Creates the node layout using a bottom-up approach (literally) and overwrites the ggparty-precomputed positions in plot_data.
Usage
position_nodes(plot_data, terminal_data, custom_layout, lev_fac, panel_space)
Arguments
plot_data | 
 Dataframe output of 'ggparty:::get_plot_data()'.  | 
terminal_data | 
 Dataframe of terminal node information including id and raw terminal node size.  | 
custom_layout | 
 Dataframe with 3 columns: id, x and y for manually input custom layout.  | 
lev_fac | 
 Relative weight of child node positions according to their levels, commonly ranging from 1 to 1.5. A value of 1 places the parent node exactly in the middle of its child nodes.  | 
panel_space | 
 Spacing between facets relative to viewport, recommended to range from 0.001 to 0.01.  | 
Value
Dataframe with 3 columns: id, x and y of smart layout combined with custom_layout.
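A bottom-up layout can be pictured as follows: terminal nodes get their horizontal positions first, and each parent is then placed relative to its children. The sketch below covers only the simplest case, which corresponds to 'lev_fac = 1' (parent exactly in the middle of its child nodes). The toy tree, the `parent` column, and the helper name `place_parents` are assumptions; the weighting used for other 'lev_fac' values is not reproduced here.

```r
# Toy tree: node 1 is the root, node 3 is an inner node,
# and nodes 2, 4 and 5 are terminal nodes.
nodes <- data.frame(
  id     = 1:5,
  parent = c(NA, 1, 1, 3, 3),
  x      = c(NA, 0.1, NA, 0.5, 0.9)  # only terminal nodes start with a position
)

# Hypothetical helper: place each parent midway between its children.
place_parents <- function(nodes) {
  # Children have larger ids than their parents in this toy tree, so walking
  # ids in decreasing order visits every child before its parent (bottom-up).
  for (i in rev(nodes$id)) {
    kids <- which(nodes$parent == i)
    if (length(kids) > 0) {
      nodes$x[nodes$id == i] <- mean(nodes$x[kids])
    }
  }
  nodes
}

layout <- place_parents(nodes)
layout[, c("id", "x")]
```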
Apply the predicted tree on either new test data or training data.
Description
Applies the fitted tree to either new test data or the training data and returns the predictions.
Usage
prediction_df(fit, task, clust_samps, clust_target)
Arguments
fit | 
 constparty object of the decision tree.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
clust_samps | 
 Logical. If TRUE, hierarchical clustering is performed among samples within each leaf node.  | 
clust_target | 
 Logical. If TRUE, target/label is included in hierarchical clustering of samples within each leaf node and might yield a more interpretable heatmap.  | 
Value
A dataframe of prediction values with scaled columns and clustered samples.
Prepare dataset
Description
Prepare dataset
Usage
prep_data(data, target_lab, task, feat_types = NULL)
Arguments
data | 
 Original data frame with features to be converted to correct types.  | 
target_lab | 
 Name of the column in data that contains target/label information.  | 
task | 
 Character string indicating the type of problem, either 'classification' (categorical outcome) or 'regression' (continuous outcome).  | 
feat_types | 
 Named vector indicating the type of each feature, e.g., c(sex = 'factor', age = 'numeric'). If feature types are not supplied, they are inferred from the column types.  | 
Value
List of dataframes (training + test) with proper feature types and target name.
Prepares the feature dataframes for tiles.
Description
If R does not recognize a categorical feature (input from user) as factor, converts to factor.
Usage
prepare_feats(dat, disp_feats, feat_types, clust_feats, trans_type)
Arguments
dat | 
 Dataframe with samples from original dataset ordered according to the clustering within each leaf node.  | 
disp_feats | 
 Character vector specifying features to be displayed.  | 
feat_types | 
 Named vector indicating the type of each feature, e.g., c(sex = 'factor', age = 'numeric'). If feature types are not supplied, they are inferred from the column types.  | 
clust_feats | 
 Logical. If TRUE, performs clustering on the features.  | 
trans_type | 
 Character string of 'percentize', 'normalize', 'scale' or 'none'. If 'percentize', replace each value with its empirical percentile. If 'normalize', i.e., min-max normalize, subtract the min and divide by the range (max minus min). If 'scale', subtract the mean and divide by the standard deviation. If 'none', no transformation is applied. More information on which transformation to choose is available here: https://cran.rstudio.com/package=heatmaply/vignettes/heatmaply.html#data-transformation-scaling-normalize-and-percentize  | 
Value
A list of two dataframes (continuous and categorical) from the original dataset.
Print a ggHeatTree object. Adapted from https://github.com/daattali/ggExtra/blob/master/R/ggMarginal.R#L207-L244.
Description
ggHeatTree objects are created by heat_tree(). This is the S3 print method
that draws the aligned decision tree and heatmap.
Usage
## S3 method for class 'ggHeatTree'
print(x, newpage = is.null(vp), vp = NULL, ...)
Arguments
x | 
 ggHeatTree (gtable grob) object.  | 
newpage | 
 Should a new page (i.e., an empty page) be drawn before the ggHeatTree is drawn?  | 
vp | 
 Viewport to draw the plot in.  | 
... | 
 ignored  | 
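For readers unfamiliar with print methods for grid objects, here is a minimal sketch of how such a method can use 'newpage' and 'vp'. It uses a hypothetical class, `demoHeatTree`, standing in for ggHeatTree, and a plain rectangle grob as the object; it is an illustration under those assumptions, not the package's method.

```r
# Hypothetical print method for a grob-based class (stand-in for ggHeatTree).
print.demoHeatTree <- function(x, newpage = is.null(vp), vp = NULL, ...) {
  # Start on an empty page unless the caller supplies a viewport.
  if (newpage) grid::grid.newpage()
  if (!is.null(vp)) {
    # Draw inside the supplied viewport, then return to the parent viewport.
    grid::pushViewport(vp)
    on.exit(grid::upViewport())
  }
  grid::grid.draw(x)
  invisible(x)
}

g <- grid::rectGrob(width = 0.5, height = 0.5)
class(g) <- c("demoHeatTree", class(g))

print(g)                                                   # new page, whole device
print(g, vp = grid::viewport(width = 0.5, height = 0.5))   # drawn into a viewport
```

Because 'newpage' defaults to `is.null(vp)`, passing a viewport draws onto the current page instead of clearing it first.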
Performs transformation on continuous variables.
Description
Performs transformation on continuous variables for the heatmap color scales.
Usage
scale_norm(x, trans_type = c("percentize", "normalize", "scale", "none"))
Arguments
x | 
 Numeric vector.  | 
trans_type | 
 Character string of 'percentize', 'normalize', 'scale' or 'none'. If 'percentize' (the default), replace each value with its empirical percentile. If 'normalize', i.e., min-max normalize, subtract the min and divide by the range (max minus min). If 'scale', subtract the mean and divide by the standard deviation. If 'none', no transformation is applied. More information on which transformation to choose is available here: https://cran.rstudio.com/package=heatmaply/vignettes/heatmaply.html#data-transformation-scaling-normalize-and-percentize  | 
Value
Numeric vector of the transformed 'x'.
Examples
scale_norm(1:5)
scale_norm(1:5, 'normalize')
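The four transformations can be sketched in base R as below. This is a hypothetical re-implementation for illustration (`scale_norm_sketch` is not the package source); in particular, computing 'percentize' with `stats::ecdf()` is an assumption based on the usual definition of an empirical percentile.

```r
# Hypothetical re-implementation for illustration; not the package source.
scale_norm_sketch <- function(x,
                              trans_type = c("percentize", "normalize", "scale", "none")) {
  # match.arg() picks the first choice ("percentize") when the argument is omitted.
  trans_type <- match.arg(trans_type)
  switch(trans_type,
    # Empirical percentile of each value.
    percentize = stats::ecdf(x)(x),
    # Min-max normalization: subtract the min, divide by the range.
    normalize  = (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE)),
    # Standardization: subtract the mean, divide by the standard deviation.
    scale      = (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE),
    none       = x
  )
}

scale_norm_sketch(1:5)               # 0.2 0.4 0.6 0.8 1.0
scale_norm_sketch(1:5, "normalize")  # 0.00 0.25 0.50 0.75 1.00
```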
Determines terminal node position.
Description
Creates the node layout using a bottom-up approach (literally) and overwrites the ggparty-precomputed positions in plot_data.
Usage
term_node_pos(plot_data, dat)
Arguments
plot_data | 
 Dataframe output of 'ggparty:::get_plot_data()'.  | 
dat | 
 Dataframe of prediction values with scaled columns and clustered samples.  | 
Value
Dataframe with terminal node information.
External test dataset. Medical information of Wuhan patients collected between 2020-01-10 and 2020-02-18.
Description
External test dataset. Medical information of Wuhan patients collected between 2020-01-10 and 2020-02-18.
Usage
test_covid
Format
A data frame with 110 observations and 7 XGBoost-selected variables:
PATIENT_ID, Lactate dehydrogenase,
High sensitivity C-reactive protein, (%)lymphocyte,
Admission time, Discharge time and outcome.
An interpretable mortality prediction model for COVID-19 patients. Yan et al. https://doi.org/10.1038/s42256-020-0180-7 https://github.com/HAIRLAB/Pre_Surv_COVID_19
Training dataset. Medical information of Wuhan patients collected between 2020-01-10 and 2020-02-18. Containing NAs.
Description
Training dataset. Medical information of Wuhan patients collected between 2020-01-10 and 2020-02-18. Containing NAs.
Usage
train_covid
Format
A data frame with 375 observations and 77 variables.
An interpretable mortality prediction model for COVID-19 patients. Yan et al. https://doi.org/10.1038/s42256-020-0180-7 https://github.com/HAIRLAB/Pre_Surv_COVID_19
Results of a chemical analysis of wines grown in a specific area of Italy.
Description
Three types of wine are represented in the 178 samples, with the results of 13 chemical analyses recorded for each sample.
Usage
wine
Format
A data frame with 178 observations and 14 variables:
Alcohol, Malic, Ash, Alcalinity,
Magnesium, Phenols, Flavanoids, Nonflavanoids,
Proanthocyanins, Color, Hue, Dilution, Proline
and Type (target).
Details
Import with data(wine, package = 'rattle'). Dependent variable: Type. https://rdrr.io/cran/rattle.data/man/wine.html http://archive.ics.uci.edu/ml/datasets/wine
Red variant of the Portuguese "Vinho Verde" wine.
Description
Fetched from PMLB. Physicochemical and quality of wine.
Usage
wine_quality_red
Format
A data frame with 1599 observations and 12 variables:
fixed.acidity, volatile.acidity,
citric.acid, residual.sugar, chlorides, free.sulfur.dioxide,
total.sulfur.dioxide, density, pH, sulphates,
alcohol and target (quality).
http://archive.ics.uci.edu/ml/datasets/Wine+Quality
P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.