---
title: "Layout Customization"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Layout Customization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

The package offers a suite of `align_*` functions designed to give you precise
control over plot layout. These functions let you change the order of
observations along a layout axis and partition an axis into multiple panels.

Currently, there are four key `align_*` functions available for layout
customization:

- **`align_group`**: Group and align plots based on categorical factors.
- **`align_reorder`**: Reorder plots or split axes into different panels.
- **`align_kmeans`**: Arrange plots by k-means clustering results.
- **`align_dendro`**: Align plots according to hierarchical clustering or
  dendrograms.

```{r setup}
library(ggalign)
```

```{r setup_data}
set.seed(123)
small_mat <- matrix(rnorm(81), nrow = 9)
rownames(small_mat) <- paste0("row", seq_len(nrow(small_mat)))
colnames(small_mat) <- paste0("column", seq_len(ncol(small_mat)))
```

## `align_group`

The `align_group()` function allows you to group rows/columns into separate
panels. It doesn't add any plot area.

```{r align_group_top}
ggheatmap(small_mat) +
    hmanno("t") +
    align_group(sample(letters[1:4], ncol(small_mat), replace = TRUE))
```

By default, the facet strip text is removed. You can override this behavior
with `theme(strip.text = element_text())`. Since `align_group()` does not
create a new plot, the panel title can only be added to the heatmap plot.

```{r align_group_left}
ggheatmap(small_mat) +
    theme(strip.text = element_text()) +
    hmanno("l") +
    align_group(sample(letters[1:4], nrow(small_mat), replace = TRUE))
```

## `align_reorder`

The `align_reorder()` function reorders the rows/columns based on a summary
function. Like `align_group()`, it doesn't add a plot area.

Here, we reorder the rows based on the means.
```{r align_reorder}
ggheatmap(small_mat) +
    hmanno("l") +
    align_reorder(rowMeans)
```

By default, `align_reorder()` reorders the rows or columns in ascending order
of the summary function's output (from bottom to top for rows, or from left to
right for columns). To reverse this order, you can set `decreasing = TRUE`:

```{r align_reorder_decreasing}
ggheatmap(small_mat) +
    hmanno("l") +
    align_reorder(rowMeans, decreasing = TRUE)
```

Some `align_*` functions accept a `data` argument. This can be a matrix, a data
frame, or even a simple vector, which will be converted into a one-column
matrix. If the `data` argument is `NULL`, the function will use the layout
data, as demonstrated in the previous example. The `data` argument can also
accept a function (purrr-like lambda syntax is supported), which will be
applied to the layout data.

> Note: All `align_*` functions treat rows as observations, meaning that the
value returned by `NROW()` must match the number of observations along the
parallel layout axis. For heatmap column annotations, the heatmap matrix is
transposed before being used. If `data` is a function, it will be applied to
the transposed matrix.

Because of this transposition, you can use `rowMeans()` even for top and bottom
annotations to calculate the mean of each heatmap column.

```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_reorder(rowMeans)
```

## `align_kmeans`

The `align_kmeans()` function groups heatmap rows or columns based on k-means
clustering. Like the previous functions, it does not add a plot area.

```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_kmeans(3L)
```

It's important to note that `align_group()` and `align_kmeans()` cannot do
sub-grouping. This means they cannot be used if groups already exist.
```{r error=TRUE}
ggheatmap(small_mat) +
    hmanno("t") +
    align_group(sample(letters[1:4], ncol(small_mat), replace = TRUE)) +
    align_kmeans(3L)
```

```{r error=TRUE}
ggheatmap(small_mat) +
    hmanno("t") +
    align_kmeans(3L) +
    align_group(sample(letters[1:4], ncol(small_mat), replace = TRUE))
```

## `align_dendro`

The `align_dendro()` function adds a dendrogram to the layout and can also
reorder or split the layout based on hierarchical clustering. This is
particularly useful for working with heatmap plots.

```{r align_dendro}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro()
```

Hierarchical clustering is performed in two steps: calculate the distance
matrix, then apply clustering. You can use the `distance` and `method`
arguments to control the dendrogram building process.

There are two ways to specify the `distance` metric for clustering:

- specify `distance` as a pre-defined option. The valid values are the methods
  supported by the `dist()` function and the correlation coefficients
  `"pearson"`, `"spearman"` and `"kendall"`. The correlation distance is
  defined as `1 - cor(x, y, method = distance)`.
- a self-defined function which calculates the distance from a matrix. The
  function should take only one argument. Please note that for clustering on
  columns, the matrix will be transposed automatically.

```{r align_dendro_distance_pearson}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro(distance = "pearson") +
    patch_titles(top = "pre-defined distance method (1 - pearson)")
```

```{r align_dendro_distance_function}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro(distance = function(m) dist(m)) +
    patch_titles(top = "a function that calculates distance matrix")
```

The hierarchical clustering method can be specified with `method`. Possible
values are those supported by the `hclust()` function. You can also provide a
self-defined function, which accepts the distance object and returns an
`hclust` object.
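The two steps can be sketched in base R, outside of any layout. This is only an illustration under stated assumptions, not the package's internal code: the toy matrix `m` and the helper name `average_linkage` are made up here, and the correlation distance simply follows the `1 - cor()` definition given above.

```r
# Toy matrix (hypothetical data); rows are the observations to cluster.
set.seed(1)
m <- matrix(rnorm(40), nrow = 8, dimnames = list(paste0("row", 1:8), NULL))

# Step 1: distance matrix.
# A pre-defined `dist()` method works on rows directly.
d_euclidean <- dist(m, method = "euclidean")
# `cor()` correlates columns, so transpose to get row-by-row correlations,
# then turn `1 - correlation` into a `dist` object.
d_pearson <- as.dist(1 - cor(t(m), method = "pearson"))

# Step 2: clustering.
# A self-defined method takes the distance object and returns an `hclust`
# object (`average_linkage` is an illustrative name, not a package function).
average_linkage <- function(d) hclust(d, method = "average")
tree <- average_linkage(d_pearson)
```

In the layout itself, a built-in linkage name is passed to `method` directly: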
```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro(method = "ward.D2")
```

The dendrogram can also be used to cut the columns/rows into groups. You can
specify `k` or `h`, which work similarly to `cutree()`:

```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro(k = 3L)
```

In contrast to `align_group()`, `align_kmeans()`, and `align_reorder()`,
`align_dendro()` is capable of drawing plot components. It therefore has a
default `set_context` value of `TRUE`, meaning it will set the active context
of the annotation stack layout. In this way, we can add any ggplot elements to
this plot area.

```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro() +
    geom_point(aes(y = y))
```

The `align_dendro()` function creates default `node` data for the ggplot. See
`ggplot2 specification` in `?align_dendro` for details. Additionally, `edge`
data is added to the `ggplot2::geom_segment()` layer directly, used to draw the
dendrogram tree. One useful variable in both `node` and `edge` data is the
`branch` column, corresponding to the `cutree` result:

```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_dendro(aes(color = branch), k = 3) +
    geom_point(aes(color = branch, y = y))
```

`align_dendro()` can also perform clustering between groups, meaning it can be
used even if there are existing groups present in the layout:

```{r}
column_groups <- sample(letters[1:3], ncol(small_mat), replace = TRUE)
ggheatmap(small_mat) +
    hmanno("t") +
    align_group(column_groups) +
    align_dendro(aes(color = branch))
```

You can reorder the groups by setting `reorder_group = TRUE`.

```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_group(column_groups) +
    align_dendro(aes(color = branch), reorder_group = TRUE)
```

You can see the difference by drawing two dendrograms.
```{r}
ggheatmap(small_mat) +
    hmanno("t") +
    align_group(column_groups) +
    align_dendro(aes(color = branch), reorder_group = TRUE) +
    hmanno("b") +
    align_dendro(aes(color = branch), reorder_group = FALSE)
```

## Session information

```{r}
sessionInfo()
```