SeuratExplorer

Lifecycle: stable

Why build this R package

There is still no good tool for visualizing Seurat analysis results. When a bioinformatics analyst hands results over to a user who has no R background, it is difficult for that user to retrieve the results and re-analyze them on their own. This R package is designed to help such users visualize and explore the analysis results. The only thing such users need to do is set up R and RStudio on their own computer and install SeuratExplorer; no other steps are required. Alternatively, they can upload the Seurat object file to a server on which Shiny Server and SeuratExplorer have been deployed.

Essentially, SeuratExplorer simply provides a visual interface to command-line tools from Seurat and other packages.

A live demo webserver

Upload an Rds or qs2 file no larger than 5 GB to the Demo Site. A mini demo dataset can be downloaded from GitHub.
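For analysts preparing a file for upload, here is a minimal sketch of writing an object to an Rds file and checking it against the size limit. It uses only base R. The object is a stand-in list rather than a real Seurat object, the file name is hypothetical, and reading "5GB" as 5 * 1024^3 bytes is an assumption.

```r
# Stand-in for a Seurat object; in practice this would be the analyst's own object.
obj <- list(counts = matrix(1:6, nrow = 2),
            note = "placeholder, not a real Seurat object")

# Write the object to an .rds file (hypothetical file name in a temp directory).
rds_path <- file.path(tempdir(), "demo_object.rds")
saveRDS(obj, rds_path)

# Compare the file size with the upload limit (assumed here to be 5 * 1024^3 bytes).
size_limit_bytes <- 5 * 1024^3
file_ok <- file.size(rds_path) <= size_limit_bytes

# Reading the file back should give an identical object.
restored <- readRDS(rds_path)
```

If `file_ok` is `FALSE`, the file would be too large for the demo site under this assumption.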

Installation

You can use the code below to install the latest version of SeuratExplorer:

# install dependency
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ComplexHeatmap")

# install SeuratExplorer
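if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")
# call install_github through its namespace so it works even when devtools is not attached
devtools::install_github("fentouxungui/SeuratExplorer")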

Run App

library(SeuratExplorer)
launchSeuratExplorer()

Introduction

Load data

Cell Metadata

Dimensional Reduction Plot

Example plots:

Feature Plot

Example plots:

Violin Plot

Example plots:

Dot Plot

Example plots:

Heatmap for cell level expression

Example plots:

Heatmap for group averaged expression

Example plots:

Ridge Plot

Example plots:

Plot Cell Percentage

Example plots:

Find Cluster Markers and DEGs Analysis

This analysis usually takes a while, so please wait patiently. Save the results before starting a new analysis, because the new results will overwrite the old ones. Results can be downloaded in CSV format.

Two ways are supported

You can modify some of the calculation parameters before running an analysis.

Screenshots:

Output description

FindMarkers(object, …)

A data.frame with a ranked list of putative markers as rows, and associated statistics as columns (p-values, ROC score, etc., depending on the test used (test.use)). The following columns are always present:

avg_logFC: log fold-change of the average expression between the two groups. Positive values indicate that the gene is more highly expressed in the first group

pct.1: The percentage of cells where the gene is detected in the first group

pct.2: The percentage of cells where the gene is detected in the second group

p_val_adj: Adjusted p-value, based on bonferroni correction using all genes in the dataset
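As an illustration of how these columns can be read once the CSV has been downloaded, here is a minimal base R sketch. The table, gene names, and the 0.05 threshold are made up for the example and are not package defaults. Recent Seurat releases name the fold-change column `avg_log2FC`, so the column name may need adjusting.

```r
# Toy marker table with the columns described above (all values are invented).
markers <- data.frame(
  p_val     = c(1e-10, 0.002, 0.2, 1e-6),
  avg_logFC = c(1.8, 0.3, -0.1, -1.2),
  pct.1     = c(0.90, 0.40, 0.20, 0.10),
  pct.2     = c(0.10, 0.35, 0.25, 0.70),
  p_val_adj = c(1e-6, 1, 1, 0.01),
  row.names = c("GeneA", "GeneB", "GeneC", "GeneD")
)

# Keep genes that pass an illustrative adjusted p-value threshold.
sig <- markers[markers$p_val_adj < 0.05, ]

# Positive fold change: higher in the first group; negative: higher in the second.
up_in_first   <- rownames(sig)[sig$avg_logFC > 0]
down_in_first <- rownames(sig)[sig$avg_logFC < 0]
```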

Top Expressed Features

Highly expressed genes can reflect the main functions of cells. There are two ways to find them. The first, Find Top Genes by Cell, can find genes that are highly expressed in only a few cells, while the second, Find Top Genes by Accumulated UMI counts, is biased towards genes that are highly expressed in most cells, because it ranks genes by accumulated UMI counts.

1. Find Top Genes by Cell

How?

Step 1: for each cell, find the genes with a high UMI percentage. For example, if a cell has 10000 UMIs and the UMI percentage cutoff is set to 0.01, then every gene with more than 10000 * 0.01 = 100 UMIs is considered highly expressed in this cell.

Step 2: summarize those genes for each cluster. First collect all genes that are highly expressed in at least one cell of the cluster (some genes may be highly expressed in only a few cells). Then, for each gene, count the cells in which it is highly expressed, and calculate the mean and median UMI percentage across those cells.
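The two steps can be sketched in base R on a toy count matrix. This is an illustration of the idea under stated assumptions, not the package's actual code: the matrix, cluster labels, cutoff of 0.25, and the helper `summarise_cluster()` are all hypothetical.

```r
# Toy UMI counts: genes in rows, cells in columns (each cell has 100 UMIs in total).
counts <- matrix(
  c(50, 30, 20,   # cell1
    10, 80, 10,   # cell2
     5,  5, 90,   # cell3
    40, 40, 20),  # cell4
  nrow = 3,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), paste0("cell", 1:4))
)
cluster <- c(cell1 = "c1", cell2 = "c1", cell3 = "c2", cell4 = "c2")
cutoff <- 0.25  # illustrative UMI percentage cutoff

# Step 1: per-cell UMI fraction of each gene, then flag genes above the cutoff.
pct  <- sweep(counts, 2, colSums(counts), "/")
high <- pct > cutoff

# Step 2: per cluster, count the cells where each gene is flagged and
# summarize the UMI fraction in those cells.
summarise_cluster <- function(cl) {
  cells <- names(cluster)[cluster == cl]
  rows <- lapply(rownames(counts), function(g) {
    hit <- high[g, cells]
    if (!any(hit)) return(NULL)  # gene is never highly expressed in this cluster
    vals <- pct[g, cells][hit]
    data.frame(cluster = cl, gene = g, n_cells = sum(hit),
               mean_pct = mean(vals), median_pct = median(vals))
  })
  do.call(rbind, Filter(Negate(is.null), rows))
}
top_by_cell <- do.call(rbind, lapply(unique(cluster), summarise_cluster))
```

In this toy data, GeneC is dropped from cluster c1 because it never exceeds the cutoff there, while GeneB is counted in both c1 cells.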

Output description

2. Find Top Genes by Mean UMI counts

For each cluster, calculate the top n highly expressed genes by mean UMI counts. If a cluster has fewer than 3 cells, it will be skipped.
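A minimal base R sketch of this ranking follows, assuming a toy count matrix with genes in rows and cells in columns. The data and the variable names `top_n` and `min_cells` are hypothetical; only the rule of skipping clusters with fewer than 3 cells comes from the description above.

```r
# Toy UMI counts: three genes, five cells (values are invented).
counts <- matrix(
  c(9, 1, 0,   # cell1
    7, 2, 1,   # cell2
    8, 0, 4,   # cell3
    1, 6, 2,   # cell4
    0, 5, 3),  # cell5
  nrow = 3,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), paste0("cell", 1:5))
)
cluster   <- c("c1", "c1", "c1", "c2", "c2")
top_n     <- 2  # number of genes to report per cluster
min_cells <- 3  # clusters with fewer cells are skipped

top_by_mean <- list()
for (cl in unique(cluster)) {
  in_cl <- cluster == cl
  if (sum(in_cl) < min_cells) next  # too few cells: skip this cluster
  mean_umi <- rowMeans(counts[, in_cl, drop = FALSE])
  top_by_mean[[cl]] <- head(sort(mean_umi, decreasing = TRUE), top_n)
}
```

Here cluster c2 has only two cells, so it produces no entry in `top_by_mean`.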

Output description

Feature Summary

Summarize features of interest by cluster, for example the percentage of positive cells and the mean/median expression level.
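The kind of summary described here can be sketched in base R as follows. The expression values are invented, `summarise_feature()` is a hypothetical helper, and treating a cell as "positive" when its expression is greater than zero is an assumption.

```r
# Toy expression matrix: two genes across six cells (values are invented).
expr <- matrix(
  c(0, 2, 4, 0, 0, 6,   # GeneA in cell1..cell6
    1, 1, 0, 3, 0, 0),  # GeneB in cell1..cell6
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB"), paste0("cell", 1:6))
)
cluster <- rep(c("c1", "c2"), each = 3)

# For one gene, split its values by cluster and compute the summary statistics.
summarise_feature <- function(gene) {
  values <- split(expr[gene, ], cluster)
  data.frame(
    gene         = gene,
    cluster      = names(values),
    pct_positive = sapply(values, function(v) mean(v > 0) * 100),  # assumed: positive means > 0
    mean_expr    = sapply(values, mean),
    median_expr  = sapply(values, median),
    row.names    = NULL
  )
}
feature_summary <- do.call(rbind, lapply(rownames(expr), summarise_feature))
```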

Output description

Feature Correlation Analysis

Calculates the correlation of gene pairs within the cells of a cluster. The Pearson and Spearman methods are supported.

Three ways are supported

Output description

If nothing is returned, the input genes have very low expression levels; genes with very low expression are removed before the analysis.
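To make the filtering and correlation steps concrete, here is a minimal base R sketch on toy data. The filter rule (keep genes detected in at least 2 cells of the cluster) is an assumption chosen for illustration and may differ from the rule the package applies.

```r
# Toy expression matrix: four genes across six cells (values are invented).
expr <- rbind(
  GeneA = c(1, 2, 3, 4, 5, 6),
  GeneB = c(2, 4, 6, 8, 10, 12),
  GeneC = c(6, 5, 4, 3, 2, 1),
  GeneD = c(0, 0, 0, 0, 0, 1)
)
colnames(expr) <- paste0("cell", 1:6)
cluster <- c("c1", "c1", "c1", "c1", "c2", "c2")

# Restrict the matrix to the cells of one cluster.
sub <- expr[, cluster == "c1", drop = FALSE]

# Drop very lowly expressed genes (assumed rule: detected in fewer than 2 cells).
keep <- rowSums(sub > 0) >= 2
sub  <- sub[keep, , drop = FALSE]

# cor() correlates columns, so transpose to get gene-by-gene matrices.
pearson  <- cor(t(sub), method = "pearson")
spearman <- cor(t(sub), method = "spearman")
```

GeneD is not detected in any cell of cluster c1, so it is removed and does not appear in either matrix. If every input gene were removed this way, there would be nothing left to correlate.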

Session Info

#> R version 4.4.1 (2024-06-14 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22631)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=Chinese (Simplified)_China.utf8 
#> [2] LC_CTYPE=Chinese (Simplified)_China.utf8   
#> [3] LC_MONETARY=Chinese (Simplified)_China.utf8
#> [4] LC_NUMERIC=C                               
#> [5] LC_TIME=Chinese (Simplified)_China.utf8    
#> 
#> time zone: Asia/Shanghai
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_4.4.1    fastmap_1.2.0     cli_3.6.3         tools_4.4.1      
#>  [5] htmltools_0.5.8.1 rstudioapi_0.16.0 yaml_2.3.8        rmarkdown_2.27   
#>  [9] highr_0.11        knitr_1.47        xfun_0.45         digest_0.6.36    
#> [13] rlang_1.1.4       evaluate_0.24.0