This vignette provides a walkthrough of the annotaR
package, demonstrating how to perform a multi-layered annotation of a
gene list.
First, we define a character vector of our genes of interest. For
this example, we use a small list of well-known cancer-related genes.
Then, we initialize the pipeline with the annotaR()
function.
# Load the package so that annotaR() and the add_*() functions are available
library(annotaR)
# A small list of well-known genes involved in cancer
genes_of_interest <- c(
  "TP53", "EGFR", "BRCA1", "BRCA2", "KRAS", "PIK3CA", "AKT1", "BRAF",
  "MYC", "ERBB2", "CDKN2A", "PTEN"
)
# Create the initial object
annotaR_obj <- annotaR(genes_of_interest)
print(annotaR_obj)
#> # A tibble: 12 × 1
#> gene
#> <chr>
#> 1 TP53
#> 2 EGFR
#> 3 BRCA1
#> 4 BRCA2
#> 5 KRAS
#> 6 PIK3CA
#> 7 AKT1
#> 8 BRAF
#> 9 MYC
#> 10 ERBB2
#> 11 CDKN2A
#> 12 PTEN
The power of annotaR comes from its pipe-friendly,
layered approach. We can chain functions together to progressively add
data. Here, we add Gene Ontology (GO) terms, disease associations, and
known drug links.
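# Note: The following steps query live APIs and may take a few moments.
# `%>%` is the magrittr pipe; attach magrittr (or dplyr) if it is not
# already available in your session.
library(magrittr)
full_annotation <- annotaR_obj %>%
  add_go_terms(sources = c("GO:BP")) %>%
  add_disease_links() %>%
  add_drug_links()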
# Take a look at the resulting tidy data frame
# Use `head()` to show just the first few rows
head(full_annotation)
#> # A tibble: 6 × 11
#> gene term_id term_name p_value source disease_name association_score
#> <chr> <chr> <chr> <dbl> <chr> <chr> <dbl>
#> 1 TP53 GO:0006915 apoptotic pro… 4.26e-10 GO:BP Li-Fraumeni… 0.876
#> 2 TP53 GO:0006915 apoptotic pro… 4.26e-10 GO:BP Li-Fraumeni… 0.876
#> 3 TP53 GO:0006915 apoptotic pro… 4.26e-10 GO:BP Li-Fraumeni… 0.876
#> 4 TP53 GO:0006915 apoptotic pro… 4.26e-10 GO:BP Li-Fraumeni… 0.876
#> 5 TP53 GO:0006915 apoptotic pro… 4.26e-10 GO:BP Li-Fraumeni… 0.876
#> 6 TP53 GO:0006915 apoptotic pro… 4.26e-10 GO:BP Li-Fraumeni… 0.876
#> # ℹ 4 more variables: drug_name <chr>, drug_type <chr>,
#> # mechanism_of_action <chr>, phase <int>
After annotating, we can easily visualize the results. The
plot_enrichment_dotplot() function creates a
publication-ready plot for the GO enrichment data.
# The plot function uses the data from the `add_go_terms` step
plot_enrichment_dotplot(
  full_annotation,
  n_terms = 20,
  title = "Top 20 Enriched GO Biological Processes"
)
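The head() output above shows the same GO term repeated on several rows,
presumably once per joined disease and drug record, so any plot of the
enrichment data first needs one row per term. The sketch below illustrates
that reduction and a basic dotplot using dplyr and ggplot2 directly. It is
not the implementation of plot_enrichment_dotplot(): the table
toy_annotation, its p-values and drug names, and the objects go_summary and
dotplot are all made up for illustration, and only the column names gene,
term_id, term_name, p_value, and drug_name are taken from the output shown
earlier.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for `full_annotation`; every value here is invented.
toy_annotation <- tibble::tibble(
  gene      = c("TP53", "TP53", "TP53", "EGFR", "EGFR", "KRAS"),
  term_id   = c("GO:0006915", "GO:0006915", "GO:0008283",
                "GO:0008283", "GO:0008283", "GO:0007265"),
  term_name = c("apoptotic process", "apoptotic process",
                "cell population proliferation",
                "cell population proliferation",
                "cell population proliferation",
                "Ras protein signal transduction"),
  p_value   = c(1e-8, 1e-8, 1e-5, 1e-5, 1e-5, 1e-3),
  drug_name = c("drug A", "drug B", "drug A", "drug C", "drug D", "drug E")
)

# Collapse to one row per GO term and count the distinct genes behind it.
go_summary <- toy_annotation %>%
  distinct(gene, term_id, term_name, p_value) %>%  # drop repeats from drug rows
  group_by(term_id, term_name, p_value) %>%
  summarise(n_genes = n(), .groups = "drop") %>%
  arrange(p_value)

# Dotplot: significance on the x axis, one term per row, dot size = gene count.
dotplot <- ggplot(
  go_summary,
  aes(
    x = -log10(p_value),
    y = reorder(term_name, -log10(p_value)),
    size = n_genes
  )
) +
  geom_point() +
  labs(
    x = "-log10(p-value)",
    y = NULL,
    size = "Genes",
    title = "Toy GO enrichment dotplot"
  )

print(dotplot)
```

The same distinct() step is a reasonable way to inspect just the GO layer of
a real full_annotation object without the repetition introduced by the
disease and drug columns.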