---
title: "Introduction to genefindr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to genefindr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Overview

genefindr provides rapid gene characterization by querying eight public databases in a single function call. Instead of searching resources such as GeneCards, Open Targets, Human Protein Atlas, and PubMed one at a time, you get the combined results from one call.

## A simple example

The chunk below checks that the package loads and that the main functions are available:

```{r}
library(genefindr)

# Check that functions are available
is.function(findr)
is.function(findr_multi)
```

The main functions require internet access to query external databases. The examples in the remaining sections show typical usage but are not executed when the vignette is built.

## Basic usage

The main function is `findr()`. At minimum it requires a gene symbol:

```{r eval=FALSE}
library(genefindr)
findr("TP53")
```

For disease-specific context, add a `site` or `disease` argument:

```{r eval=FALSE}
findr("TP53", site = "breast")
findr("APOE", disease = "alzheimer")
```

## Supported cancer sites

The `site` argument accepts the following values:

`breast`, `prostate`, `lung`, `colon`, `ovarian`, `liver`, `brain`, `pancreatic`, `skin`, `blood`

## Multiple genes

Use `findr_multi()` to characterize several genes at once:

```{r eval=FALSE}
findr_multi(c("TP53", "BRCA1", "MYC"), site = "breast")
```

## Multi-site comparison

To compare a gene across multiple cancer types, pass a vector of sites:

```{r eval=FALSE}
findr("TP53", site = c("breast", "lung", "colon"))
```

## Exporting results

Set `output = "table"` to get the results as a data frame, which can then be written to CSV:

```{r eval=FALSE}
results <- findr_multi(c("TP53", "BRCA1"), site = "breast", output = "table")
write.csv(results, "candidates.csv")
```

## Non-coding RNAs

genefindr also supports non-coding RNA genes.
Protein-based fields are automatically skipped:

```{r eval=FALSE}
findr("MALAT1", site = "lung")
```

## Data sources

genefindr integrates data from eight databases:

| Database | Data provided |
|----------|--------------|
| MyGene.info | Gene name, type, summary |
| Open Targets | Disease association scores |
| Human Protein Atlas | Protein evidence, antibody availability |
| UniProt | Molecular weight, subcellular location, isoforms |
| GTEx | Normal tissue expression |
| cBioPortal/TCGA | Tumor mutation frequency |
| PubMed | Publication counts |
| ClinVar | Clinical variant counts |

## Notes

- All data sources are free and open access
- Results reflect the current state of each database at time of query
- Mutation frequency data is sourced from TCGA PanCancer Atlas 2018
- For genes with multiple isoforms, verify that your antibody targets the correct isoform
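
Because results reflect each database at the time of the query, it can be useful to record the query date alongside an exported table. The following is a minimal sketch in base R, assuming that `output = "table"` returns an ordinary data frame. The helper `stamp_results()`, the stand-in data frame, and its `gene` column are hypothetical illustrations and are not part of genefindr.

```{r}
# Hypothetical helper (not part of genefindr): add the query date to a
# results table so exported files record when the databases were queried.
stamp_results <- function(results, query_date = Sys.Date()) {
  stopifnot(is.data.frame(results))
  results$query_date <- rep(as.character(query_date), nrow(results))
  results
}

# Stand-in for the output of findr_multi(..., output = "table");
# the real column names may differ.
results <- data.frame(
  gene = c("TP53", "BRCA1"),
  stringsAsFactors = FALSE
)

stamped <- stamp_results(results)

# row.names = FALSE keeps R's row index out of the CSV file
out_file <- file.path(tempdir(), "candidates.csv")
write.csv(stamped, out_file, row.names = FALSE)
```

Storing the date as a column rather than in the file name keeps it attached to the rows if tables from several query sessions are later combined.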