---
title: "Summarise database characteristics"
output:
  html_document:
    pandoc_args: ["--number-offset=1,0"]
    number_sections: yes
    toc: yes
vignette: >
  %\VignetteIndexEntry{database characteristics}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

In this vignette, we explore how the *OmopSketch* functions `databaseCharacteristics()` and `shinyCharacteristics()` can be used to characterise databases containing electronic health records mapped to the OMOP Common Data Model.

## Create a mock CDM

We begin by loading the necessary packages and creating a mock CDM using the `mockOmopSketch()` function:

```{r, warning=FALSE}
library(dplyr)
library(OmopSketch)

cdm <- mockOmopSketch()

cdm
```

# Summarise database characteristics

The `databaseCharacteristics()` function provides a comprehensive summary of the CDM, returning a [summarised result](https://darwin-eu-dev.github.io/omopgenerics/articles/summarised_result.html) that includes:

- A general database snapshot, using `summariseOmopSnapshot()`
- A characterisation of the population in observation, built using the [CohortConstructor](https://ohdsi.github.io/CohortConstructor/) and [CohortCharacteristics](https://darwin-eu.github.io/CohortCharacteristics/) packages
- A summary of the observation period table, using `summariseObservationPeriod()` and `summariseInObservation()`
- A data quality assessment of the clinical tables, using `summariseMissingData()`
- A characterisation of the clinical tables, using `summariseClinicalRecords()` and `summariseRecordCount()`

```{r, eval = FALSE}
result <- databaseCharacteristics(cdm)
```

## Selecting tables to characterise

By default, the following OMOP tables are included in the characterisation: *person*, *observation_period*, *visit_occurrence*, *condition_occurrence*, *drug_exposure*, *procedure_occurrence*, *device_exposure*, *measurement*,
*observation*, *death*. You can customise which tables to include in the analysis by specifying them with the `omopTableName` argument.

```{r, eval=FALSE}
result <- databaseCharacteristics(
  cdm,
  omopTableName = c("drug_exposure", "condition_occurrence")
)
```

## Stratifying by sex

To stratify the characterisation results by sex, set the `sex` argument to `TRUE`:

```{r, eval=FALSE}
result <- databaseCharacteristics(
  cdm,
  omopTableName = c("drug_exposure", "condition_occurrence"),
  sex = TRUE
)
```

## Stratifying by age group

To stratify the characterisation by age group, pass a list to the `ageGroup` argument in which each element gives the lower and upper bound of one group:

```{r, eval=FALSE}
result <- databaseCharacteristics(
  cdm,
  omopTableName = c("drug_exposure", "condition_occurrence"),
  ageGroup = list(c(0, 50), c(51, 100))
)
```

## Filtering by date range and time interval

Use the `dateRange` argument to limit the analysis to a specific period. Combine it with the `interval` argument to stratify results by time. Valid values for `interval` are `"overall"` (default), `"years"`, `"quarters"`, and `"months"`:

```{r, eval=FALSE}
result <- databaseCharacteristics(
  cdm,
  interval = "years",
  dateRange = as.Date(c("2010-01-01", "2018-12-31"))
)
```

## Including concept counts

To include concept counts in the characterisation, set `conceptIdCounts = TRUE`:

```{r, eval=FALSE}
result <- databaseCharacteristics(cdm, conceptIdCounts = TRUE)
```

# Visualise the characterisation results

To explore the characterisation results interactively, you can use the `shinyCharacteristics()` function. This function generates a Shiny application in the specified `directory`, allowing you to browse, filter, and visualise the results through an intuitive user interface.

```{r, eval=FALSE}
shinyCharacteristics(result = result, directory = "path/to/your/shiny")
```

## Customise the Shiny app

You can customise the title, logo, and theme of the Shiny app by setting the appropriate arguments:

- `title`: The title displayed at the top of the app
- `logo`: Path to a custom logo (must be in SVG format)
- `theme`: A custom Bootstrap theme (e.g., a call to `bslib::bs_theme()`, passed as a character string as in the example below)

```{r, eval=FALSE}
shinyCharacteristics(
  result = result,
  directory = "path/to/my/shiny",
  title = "Characterisation of my data",
  logo = "path/to/my/logo.svg",
  theme = "bslib::bs_theme(bootswatch = 'flatly')"
)
```

An example of the Shiny application generated by `shinyCharacteristics()` can be explored [here](https://dpa-pde-oxford.shinyapps.io/OmopSketchCharacterisation/), where the characterisation of several synthetic datasets is available.
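
The options described in this vignette can also be combined in a single call. The chunk below is a minimal end-to-end sketch rather than a recommended configuration: it reuses only the functions and arguments shown in this vignette, while the particular combination of arguments, the use of `tempdir()` as the output directory, and the app title are illustrative assumptions.

```{r, eval=FALSE}
library(OmopSketch)

# Mock CDM, created as at the start of this vignette
cdm <- mockOmopSketch()

# Characterise two clinical tables, stratified by sex, age group and year,
# restricted to an illustrative date range
result <- databaseCharacteristics(
  cdm,
  omopTableName = c("drug_exposure", "condition_occurrence"),
  sex = TRUE,
  ageGroup = list(c(0, 50), c(51, 100)),
  interval = "years",
  dateRange = as.Date(c("2010-01-01", "2018-12-31"))
)

# Generate the Shiny app in a temporary directory (illustrative choice;
# use a permanent path to keep the app between sessions)
shinyCharacteristics(
  result = result,
  directory = tempdir(),
  title = "Characterisation of the mock CDM"
)
```

Adding stratifications and a finer `interval` increases the size of the result, so it may be worth starting with a small set of tables, as here, before characterising the full CDM.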