---
title: "Codelist diagnostics"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{CodelistDiagnostics}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE,
  fig.width = 7
)

library(CDMConnector)
# Make sure the Eunomia data folder exists and the synpuf-1k dataset is downloaded
if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") Sys.setenv("EUNOMIA_DATA_FOLDER" = tempdir())
if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER"))
if (!eunomiaIsAvailable(datasetName = "synpuf-1k")) downloadEunomiaData(datasetName = "synpuf-1k")
```

## Introduction

In this example we're going to summarise the use of the codes that define cohorts of individuals with an ankle sprain, an ankle fracture, a forearm fracture, or a hip fracture, plus a cohort built from several measurement concepts, using the Eunomia synthetic data.

We'll begin by creating our study cohorts.

```{r}
library(CDMConnector)
library(CohortConstructor)
library(CodelistGenerator)
library(PhenotypeR)
library(MeasurementDiagnostics)
library(dplyr)
library(ggplot2)

# Connect to the Eunomia synpuf-1k database and create the cdm reference
con <- DBI::dbConnect(duckdb::duckdb(), CDMConnector::eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(
  con = con,
  cdmName = "Eunomia Synpuf",
  cdmSchema = "main",
  writeSchema = "main",
  achillesSchema = "main"
)

# One cohort per named codelist
cdm$injuries <- conceptCohort(
  cdm = cdm,
  conceptSet = list(
    "ankle_sprain" = 81151,
    "ankle_fracture" = 4059173,
    "forearm_fracture" = 4278672,
    "hip_fracture" = 4230399,
    "measurements_cohort" = c(40660437L, 2617206L, 4034850L, 2617239L, 4098179L)
  ),
  name = "injuries"
)

cdm$injuries |>
  glimpse()
```

## Summarising code use

To get a good understanding of the codes we've used to define our cohorts, we can use the `codelistDiagnostics()` function.


```{r}
code_diag <- codelistDiagnostics(cdm$injuries)
```

Codelist diagnostics builds on the [CodelistGenerator](https://darwin-eu.github.io/CodelistGenerator/) and [MeasurementDiagnostics](https://ohdsi.github.io/MeasurementDiagnostics/) R packages to perform the following analyses:

- **Achilles code use:** Summarises the counts of our codes in the database, based on Achilles results, using [summariseAchillesCodeUse()](https://darwin-eu.github.io/CodelistGenerator/reference/summariseAchillesCodeUse.html).
- **Orphan code use:** Orphan codes are codes that we did not include in our cohort definition but that are related to the codes in our codelist. Although many will be false positives, this analysis may identify codes that we want to add to our cohort definitions. It uses [summariseOrphanCodes()](https://darwin-eu.github.io/CodelistGenerator/reference/summariseOrphanCodes.html).
- **Cohort code use:** Summarises the use of our codes among the individuals in our cohorts using [summariseCohortCodeUse()](https://darwin-eu.github.io/CodelistGenerator/reference/summariseCohortCodeUse.html).
- **Measurement diagnostics:** If any of the concepts in our codelist is a measurement, summarises its use with [summariseCohortMeasurementUse()](https://ohdsi.github.io/MeasurementDiagnostics/reference/summariseCohortMeasurementUse.html).

The output of the function is a summarised result table.

### Add codelist attribute

Cohorts that were created manually may not have their codelists recorded in the `cohort_codelist` attribute. The package has a utility function, `addCodelistAttribute()`, to record a codelist in a `cohort_table` object:

```{r}
# Codelist currently recorded for cohort 1
cohortCodelist(cdm$injuries, cohortId = 1)

# Record a new codelist for the "ankle_fracture" cohort
cdm$injuries <- cdm$injuries |>
  addCodelistAttribute(codelist = list(new_codelist = c(1L, 2L)), cohortName = "ankle_fracture")

# Check the recorded codelist again
cohortCodelist(cdm$injuries, cohortId = 1)
```

## Visualise the results

We will now use different functions to visualise the results generated by `codelistDiagnostics()`.
Note that these functions come from the [CodelistGenerator](https://darwin-eu.github.io/CodelistGenerator/) and [MeasurementDiagnostics](https://ohdsi.github.io/MeasurementDiagnostics/) R packages.

### Achilles code use

```{r}
tableAchillesCodeUse(code_diag)
```

### Orphan code use

```{r}
tableOrphanCodes(code_diag)
```

### Cohort code use

```{r}
tableCohortCodeUse(code_diag)
```

### Measurement timings

```{r}
tableMeasurementTimings(code_diag)
```

```{r}
plotMeasurementTimings(code_diag)
```

### Measurement value as concept

```{r}
tableMeasurementValueAsConcept(code_diag)
```

```{r}
plotMeasurementValueAsConcept(code_diag)
```

### Measurement value as numeric

```{r}
tableMeasurementValueAsNumeric(code_diag)
```

```{r}
plotMeasurementValueAsNumeric(code_diag)
```
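
## Orphan codes: a toy illustration

To make the orphan-code idea described under "Summarising code use" more concrete, the sketch below finds candidate orphan codes in a tiny invented dataset. It is only an illustration of the concept and is not how `summariseOrphanCodes()` is implemented. The concept ids, the `concept_relationship` and `record_counts` tables, and their column names are all hypothetical and do not come from the OMOP vocabulary or from Eunomia.

```{r}
library(dplyr)

# Hypothetical codelist: the concepts we used to define a cohort
codelist <- tibble(concept_id = c(1L, 2L))

# Hypothetical relationships between concepts (invented for illustration)
concept_relationship <- tibble(
  concept_id_1 = c(1L, 1L, 2L, 3L),
  concept_id_2 = c(2L, 10L, 11L, 12L)
)

# Hypothetical record counts per concept in the database
record_counts <- tibble(
  concept_id = c(1L, 2L, 10L, 11L, 12L),
  n_records = c(50L, 20L, 7L, 0L, 3L)
)

orphan_candidates <- concept_relationship |>
  # keep relationships that start from a concept in our codelist
  filter(concept_id_1 %in% codelist$concept_id) |>
  distinct(concept_id = concept_id_2) |>
  # drop concepts that are already in the codelist
  anti_join(codelist, by = "concept_id") |>
  # keep only concepts that are actually recorded in the data
  inner_join(record_counts, by = "concept_id") |>
  filter(n_records > 0)

orphan_candidates
```

In this toy data only concept 10 is returned: it is related to a concept in the codelist, it is not in the codelist itself, and it has records. Concept 11 is related but never recorded, and concept 12 is recorded but only related to a concept outside the codelist. Candidates like this still need review, since a related code is not necessarily one that belongs in the cohort definition.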