--- title: "Step 2. Obtain the sequence ratios" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{a03_Summarise_sequence_ratios} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( warning = FALSE, message = FALSE, collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5, eval = Sys.getenv("$RUNNER_OS") != "macOS" ) ``` ```{r, include = FALSE} if (Sys.getenv("EUNOMIA_DATA_FOLDER") == "") Sys.setenv("EUNOMIA_DATA_FOLDER" = tempdir()) if (!dir.exists(Sys.getenv("EUNOMIA_DATA_FOLDER"))) dir.create(Sys.getenv("EUNOMIA_DATA_FOLDER")) if (!CDMConnector::eunomia_is_available()) CDMConnector::downloadEunomiaData() ``` # Introduction In this vignette we will explore the functionality and arguments of `summariseSequenceRatios()` function, which is used to generate the sequence ratios of the SSA. As this function uses the output of `generateSequenceCohortSet()` function (explained in detail in the vignette: **Step 1. Generate a sequence cohort**), we will pick up the explanation from where we left off in the previous vignette. ```{r message= FALSE, warning=FALSE, include=FALSE} # Load libraries library(CDMConnector) library(dplyr) library(DBI) library(CohortSymmetry) library(duckdb) library(DrugUtilisation) # Connect to the database db <- DBI::dbConnect(duckdb::duckdb(), dbdir = CDMConnector::eunomia_dir()) cdm <- cdm_from_con( con = db, cdm_schema = "main", write_schema = "main" ) cdm <- DrugUtilisation::generateIngredientCohortSet( cdm = cdm, name = "aspirin", ingredient = "aspirin") cdm <- DrugUtilisation::generateIngredientCohortSet( cdm = cdm, name = "acetaminophen", ingredient = "acetaminophen") ``` Recall that in the previous vignette: Step 1. Generate a sequence cohort, we've generated `cdm$aspirin` and `cdm$acetaminophen` before and using them we could generate `cdm$intersect` like so: ```{r message= FALSE, warning=FALSE} # Generate a sequence cohort cdm <- generateSequenceCohortSet( cdm = cdm, indexTable = "aspirin", markerTable = "acetaminophen", name = "intersect", combinationWindow = c(0,Inf)) ``` # Obtain sequence ratios One can obtain the crude and adjusted sequence ratios (with its corresponding confidence intervals) using `summariseSequenceRatios()` function: ```{r message = FALSE, warning = FALSE} summariseSequenceRatios( cohort = cdm$intersect ) |> dplyr::glimpse() ``` The obtained output has a summarised result format. In the later vignette (**Step 3. Visualise results**) we will explore how to visualise the results in a more intuitive way. ## Modify the cohort based on `cohort_definition_id` This parameter is used to subset the cohort table inputted to the `summariseSequenceRatios()`. Imagine the user only wants to include `cohort_definition_id` $= 1$ from `cdm$intersect` in the `summariseSequenceRatios()`, then one could do the following: ```{r message= FALSE, warning=FALSE} summariseSequenceRatios(cohort = cdm$intersect, cohortId = 1) |> dplyr::glimpse() ``` Of course in this case this does nothing because every entry in `cdm$intersect` has `cohort_definition_id` $= 1$. ## Modify `confidenceInterval` By default, the `summariseSequenceRatios()` function will use 95% (two-sided) confidence interval. If another confidence interval is desired, for example 99% confidence interval, one can use the `confidenceInterval` argument: ```{r message = FALSE, warning = FALSE} summariseSequenceRatios( cohort = cdm$intersect, confidenceInterval = 99) |> dplyr::glimpse() ``` ## Modify `movingAverageRestriction` The idea of moving average restriction is necessary only for the null sequence ratio calculation, please refer to Lai et al. (2017) for more details on this parameter (parameter d when calculating P in page 578). Following Tsiropoulos et al. (2009), by default, the argument `movingAverageRestriction` is set to be $548$ ($18$ months). Should one wish to modify this, one could do something like: ```{r message = FALSE, warning = FALSE} summariseSequenceRatios( cohort = cdm$intersect, movingAverageRestriction = 600) |> dplyr::glimpse() ``` ## Modify `minCellCount` By default, the minimum number of events to reported is 5, below which results will be obscured. If 0, all results will be reported and the user could do this via: ```{r message= FALSE, warning=FALSE} summariseSequenceRatios(cohort = cdm$intersect, minCellCount = 0) |> dplyr::glimpse() ``` ```{r message= FALSE, warning=FALSE, eval=FALSE} CDMConnector::cdmDisconnect(cdm = cdm) ```