---
title: "icdcomorbid_vignette"
author: "April Nguyen"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{icdcomorbid}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
suppressWarnings({
  suppressPackageStartupMessages({
    loadNamespace("knitr") # for opts_chunk only
    library("icdcomorbid")
    library("magrittr")
    })
  })
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Installation

Load the package to access its functions: 

```{r}
library(icdcomorbid)
```

# Adding Decimals to ICD Codes

The `add_decimal` function is used to reintroduce decimal points to ICD codes. This functionality is particularly useful for standardizing ICD code formats, especially in datasets where decimal points have been removed. This removal can lead to inconsistencies in code formatting and hinder the accurate alignment of codes with standardized comorbidity indices.

```{r add-decimal}
# Example ICD code dataframe
df <- data.frame(
  id = c(1, 2, 3),
  icd_1 = c("C509", "D633", "I210"),
  icd_2 = c("D509", "E788", "N183")
)

# Adding decimal to the ICD codes
formatted_df <- add_decimal(df, icd_cols = c("icd_1", "icd_2"))

# Displaying the updated dataframe
print(formatted_df)
```

# Reshaping Data from Long to Wide Format

The `long_to_wide` function reshapes data from a long format (multiple rows per patient) to a wide format (one row per patient with multiple columns for diagnoses). This function supports batch processing to handle large datasets efficiently. By specifying the batch_size parameter, you can control the number of rows processed in each batch. *This tranformation is required before applying `icd_to_comorbid` functions.*

```{r long-to-wide}
# Example long format data with multiple rows per patient
long_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd_1 = c("A01", "A02", "B01", "B02", "C01"),
  icd_2 = c("D01", "E02", "F01", "G02", "H01")
)

# Reshaping the data to wide format
wide_data <- long_to_wide(long_data, idx = "patient_id", icd_cols = c("icd_1", "icd_2"))

# Displaying the reshaped data
print(wide_data)
```

# Comorbidity Calculations with ICD Codes

The `icdcomorbid` R package includes functions to map ICD-9 and ICD-10 codes to standard comorbidity indices. Additionally, users can choose between the Charlson or Quan-Elixhauser comorbidity indices for their analysis. Depending on your data, you can choose the appropriate ICD version and comorbidity index for accurate comorbidity calculations. Batch processing is also supported by specifying the batch_size parameter. Your data should be formatted correctly (i.e., in wide format) before applying these functions.

## Choosing Comorbidity Indices

You can choose between the Charlson or Quan-Elixhauser comorbidity indices for both ICD-9 and ICD-10 codes. Note that different mappings are required for ICD-9 and ICD-10 codes.

* ICD-9 Codes: Use the icd9_to_comorbid function and select the appropriate index such as "charlson9" or "elixhauser9".

* ICD-10 Codes: Use the icd10_to_comorbid function and select the corresponding index such as "charlson10" or "elixhauser10".

### Example: Using ICD-9 Codes

If your dataset contains ICD-9 codes, you can use the `icd9_to_comorbid` function to calculate comorbidities.  

```{r icd9-example}
# Example ICD-9 data
icd9_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd9_code = c("4010", "2500", "4140", "4280", "4930")
)

# Map ICD-9 codes to comorbidities using Charlson index
mapping <- "charlson9"

comorbidities_icd9 <- icd9_to_comorbid(
  df = icd9_data,
  idx = "patient_id",
  icd_cols = "icd9_code",
  mapping = mapping,
  batch_size = 2
)

# Display the comorbidity results
head(comorbidities_icd9)
```

### Example: Using ICD-10 Codes

If your dataset contains ICD-10 codes, you can use the `icd10_to_comorbid` function to calculate comorbidities:

```{r icd10-example}
# Example data with ICD-10 codes
icd10_data <- data.frame(
  patient_id = c(1, 1, 2, 2, 3),
  icd_code = c("E11", "I10", "E11", "I50", "I21")
)
mapping <- "quan_elixhauser10"

# Calculate comorbidities for ICD-10 data using Elixhauser index
icd10_comorbidities <- icd10_to_comorbid(
  df = icd10_data,
  idx = "patient_id",
  icd_cols = "icd_code",
  mapping = mapping,
  batch_size = 2
)

# Display the comorbidity results
head(icd10_comorbidities)
```

### Example: Custom Mapping

```{r custom-mapping}
# Custom mapping
custom_mapping <- list(
  "Hypertension" = c("4010", "4011", "4019"),
  "Diabetes" = c("2500", "2501", "2502")
)

# Map ICD-9 codes to comorbidities using custom mapping
comorbidities_custom <- icd9_to_comorbid(
  df = icd9_data,
  idx = "patient_id",
  icd_cols = "icd9_code",
  mapping = custom_mapping,
  batch_size = 2
)

# Display the comorbidity results
head(comorbidities_custom)
```

# Identifying Episodes of Care

The `episode_of_care` function groups patients into episodes of care, which is useful for analyzing patient treatment over time. 

```{r episode-of-care}
# Example data with admit and discharge dates for DAD and NACRS
dad_data <- data.frame(
    patient_id = c(1, 1, 2),
    dad_admit = as.POSIXct(c("2023-01-01 10:00:00", "2023-02-01 09:00:00", 
    												 "2023-01-15 08:00:00"), tz="UTC"),
    dad_dis = as.POSIXct(c("2023-01-10 15:00:00", "2023-02-10 14:00:00", 
    											 "2023-01-20 12:00:00"), tz="UTC")
)

nacrs_data <- data.frame(
    patient_id = c(1, 2, 2),
    nacrs_admit = as.POSIXct(c("2023-01-15 10:00:00", "2023-01-25 09:00:00", 
    													 "2023-03-01 08:00:00"), tz="UTC"),
    nacrs_dis = as.POSIXct(c("2023-01-20 15:00:00", "2023-01-30 14:00:00", 
    												 "2023-03-05 12:00:00"), tz="UTC")
)

# Creating episodes of care
episodes <- episode_of_care(dad_data, nacrs_data, patient_id_col = "patient_id", 
														dad_visit_date_col = "dad_admit", 
														dad_exit_date_col = "dad_dis", 
														nacrs_visit_date_col = "nacrs_admit", 
														nacrs_exit_date_col = "nacrs_dis")
head(episodes)
```