---
title: "Derived variables"
format: html
vignette: >
  %\VignetteIndexEntry{Derived variables}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r, warning=FALSE, message=FALSE}
library(chmsflow)
```

## Introduction

There are two types of derived variables in the CHMS surveys. Both are supported in chmsflow.

- **Variable mapping** -- mapping two or more variables into a single variable.
- **Computed variables** -- variables derived using mathematical equations or clinical logic.

chmsflow computes derived variables using functions referenced in `variable-details.csv`. The `recEnd` column uses the prefix `Func::` to name the R function, and the `variableStart` column uses the prefix `DerivedVar::` to list the input variables. For example, GFR (`gfr_ml_min`) has:

- `recEnd`: `Func::calculate_gfr`
- `variableStart`: `DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]`

This tells `rec_with_table()` to call `calculate_gfr()` with the four input variables.

## How to use derived variables

Since derived variables depend on their input variables, you must list both the derived variable and its inputs when calling `rec_with_table()`:

```{r, warning=FALSE, eval=FALSE}
cycle2_gfr <- recodeflow::rec_with_table(
  cycle2,
  variables = c("lab_bcre", "pgdcgt", "clc_sex", "clc_age", "gfr_ml_min"),
  variable_details = variable_details,
  log = TRUE
)
```

For variables that depend on medication status (e.g., hypertension, diabetes), use `recode_after_meds()` instead of `rec_with_table()`. See [Recoding medications](recoding_medications.html) and [Analysis walkthrough](analysis_walkthrough.html) for the full workflow.

## Creating a derived variable

To add a new derived variable to chmsflow, you need to create a harmonized set of input variables and an R function that computes the derived value. See [How to add variables](how_to_add_variables.html) for step-by-step instructions.
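
When drafting the metadata entries for a new derived variable, it can help to see how the `Func::` and `DerivedVar::` prefixes break down into a function name and a list of inputs. The following is a minimal base R sketch using the GFR values from the introduction. It is not the parsing code used by chmsflow or recodeflow; the helpers `parse_func_name()` and `parse_derived_inputs()` are hypothetical names introduced only for illustration.

```r
# Metadata values copied from the GFR example in the introduction.
rec_end <- "Func::calculate_gfr"
variable_start <- "DerivedVar::[lab_bcre, pgdcgt, clc_sex, clc_age]"

# Hypothetical helper: strip the "Func::" prefix to get the function name.
parse_func_name <- function(rec_end) {
  sub("^Func::", "", rec_end)
}

# Hypothetical helper: drop "DerivedVar::[" and the closing "]",
# then split the remaining text on commas and trim whitespace.
parse_derived_inputs <- function(variable_start) {
  inner <- sub("^DerivedVar::\\[(.*)\\]$", "\\1", variable_start)
  trimws(strsplit(inner, ",", fixed = TRUE)[[1]])
}

func_name <- parse_func_name(rec_end)       # "calculate_gfr"
inputs <- parse_derived_inputs(variable_start)

# Check that every input is also present in the `variables` argument
# you plan to pass to rec_with_table().
requested <- c("lab_bcre", "pgdcgt", "clc_sex", "clc_age", "gfr_ml_min")
missing_inputs <- setdiff(inputs, requested)
if (length(missing_inputs) > 0) {
  stop("Add these inputs to `variables`: ",
       paste(missing_inputs, collapse = ", "))
}
```

A check like the last step can catch a forgotten input before recoding, since a derived variable can only be computed when all of its inputs are listed alongside it.
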
For details on the metadata schema, see [Variable schema reference](variables_and_variable_details.html).

## Next steps

- **See derived variables in a full analysis** -- The [Analysis walkthrough](analysis_walkthrough.html) demonstrates deriving hypertension status from CHMS cycle 3 data.
- **Handle missing data** -- Learn how `tagged_na()` codes propagate through derived variable functions in [Missing data (tagged_na)](tagged_na_usage.html).
- **Understand the methodology** -- For the design rationale behind the rules-as-data approach, see [Methodology](methodology.html).