---
title: "Getting Started with climatehealth"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with climatehealth}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## What is climatehealth?

The **climatehealth** package provides R functions for calculating
climate–health indicators following the statistical framework developed under
the [SOSCHI (Standards for Official Statistics on Climate–Health Interactions)](https://climate-health.officialstatistics.org)
project.

It covers indicators for six climate–health topic areas:

| Topic | Lead |
|---|---|
| Temperature-related health effects | ONS |
| Health effects of wildfires | ONS |
| Mental health (suicides and heat) | ONS |
| Water-borne diseases (diarrhoea) | AIMS |
| Health effects of air pollution | AIMS |
| Vector-borne diseases (malaria) | RIPS/AIMS |

Each topic has a dedicated analysis function that takes a data file path and
column mappings, fits the appropriate statistical models, and returns results
and optional plots.

---

## Installation

### From CRAN

```{r install-cran}
install.packages("climatehealth")
```

### From GitHub (latest development version)

```{r install-github}
install.packages("remotes")
remotes::install_github("onssoschi/climatehealth")
```

### Optional dependencies

The malaria and diarrhoea indicators depend on **INLA** and **terra**
respectively. Both are optional dependencies (INLA is not available on CRAN),
so install them separately if you need those indicators:

```{r install-optional}
climatehealth::install_INLA()
climatehealth::install_terra()
```

Once installed, load the package:

```{r load}
library(climatehealth)
```

---

## Package workflow

All six indicator functions follow the same pattern:

1. **Provide a path** to your input CSV.
2. **Map your column names** to the function's expected arguments (or use the
   defaults if your data already matches them).
3. **Choose optional extras**: covariates, meta-analysis, output saving.
4. **Inspect the returned list** for model results, plots, and summary tables.

```
your_data.csv  -->  indicator_do_analysis()  -->  results list
                                             -->  figures (optional)
                                             -->  CSV outputs (optional)
```

---

## Your first analysis: temperature and mortality

`temp_mortality_do_analysis()` estimates the association between ambient
temperature and mortality using a distributed lag non-linear model (DLNM).

```{r temp-mortality-basic}
res <- climatehealth::temp_mortality_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "deaths",
  population_col = "population",
  meta_analysis = FALSE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

The returned object `res` is a named list. Common fields include:

```{r temp-mortality-results}
res$data_raw          # the input data as loaded
res$analysis_results  # model coefficients and confidence intervals
res$meta_results      # pooled estimates (when meta_analysis = TRUE)
```

### Adding covariates

Pass extra column names via `independent_cols` (continuous exposures) and
`control_cols` (factors such as day-of-week or public holidays):

```{r temp-mortality-covariates}
res <- climatehealth::temp_mortality_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "deaths",
  population_col = "population",
  independent_cols = c("humidity", "ozone"),
  control_cols = c("dow", "holiday_flag"),
  meta_analysis = FALSE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

### Pooling across regions with meta-analysis

Set `meta_analysis = TRUE` to pool region-level estimates into a single
national estimate:

```{r temp-mortality-meta}
res <- climatehealth::temp_mortality_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "deaths",
  population_col = "population",
  country = "National",
  meta_analysis = TRUE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

---

## The six indicators

Temperature-related mortality is covered in the previous section. The
remaining five indicators are shown below.

### Air pollution

`air_pollution_do_analysis()` estimates attributable mortality burden from
PM2.5 exposure. By default it expects columns named `date`, `region`, `pm25`,
`deaths`, `population`, `humidity`, `precipitation`, `tmax`, and `wind_speed`.

```{r air-pollution}
res <- climatehealth::air_pollution_do_analysis(
  data_path = "path/to/your/data.csv",
  save_outputs = FALSE,
  run_descriptive = TRUE,
  run_power = TRUE
)
```

Compare against multiple PM2.5 reference thresholds in a single run:

```{r air-pollution-standards}
res <- climatehealth::air_pollution_do_analysis(
  data_path = "path/to/your/data.csv",
  reference_standards = list(
    list(value = 15, name = "WHO"),
    list(value = 25, name = "National")
  ),
  save_outputs = FALSE,
  run_power = TRUE
)
```

### Wildfires

`wildfire_do_analysis()` estimates the health impact of wildfire smoke
exposure.

```{r wildfire}
res <- climatehealth::wildfire_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  exposure_col = "pm25_fire",
  health_outcome_col = "respiratory_admissions",
  population_col = "population",
  meta_analysis = FALSE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

### Mental health (suicides and heat)

`suicides_heat_do_analysis()` models the association between temperature and
suicide counts.

```{r suicides}
res <- climatehealth::suicides_heat_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "suicides",
  population_col = "population",
  meta_analysis = FALSE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

### Water-borne diseases (diarrhoea)

`diarrhea_do_analysis()` estimates climate-driven diarrhoea burden.

```{r diarrhea}
res <- climatehealth::diarrhea_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "diarrhea_cases",
  population_col = "population",
  meta_analysis = FALSE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

### Vector-borne diseases (malaria)

`malaria_do_analysis()` requires the **INLA** package (see Installation above).

```{r malaria}
res <- climatehealth::malaria_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "malaria_cases",
  population_col = "population",
  meta_analysis = FALSE,
  save_fig = FALSE,
  save_csv = FALSE
)
```

---

## Descriptive statistics

Before running an indicator analysis, use `run_descriptive_stats()` to explore
your data: distributions, correlations, missing values, outliers, and seasonal
patterns.

```{r descriptive-basic}
df <- read.csv("path/to/your/data.csv")

desc <- climatehealth::run_descriptive_stats(
  data = df,
  output_path = "path/to/output/folder",
  aggregation_column = "region",
  dependent_col = "deaths",
  independent_cols = c("tmean", "humidity", "rainfall"),
  plot_corr_matrix = TRUE,
  plot_dist = TRUE,
  plot_na_counts = TRUE,
  plot_scatter = TRUE,
  plot_box = TRUE,
  create_base_dir = TRUE
)
```

Add units for cleaner plot labels, and enable time-series and rate
calculations:

```{r descriptive-advanced}
desc <- climatehealth::run_descriptive_stats(
  data = df,
  output_path = "path/to/output/folder",
  aggregation_column = "region",
  population_col = "population",
  dependent_col = "deaths",
  independent_cols = c("tmean", "humidity", "rainfall"),
  units = c(
    deaths = "count",
    tmean = "C",
    humidity = "%",
    rainfall = "mm"
  ),
  timeseries_col = "date",
  plot_corr_matrix = TRUE,
  plot_dist = TRUE,
  plot_ma = TRUE,
  ma_days = 30,
  plot_seasonal = TRUE,
  plot_regional = TRUE,
  plot_total = TRUE,
  detect_outliers = TRUE,
  calculate_rate = TRUE,
  create_base_dir = TRUE
)
```

The returned list records where the outputs, including all generated plots,
were written:

```{r descriptive-results}
desc$run_output_path      # folder where all outputs were saved
desc$region_output_paths  # per-region output sub-folders
```

---

## Saving outputs

Every indicator function accepts `save_fig` and `save_csv` arguments (or
`save_outputs` for air pollution). Set these to `TRUE` and supply
`output_folder_path` to write results to disk. The function creates a
timestamped sub-folder automatically.

```{r saving}
res <- climatehealth::temp_mortality_do_analysis(
  data_path = "path/to/your/data.csv",
  date_col = "date",
  region_col = "region",
  temperature_col = "tmean",
  health_outcome_col = "deaths",
  population_col = "population",
  meta_analysis = TRUE,
  save_fig = TRUE,
  save_csv = TRUE,
  output_folder_path = "path/to/output/folder"
)
```

---

## Error handling

The package signals structured conditions. You can catch them with `tryCatch()`
by naming a handler `climate_error`:

```{r error-handling}
result <- tryCatch(
  climatehealth::temp_mortality_do_analysis(
    data_path = "path/to/your/data.csv",
    date_col = "wrong_column_name",  # deliberately wrong, to trigger an error
    health_outcome_col = "deaths",
    population_col = "population"
  ),
  climate_error = function(e) {
    message("climatehealth error: ", conditionMessage(e))
    NULL
  }
)
```

Use `is_climate_error()` to test whether a caught condition came from this
package:

```{r is-climate-error}
# `e` is a caught condition object, e.g. the argument of a tryCatch() handler
climatehealth::is_climate_error(e)
```

---

## Next steps

- **Full example scripts** for each indicator are in
  `system.file("examples", package = "climatehealth")`.
- **Function reference**: see `?temp_mortality_do_analysis` and related help
  pages.
- **Methodology documents** for each SOSCHI topic are linked from the
  [SOSCHI project website](https://climate-health.officialstatistics.org).
- **Report issues** by email or via the
  [Contact Us page](https://climate-health.officialstatistics.org).
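
As a further next step, the class-based handler pattern from the error handling section can be tried without the package installed. The sketch below uses base R only. The constructor `new_climate_error()` and the helper `check_column()` are hypothetical names invented for illustration; they are not part of climatehealth and say nothing about how the package builds its conditions internally. The only detail taken from this vignette is the handler name `climate_error`.

```{r condition-sketch}
# Hypothetical constructor (an assumption, not climatehealth's code): builds a
# condition object whose class vector starts with "climate_error".
new_climate_error <- function(message, call = NULL) {
  structure(
    list(message = message, call = call),
    class = c("climate_error", "error", "condition")
  )
}

# Hypothetical helper: signals the custom condition when a column is missing.
check_column <- function(data, col) {
  if (!col %in% names(data)) {
    stop(new_climate_error(paste0("column not found: ", col)))
  }
  invisible(TRUE)
}

df <- data.frame(date = as.Date("2020-01-01"), deaths = 3L)

# A handler named after the class catches only conditions carrying that class.
caught <- tryCatch(
  check_column(df, "wrong_column_name"),
  climate_error = function(e) e
)

inherits(caught, "climate_error")  # TRUE
conditionMessage(caught)           # "column not found: wrong_column_name"
```

Because the class vector also contains `"error"`, a generic `error = function(e) ...` handler would catch the same condition, so a class-specific handler is only needed when package errors should be treated differently from other failures.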