---
title: "Interactive Analysis with the Shiny App"
author: "Abhijit Pakhare"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Interactive Analysis with the Shiny App}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## Overview

The `stepssurvey` package includes a Shiny web application that provides a guided, point-and-click interface for analysing WHO STEPS survey data. It calls the same functions available in the R API, so results are identical whether you use the app or write scripts.

The app is designed for public health professionals who may not write R code but need to produce standardised STEPS outputs.

### Launching the app

```{r launch}
library(stepssurvey)
run_app()
```

This opens a browser window with a six-tab workflow. Each tab corresponds to one stage of the pipeline: **Data & Settings**, **Clean**, **Design**, **Indicators**, **Visualise**, and **Reports**.

Work through the tabs from left to right. Each tab activates once the previous step has completed successfully.

## Tab 1: Data & Settings

This is where you load your STEPS data.

### Uploading a data file

Click **Browse** and select your file. Supported formats are CSV (`.csv`), Excel (`.xlsx`), Stata (`.dta`), and SPSS (`.sav`). SPSS is the most common format for STEPS data exported from Epi Info.

After upload, you will see:

- **Raw data preview** -- a scrollable table showing the first 100 rows so you can verify the file loaded correctly.
- **Detected columns** -- a table showing which STEPS variables were auto-detected and which column in your data they map to.

### Using demo data

If you do not have a STEPS data file, click **Use demo data** to load a realistic simulated dataset with 3,000 observations. This is useful for exploring the app and understanding the outputs before running your own data.

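
To inspect a file in R before uploading it, the four supported formats can be read with widely used reader functions. The sketch below is only an illustration: `read_survey_file()` is a hypothetical helper, not a `stepssurvey` function, and the app's own import code may differ. It assumes the `haven` and `readxl` packages are installed for the non-CSV formats.

```r
# Hypothetical helper, not part of stepssurvey: pick a reader from the
# file extension, mirroring the four formats the upload control accepts.
read_survey_file <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(
    ext,
    csv  = utils::read.csv(path, stringsAsFactors = FALSE),
    xlsx = as.data.frame(readxl::read_excel(path)),
    dta  = as.data.frame(haven::read_dta(path)),
    sav  = as.data.frame(haven::read_sav(path)),
    stop("Unsupported file extension: ", ext)
  )
}

# Usage with a throwaway CSV (made-up values, for illustration only)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(age = c(34, 51), sex = c(1, 2)), tmp, row.names = FALSE)
dat <- read_survey_file(tmp)

# Same idea as the app's raw data preview: at most the first 100 rows
preview <- head(dat, 100)
```

Because `switch()` evaluates only the matching branch, `haven` and `readxl` are needed only when a file of the corresponding type is actually read.
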
### Column overrides

The auto-detection system recognises standard WHO STEPS variable codes from both instrument v3.1 and v3.2. However, some country datasets use non-standard names. If a variable is marked "not found", use the dropdown menus under **Column overrides** to manually select the correct column from your data.

Key variables to check:

| Variable | What it is | Why it matters |
|----------|-----------|----------------|
| `age` | Respondent age in years | Required for age group classification |
| `sex` | Respondent sex | Required for sex-stratified tables |
| `weight_step1` | Sampling weight (Step 1) | Needed for correct weighted estimates |
| `psu` | Primary sampling unit / cluster | Needed for correct standard errors |
| `tobacco_current` | Current tobacco smoking | Core Step 1 indicator |
| `sbp1` | First SBP reading | Core Step 2 indicator |
| `fasting_glucose` | Fasting blood glucose | Core Step 3 indicator |

### Survey settings

Set the country name, survey year, and target age range (default 18--69 years) in the settings panel. The country name and survey year appear in the report headers; the age range determines which observations are included in the analysis.

## Tab 2: Clean

Click the **Clean** button to process the raw data. The cleaning step:

- Restricts the sample to the specified age range
- Creates WHO standard age groups (18--24, 25--34, ... 65+)
- Harmonises sex coding to Male/Female
- Derives all binary indicators (raised BP, overweight, etc.)
- Computes BMI, mean blood pressure, MET-minutes/week, and other continuous measures

After cleaning, you will see:

- **Row counts** -- how many observations were retained versus excluded
- **Clean data preview** -- a table of the processed data with all derived variables

If cleaning fails, check the R console (for example, the RStudio console) for messages about which variables were missing or had unexpected values.

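
To make the derivations listed for the cleaning step more concrete, here is a minimal base R sketch on toy data. It does not show the package's own code. The column names, the intermediate age-band break points, the choice to average the second and third blood pressure readings, and the 140/90 mmHg and BMI 25 cut-offs are all assumptions made for illustration, and the real cleaning step may apply further rules (for example, treatment status).

```r
# Toy data; every column name and value here is made up for illustration.
raw <- data.frame(
  age       = c(17, 22, 41, 68, 75),
  height_cm = c(170, 165, 180, 158, 172),
  weight_kg = c(60, 72, 95, 50, 80),
  sbp2      = c(118, 122, 150, 138, 160),
  sbp3      = c(116, 120, 146, 134, 158),
  dbp2      = c(76, 80, 95, 84, 92),
  dbp3      = c(74, 78, 93, 82, 90)
)

# 1. Restrict to the target age range (inclusive on both ends).
clean <- raw[raw$age >= 18 & raw$age <= 69, ]

# 2. Age bands; the intermediate break points are assumed.
clean$age_group <- cut(
  clean$age,
  breaks = c(18, 25, 35, 45, 55, 65, Inf),
  labels = c("18-24", "25-34", "35-44", "45-54", "55-64", "65+"),
  right  = FALSE
)

# 3. BMI in kg/m^2 and an overweight flag (assumed cut-off: BMI >= 25).
clean$bmi        <- clean$weight_kg / (clean$height_cm / 100)^2
clean$overweight <- clean$bmi >= 25

# 4. Mean blood pressure from readings 2 and 3 (an assumption), then a
#    raised-BP flag at 140/90 mmHg that ignores medication status.
clean$mean_sbp  <- (clean$sbp2 + clean$sbp3) / 2
clean$mean_dbp  <- (clean$dbp2 + clean$dbp3) / 2
clean$raised_bp <- clean$mean_sbp >= 140 | clean$mean_dbp >= 90

# Row counts: retained versus excluded
c(retained = nrow(clean), excluded = nrow(raw) - nrow(clean))
```
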
## Tab 3: Design

This tab creates the complex survey design object that ensures all estimates account for the sampling weights, stratification, and clustering used in STEPS surveys.

The app automatically detects the survey design complexity:

- **Full complex design** if weight, stratum, and PSU columns are present
- **Weights only** if no stratification or clustering information is found
- **Unweighted** if no weight column is present (not recommended)

You will see a summary confirming which design was created. Extreme weights are automatically trimmed to prevent individual observations from dominating the estimates.

## Tab 4: Indicators

Click **Compute indicators** to calculate all weighted prevalence estimates and means across six domains:

- Tobacco use (current, daily, smokeless, second-hand exposure)
- Alcohol consumption (current, heavy episodic, mean drinks)
- Diet and Physical Activity (fruit/vegetable intake, MET-minutes)
- Anthropometry (BMI, waist circumference, overweight/obesity)
- Blood Pressure (mean SBP/DBP, raised BP, treatment cascade)
- Biochemical (glucose, cholesterol, HDL, triglycerides)

After computation, you will see:

- **Value boxes** showing the number of indicators computed and key headline figures
- **Key indicators table** -- a summary table of all headline prevalences with 95% confidence intervals, downloadable as CSV

## Tab 5: Visualise

This tab displays pre-built charts. All plots use the WHO STEPS colour palette and `theme_steps()` styling.

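
For readers who want a feel for how a chart of indicators with confidence intervals is put together, the following is a rough `ggplot2` sketch. It does not show the app's plotting code: the data frame and its values are invented, the fill colour is arbitrary, and `theme_minimal()` stands in for `theme_steps()`, whose arguments are not covered here.

```r
library(ggplot2)

# Invented estimates, for illustration only (not survey results).
indicators <- data.frame(
  indicator = c("Current tobacco smoking", "Raised blood pressure", "Obesity"),
  estimate  = c(18.2, 27.5, 12.1),
  ci_low    = c(15.9, 24.8, 10.3),
  ci_high   = c(20.5, 30.2, 13.9)
)

# Order the bars by prevalence so the chart reads from low to high.
indicators$indicator <- reorder(indicators$indicator, indicators$estimate)

p <- ggplot(indicators, aes(x = indicator, y = estimate)) +
  geom_col(fill = "#0072B2") +
  geom_errorbar(aes(ymin = ci_low, ymax = ci_high), width = 0.2) +
  coord_flip() +  # turn the vertical bars into horizontal ones
  labs(x = NULL, y = "Prevalence (%)") +
  theme_minimal()

p
```
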
### Available charts

- **Overview** -- horizontal bar chart of all key indicators with 95% CIs, sorted by prevalence
- **By sex** -- grouped bar charts comparing Men vs Women for tobacco, blood pressure, obesity, and glucose
- **By age** -- line charts with confidence bands showing how blood pressure and obesity vary across age groups
- **Sex dashboard** -- a combined 2×2 panel of the most important sex-stratified indicators

If a particular chart is not available (because the underlying data was missing), a "not available" message is shown instead of an error.

## Tab 6: Reports

This is the final step. Click **Generate report** to produce both reports.

### What happens when you click Generate

The app runs a pipeline that:

1. Re-cleans the data and rebuilds the survey design
2. Computes all indicators and builds summary tables
3. Computes the full WHO table registry (~60 detailed 3-panel tables)
4. Generates plots
5. Renders both Word documents

A status message keeps you informed of progress. The full process typically takes 30--90 seconds depending on sample size.

### Two reports

| Report | Button | Contents |
|--------|--------|----------|
| **Summary Report** | Download Summary Report | Executive summary, narrative by domain, embedded charts, recommendations, methodology |
| **Detailed Data Book** | Download Data Book | Complete WHO 3-panel tables (Men \| Women \| Both Sexes) by age group, organised by STEPS step |

### Summary Report

The Summary Report includes:

- **Key Findings table** listing all headline indicators with 95% CIs
- **Domain sections** for Tobacco, Physical Activity, Obesity, Blood Pressure, Blood Glucose, and Cholesterol -- each with an inline prevalence estimate and an embedded chart
- **Recommendations** aligned with WHO best-buy interventions
- **Methodology** section describing the survey design and analysis

### Detailed Data Book

The data book follows the WHO STEPS standard layout with tables organised by survey step:

- **Step 1: Behavioural** -- Tobacco (smoking status, smokeless, quit attempts, second-hand exposure), Alcohol (drinking patterns, heavy episodic), Diet (fruit/vegetable, salt), Physical Activity (total minutes, domain breakdown, insufficient PA)
- **Step 1.5: Health History** -- BP/glucose/cholesterol screening and diagnosis cascades, CVD history, lifestyle advice
- **Step 2: Physical Measurements** -- Mean BP, raised BP, treatment cascade, BMI classifications, waist/hip measurements
- **Step 3: Biochemical** -- Fasting glucose (impaired + raised), treatment cascade, total cholesterol, HDL, triglycerides
- **Combined Risk Factors** -- summary of 0, 1--2, and 3--5 concurrent risk factors

Every table uses the 3-panel format:

```
Age Group |  Men           |  Women      |  Both Sexes
18-24     |  4.3% (2-6.5)  |  0% (0-0)   |  1.8% (0.8-2.8)
25-34     |  ...           |  ...        |  ...
...
18-69     |  ...           |  ...        |  ...
```

### Additional downloads

- **Download tables & plots** -- a ZIP file containing all pre-computed RDS files (indicators, tables, plots) for further analysis in R

## Tips and troubleshooting

### The app shows an error on upload

If you see `dim<-.haven_labelled() not supported`, this was a known issue with SPSS files that has been fixed. Make sure you are running the latest version of the package: run `devtools::load_all(".")` if you work from a source checkout, or reinstall from GitHub.

### Charts don't display or show "figure margins too large"

This happens when the RStudio Viewer pane is too narrow. Open the app in a full browser window instead: after `run_app()`, click "Open in Browser" in the Viewer toolbar.

### Physical Activity shows "not available"

The Summary Report derives the insufficient PA indicator from MET-minutes/week. If your dataset has raw GPAQ items (P1--P16) but no pre-computed `met_total`, the package now computes MET-minutes/week automatically using WHO MET multipliers (8 for vigorous, 4 for moderate/transport activities). Ensure your GPAQ columns are correctly mapped in Tab 1.

### Some tables are empty in the Data Book

Tables are only produced when the required variables are present in the data. For example, CVD risk tables require both clinical measurements and a risk scoring algorithm. Missing tables show "No data available for this section."

### Can I customise the report templates?

Yes. The R Markdown templates are located in `inst/rmd/` within the package source:

- `country_report.Rmd` -- Summary Report
- `data_book.Rmd` -- Detailed Data Book
- `fact_sheet.Rmd` -- Fact Sheet

Copy the template you want to modify, edit it, and pass the path via the config object.

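
As a rough illustration of the copy step, the sketch below locates a file shipped with an installed package and copies it into a working directory. `copy_package_file()` is a hypothetical helper, not a `stepssurvey` function. It relies on the standard R behaviour that the contents of `inst/` in the source tree sit at the top level of the installed package, so `inst/rmd/` becomes `rmd/`. How the copied path is then passed to the config object is not shown here.

```r
# Hypothetical helper, not part of stepssurvey: copy a file shipped with
# an installed package into dest_dir and return the new path.
copy_package_file <- function(..., package, dest_dir = getwd()) {
  src <- system.file(..., package = package)
  if (!nzchar(src)) {
    stop("File not found in installed package '", package, "'")
  }
  dest <- file.path(dest_dir, basename(src))
  # overwrite = FALSE protects an already edited copy from being clobbered
  if (!file.copy(src, dest, overwrite = FALSE)) {
    stop("Could not copy to ", dest, " (does the file already exist?)")
  }
  dest
}

# Usage: copy the Summary Report template next to your analysis files.
# Guarded so the example is skipped when the package is not installed.
if (requireNamespace("stepssurvey", quietly = TRUE)) {
  my_template <- copy_package_file(
    "rmd", "country_report.Rmd",
    package = "stepssurvey"
  )
  my_template
}
```
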
## Comparison: Shiny app vs R scripting

| Feature | Shiny App | R Scripts |
|---------|-----------|-----------|
| Audience | Non-coders, quick analysis | R users, reproducible workflows |
| Column mapping | Interactive dropdowns | `detect_steps_columns()` + manual overrides |
| Customisation | Limited to built-in options | Full control over every step |
| Batch processing | One dataset at a time | Loop over multiple datasets |
| Reproducibility | Manual (same clicks each time) | Full script = full reproducibility |
| Output | Word reports via browser | Word, HTML, PDF, or custom formats |

For routine country-level STEPS analysis, the Shiny app is the fastest path. For multi-country comparisons, methodological research, or custom indicators, use the R API documented in `vignette("stepssurvey-guide")`.
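
For readers moving from the app to scripts, it may help to see what a design-based estimate looks like in general terms. The sketch below uses the `survey` package directly on simulated data. It does not show the `stepssurvey` API, and the column names, the trimming rule (capping at the 99th percentile of the weights), and the simulated values are all assumptions made for illustration.

```r
library(survey)

# Simulated toy data: 2 strata, 20 PSUs of 10 respondents each.
set.seed(1)
n <- 200
toy <- data.frame(
  stratum   = rep(c("urban", "rural"), each = n / 2),
  psu       = rep(1:20, each = 10),
  wt        = runif(n, 0.5, 3),
  raised_bp = rbinom(n, 1, 0.3)
)

# Full complex design: clusters, strata, and sampling weights.
design <- svydesign(
  ids     = ~psu,
  strata  = ~stratum,
  weights = ~wt,
  data    = toy,
  nest    = TRUE
)

# One possible trimming rule (an assumption, not the app's rule):
# cap weights at their 99th percentile and redistribute the excess.
cap     <- unname(quantile(weights(design), 0.99))
trimmed <- trimWeights(design, upper = cap)

# Weighted prevalence of the binary indicator with a 95% CI.
est <- svymean(~raised_bp, trimmed)
ci  <- confint(est, level = 0.95)

round(100 * c(estimate = unname(coef(est)), lower = ci[1], upper = ci[2]), 1)
```

Dropping `strata` and using `ids = ~1` would give a weights-only design, the second of the three cases the Design tab distinguishes.
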