Sensitivity analysis quantifies how changes in an input affect the model output. In risk analysis, it can be used to identify which uncertain inputs are most important for a decision-relevant output. This helps to understand the model behaviour before running it with real data, to prioritise where to intervene to reduce risk, and to increase efforts to reduce uncertainty in the most sensitive inputs.
In mcmodule, you can perform sensitivity analysis using
different methods:
mcmodule, this method is currently
limited to one variate.Scenario analysis (or “what-if analysis”) is a non-statistical but still useful approach to assess input sensitivity. It does not produce sensitivity indices, but it will show how the output changes when you make specific changes to the inputs. This makes it a practical way to compare targeted input changes using real input data. It is especially helpful for complex models where the most important inputs depend on context (for example, when estimating disease-entry risk across countries, where certain pathways will be more important in some countries than in others). Scenario analysis can therefore help you identify the most sensitive inputs for a particular use case. This topic is covered in the working with what-if scenarios section.
mcmodule_corr() is a screening method.
It ranks inputs by their correlation with a chosen output across the
current model runs (an extension of mc2d::tornado() for
mcmodule objects and multiple variates). It can quickly
highlight influential inputs, but results must be interpreted with care.
A weak correlation does not necessarily mean an input is unimportant
(e.g., symmetric nonlinear effects can cancel out), and a strong
correlation may reflect confounding with other correlated inputs rather
than a uniquely attributable effect.
Unlike methods that require a sample design, this approach works
directly on an mcmodule built from model input data. Note
that mcmodule_corr() only computes correlations for
uncertain inputs (stochastic nodes built with func) and,
when variates_as_nsv = TRUE, also for multi-variate
inputs.
First, compute correlations (Spearman by default) and inspect the table.
# Calculate output (expected number of animals not detected)
imports_mcmodule_corr <- trial_totals(
imports_mcmodule,
mc_names = c("no_detect"),
trials_n = "animals_n",
subsets_n = "farms_n",
subsets_p = "h_prev",
mctable = imports_mctable
)
# Calculate correlations using Spearman's method (default)
corr_results <- mcmodule_corr(imports_mcmodule_corr, output = "no_detect_set_n", method = "spearman")
#>
#> === Correlation Analysis Summary ===
#>
#> Analysis Parameters:
#> - Analysis type: Global output
#> - Output node: no_detect_set_n
#> - Correlation method(s): spearman
#> - Missing value handling: all.obs
#>
#> Expression Information:
#> - Module: mcmodule
#> - Expression: imports
#> - Input nodes: w_prev, test_sensi, animals_n, h_prev
#> - Variates analyzed: 6
#>
#> Results Summary:
#> - Total correlations calculated: 24
#> - Top 4 most influential inputs (by absolute mean correlation):
#> 1. h_prev: 0.7286
#> 2. w_prev: 0.4552
#> 3. animals_n: 0.2987
#> 4. test_sensi: -0.1525
#>
#> Input Correlation Strength Distribution:
#> - Very strong: 2 (8.3%)
#> - Strong: 4 (16.7%)
#> - Moderate: 8 (33.3%)
#> - Weak: 4 (16.7%)
#> - Very weak/None: 6 (25.0%)
#>
#> Inputs by Correlation Strength:
#> - Very strong: h_prev
#> - Strong: w_prev, h_prev, test_sensi
#> - Moderate: animals_n, h_prev, w_prev
#> - Weak: w_prev, animals_n, test_sensi
#> - Very weak/None: test_sensi, animals_nIn this example, h_prev is the main driver of
no_detect_set_n (very strong positive association).
w_prev and animals_n also show positive
correlation, while test_sensi is close to zero or slightly
negative across variates.
# View correlation results
head(corr_results)
#> exp exp_n variate output input value method use
#> 1 imports 1 1 no_detect_set_n w_prev 0.684520665 spearman all.obs
#> 2 imports 1 1 no_detect_set_n test_sensi 0.015206219 spearman all.obs
#> 3 imports 1 1 no_detect_set_n animals_n 0.479727721 spearman all.obs
#> 4 imports 1 1 no_detect_set_n h_prev 0.516945462 spearman all.obs
#> 5 imports 1 2 no_detect_set_n w_prev 0.312842343 spearman all.obs
#> 6 imports 1 2 no_detect_set_n test_sensi 0.002070648 spearman all.obs
#> module pathogen origin strength
#> 1 mcmodule a nord Strong
#> 2 mcmodule a nord Very weak/None
#> 3 mcmodule a nord Moderate
#> 4 mcmodule a nord Moderate
#> 5 mcmodule a south Weak
#> 6 mcmodule a south Very weak/NoneThen visualise the ranking with a tornado plot. It can be customised
with standard ggplot2 layers.
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.5.3
# Plot tornado plot of correlations
mcmodule_tornado(imports_mcmodule_corr, output = "no_detect_set_n", method = "spearman", print_summary = FALSE)Each point in the plot represents one input variate. The coloured point marks the variate with the largest absolute correlation for that input, and that value is used to rank the inputs. The black point shows the median correlation across variates, which is often a useful summary when correlations vary across the model. The black horizontal line shows the full range of correlations across variates, so you can see whether an input is consistently influential or only influential in certain variates.
Note that farms_n and test_origin are not
shown in this analysis because they are fixed in the original model data
and therefore have zero correlation with the output. This is an
important limitation of correlation-based methods using real input data:
they can only identify influential inputs that vary in the original
model runs. To explore uncertainty for fixed inputs, you need a method
that evaluates a sample design.
An input sample space is the allowable range of values it can take (or distribution definition). A sample design is a specific set of sampled input combinations used to run the model. The sample space defines where values can come from, while the sample design defines which values are actually used in model runs.
A sample_design in mcmodule can be
either:
sensitivity::sensitivity
package (for example, a Morris design object) that has an X
element containing the sampling matrix.The sampling matrix must have one column per input node and one row per sample. Each row is one scenario in the mcnode uncertainty dimension. When you provide a sample_design, that uncertainty dimension is no longer sampled from the original model inputs; it is mapped to the values in your design. Currently, sample_design supports one variate.
You can pass the sampling matrix (or sensitivity object) directly to
eval_module(), or set it as a global default with
set_sample_design(). Using a global design ensures the same
sampling matrix is available across all model steps, which is convenient
when you run sensitivity analysis after building a complex model. You
can also pass the design to each function call, but that is more
verbose, and you must remove it later to run the model with the original
data inputs.
# Create a sample design as a data frame
X <- data.frame(
w_prev = runif(10000, 0.15, 0.6),
h_prev = runif(10000, 0.02, 0.7),
test_sensi = runif(10000, 0.8, 0.91),
animals_n = sample(5:10, 10000, replace = TRUE),
farms_n = sample(82:176, 10000, replace = TRUE),
test_origin = runif(10000, 0, 1)
)
dim(X)
#> [1] 10000 6
head(X)
#> w_prev h_prev test_sensi animals_n farms_n test_origin
#> 1 0.3340396 0.6117728 0.8477135 7 129 0.44193520
#> 2 0.5473578 0.2434982 0.8176573 6 163 0.03386385
#> 3 0.5732103 0.1054768 0.8905329 6 157 0.63026555
#> 4 0.1705004 0.2622306 0.8228900 10 133 0.15496632
#> 5 0.3876475 0.6528754 0.8313463 8 117 0.66649531
#> 6 0.5515886 0.6151214 0.8600830 8 109 0.82409581
# Set global sample design
set_sample_design(X)
#> sample_design set to X
# Retrieve current global sample design
current_design <- set_sample_design()
# Evaluate the module with the global sample design
imports_mcmodule_X <- eval_module(
exp = imports_exp
)
#> imports_exp evaluated
#> mcmodule created (expressions: imports_exp)
# Calculate output (expected number of animals not detected)
imports_mcmodule_X <- trial_totals(
imports_mcmodule_X,
mc_names = c("no_detect"),
trials_n = "animals_n",
subsets_n = "farms_n",
subsets_p = "h_prev"
)
mcmodule_tornado(imports_mcmodule_X, output = "no_detect_set_n", method = "spearman")
#>
#> === Correlation Analysis Summary ===
#>
#> Analysis Parameters:
#> - Analysis type: Global output
#> - Output node: no_detect_set_n
#> - Correlation method(s): spearman
#> - Missing value handling: all.obs
#>
#> Expression Information:
#> - Module: mcmodule
#> - Expression: imports_exp
#> - Input nodes: w_prev, test_origin, test_sensi, animals_n, farms_n, h_prev
#> - Variates analyzed: 1
#>
#> Results Summary:
#> - Total correlations calculated: 6
#> - Top 5 most influential inputs (by absolute mean correlation):
#> 1. h_prev: 0.6786
#> 2. test_origin: -0.4723
#> 3. w_prev: 0.3440
#> 4. animals_n: 0.2187
#> 5. farms_n: 0.2075
#>
#> Input Correlation Strength Distribution:
#> - Very strong: 0 (0.0%)
#> - Strong: 1 (16.7%)
#> - Moderate: 1 (16.7%)
#> - Weak: 3 (50.0%)
#> - Very weak/None: 1 (16.7%)
#>
#> Inputs by Correlation Strength:
#> - Strong: h_prev
#> - Moderate: test_origin
#> - Weak: w_prev, animals_n, farms_n
#> - Very weak/None: test_sensiThis sample design produces a different ranking from the first
tornado analysis, because we’re using the inputs sampling space rather
than the original model inputs. Here, h_prev is the main
positive driver of no_detect_set_n.
test_origin is the next most influential input and is
negatively associated with the output. w_prev,
animals_n and farms_n have smaller positive
effects, while test_sensi is weak or negligible within
these bounds.
mctableFor methods that use a sample design, mctable can be
used to define how inputs are sampled. In particular,
mctable$sample_space stores the bounds or distribution
parameters used to generate sensitivity samples. To read more about
mctable see the package
vignette.
imports_mctable[,c("mcnode","mc_func", "sample_space")]
#> mcnode mc_func sample_space
#> 1 h_prev runif min = 0.02, max = 0.7
#> 2 w_prev runif min = 0.15, max = 0.6
#> 3 test_sensi rpert min = 0.8, mode = 0.875, max = 0.91
#> 4 farms_n <NA> min = 5, max = 10
#> 5 animals_n rnorm min = 82, max = 176
#> 6 test_origin_unk <NA> <NA>
#> 7 test_origin <NA> min = 0, max = 1Two helper functions support common sampling designs from
mctable:
mctable_bounds() for bound (min and
max) extraction, for Morris and related approaches.mctable_sobol_matrices() for Sobol design
matrices.Currently, mcmodule only provides direct conversion from
mctable to Sobol sampling designs. However, you can use
mctable_bounds() to create many other sample designs using
different approaches. The only requirement is that your sample design
matrix contains columns for all the input nodes in your
mcmodule model.
Morris is another screening method. It identifies influential inputs at moderate computational cost and highlights potential nonlinearity or interactions.
Start by getting parameter bounds from mctable.
# Get bounds for Morris sampling design
b <- mctable_bounds(imports_mctable, transformation = FALSE)Build the Morris design with sensitivity::morris(). The
resulting object can be passed directly as sample_design.
The arguments factors, r, and
design control the the factors (inputs) sampled, the number
of repetitions of the design, and the step size between sampled values,
respectively. The arguments binf and bsup
provide the bounds for each input, and scale = TRUE scales
the inputs to [0,1] for better comparability of Morris indices across
inputs with different ranges.
To have stable Morris indices in large sample designs (with many inputs), you need to run the model with a large number of repetitions, which can be computationally intensive. Always check the sensitivity indices stability in diferent runs before interpreting them, increase r until rankings stabilise.
library(sensitivity)
#> Warning: package 'sensitivity' was built under R version 4.5.2
#> Registered S3 method overwritten by 'sensitivity':
#> method from
#> print.src dplyr
#>
#> Attaching package: 'sensitivity'
#> The following object is masked from 'package:dplyr':
#>
#> src
# Create Morris design
morris_sa <- sensitivity::morris(
model = NULL,
factors = b$factors,
r = 2000,
design = list(type = "oat", levels = 4, grid.jump = 2),
binf = b$binf,
bsup = b$bsup,
scale = TRUE
)
#> Warning in sensitivity::morris(model = NULL, factors = b$factors, r = 2000, :
#> keeping 1999 repetitions out of 2000
# Evaluate the module with that design.
imports_mcmodule_morris <- eval_module(
exp = imports_exp,
sample_design = morris_sa,
mctable = imports_mctable
)
#> imports_exp evaluated
#> mcmodule created (expressions: imports_exp)
# Calculate output (expected number of animals not detected)
imports_mcmodule_morris <- trial_totals(
imports_mcmodule_morris,
mc_names = c("no_detect"),
trials_n = "animals_n",
subsets_n = "farms_n",
subsets_p = "h_prev",
sample_design = morris_sa,
mctable = imports_mctable
)Finally, extract the output vector and estimate Morris indices. The
tell() function updates the Morris object with the model
output, and then you can print and plot the indices.
# Extract the output vector and estimate Morris indices.
y <- unmc(imports_mcmodule_morris$node_list$no_detect_set_n$mcnode)
# Aggregated output used for sensitivity analysis.
sensitivity::tell(morris_sa, y)
# Print Morris indices
morris_sa
#>
#> Call:
#> sensitivity::morris(model = NULL, factors = b$factors, r = 2000, design = list(type = "oat", levels = 4, grid.jump = 2), binf = b$binf, bsup = b$bsup, scale = TRUE)
#>
#> Model runs: 13993
#> mu mu.star sigma
#> h_prev 139.808877 139.808877 124.93784
#> w_prev 89.096448 89.096448 98.97369
#> test_sensi -7.335186 7.335186 10.79865
#> farms_n 50.032335 50.032335 61.29715
#> animals_n 55.904818 55.904818 69.23723
#> test_origin -111.438795 111.438795 110.73603
# Plot Morris indices (mu.star and sigma)
plot(morris_sa)In the Morris method, mu ( \(\mu\) ) is the mean of the elementary
effects across trajectories and mu.star ( \(\mu^*\) ) is the absolute value of
mu, which summarises each input’s overall importance
(larger values indicate a stronger average influence on the output,
regardless of direction). h_prev has the largest
mu.star, making it the most influential input overall,
followed by test_origin. w_prev is the next
most important, while farms_n and animals_n
have more moderate effects, and test_sensi is negligible
within the explored bounds (note that the uncertainty is small,
0.875-0.91). sigma ( \(\sigma\) ) is the standard deviation of the
elementary effects. Large sigma values indicate that the
effect of an input varies substantially across the sampled space, which
typically reflects nonlinearity and/or interactions with other
inputs.
Sobol analysis decomposes output variance into first-order and higher-order effects, providing a global sensitivity analysis. It is more computationally intensive than Morris, but it gives a more complete picture of input importance, including interactions. Sobol variance decomposition assumes independent inputs, if inputs are dependent, interpret Sobol indices with care or consider other approaches.
First, generate Sobol sampling matrices from
mctable.
mctable_sobol_matrices() creates Sobol design matrices
from your mctable definitions.
Evaluate the module, compute the target output as before.
imports_mcmodule_sobol <- eval_module(
exp = imports_exp,
data = NULL,
sample_design = X,
mctable = imports_mctable
)
#> imports_exp evaluated
#> mcmodule created (expressions: imports_exp)
imports_mcmodule_sobol <- trial_totals(
mcmodule = imports_mcmodule_sobol,
mc_names = c("no_detect"),
trials_n = "animals_n",
subsets_n = "farms_n",
subsets_p = "h_prev",
sample_design = X,
mctable = imports_mctable
)Then compute first-order (Si) and total-order
(Ti) Sobol indices and print and plot the results.
y <- unmc(imports_mcmodule_sobol$node_list$no_detect_set_n$mcnode)
# Compute Sobol indices
sobol_sa <- sensobol::sobol_indices(Y = y, N = N, params = colnames(X), order = "second", boot = TRUE, R = 1000)
sobol_sa
#>
#> First-order estimator: saltelli | Total-order estimator: jansen
#>
#> Total number of model runs: 230000
#>
#> Sum of first order indices: 0.7854101
#> original bias std.error low.ci high.ci
#> <num> <num> <num> <num> <num>
#> 1: 3.332841e-01 2.806764e-04 1.504341e-02 0.3035188778 0.3624879705
#> 2: 1.377370e-01 1.157053e-04 1.010515e-02 0.1178155716 0.1574270136
#> 3: 3.909247e-04 8.930736e-06 5.250725e-04 -0.0006471292 0.0014111171
#> 4: 4.305921e-02 -2.409696e-04 5.966784e-03 0.0316054992 0.0549948618
#> 5: 4.684154e-02 2.136317e-04 6.030061e-03 0.0348092104 0.0584466142
#> 6: 2.240974e-01 -1.665205e-04 1.286802e-02 0.1990430331 0.2494847426
#> 7: 4.907211e-01 5.028705e-04 1.220140e-02 0.4663039437 0.5141325541
#> 8: 2.289120e-01 1.739086e-04 6.448966e-03 0.2160983631 0.2413778446
#> 9: 7.520003e-04 1.397912e-06 3.137228e-05 0.0006891139 0.0008120909
#> 10: 7.602997e-02 7.511655e-05 2.189468e-03 0.0716635768 0.0802461326
#> 11: 9.089470e-02 2.177806e-05 2.819738e-03 0.0853463316 0.0963995033
#> 12: 3.490372e-01 1.310243e-04 9.342170e-03 0.3305958773 0.3672165122
#> 13: 4.242323e-02 -3.018290e-04 7.306883e-03 0.0284038338 0.0570462888
#> 14: 2.276340e-04 -1.191114e-05 4.058860e-04 -0.0005559768 0.0010350670
#> 15: 1.158978e-02 -1.230634e-04 4.097145e-03 0.0036825872 0.0197430993
#> 16: 1.559113e-02 -6.954566e-05 4.275403e-03 0.0072810375 0.0240403091
#> 17: 6.621890e-02 -3.633257e-04 8.412995e-03 0.0500930632 0.0830713962
#> 18: 1.986587e-05 7.138970e-07 2.632880e-04 -0.0004968831 0.0005351870
#> 19: 4.869421e-03 -1.850100e-05 2.874968e-03 -0.0007469115 0.0105227557
#> 20: 5.898327e-03 5.734532e-05 2.902983e-03 0.0001512393 0.0115307250
#> 21: 3.005418e-02 3.219238e-05 6.185212e-03 0.0178991922 0.0421447759
#> 22: -5.842551e-06 -2.641477e-07 1.364565e-04 -0.0002730283 0.0002618715
#> 23: 4.855472e-05 8.737696e-06 1.659606e-04 -0.0002854598 0.0003650939
#> 24: -1.751435e-05 -1.019615e-05 3.833124e-04 -0.0007585966 0.0007439603
#> 25: 2.403328e-03 2.678811e-05 1.661811e-03 -0.0008805490 0.0056336294
#> 26: 7.210102e-03 -1.289130e-04 3.450360e-03 0.0005764332 0.0141015960
#> 27: 8.240894e-03 -3.590833e-05 3.802195e-03 0.0008246376 0.0157289665
#> original bias std.error low.ci high.ci
#> <num> <num> <num> <num> <num>
#> sensitivity parameters
#> <char> <char>
#> 1: Si h_prev
#> 2: Si w_prev
#> 3: Si test_sensi
#> 4: Si farms_n
#> 5: Si animals_n
#> 6: Si test_origin
#> 7: Ti h_prev
#> 8: Ti w_prev
#> 9: Ti test_sensi
#> 10: Ti farms_n
#> 11: Ti animals_n
#> 12: Ti test_origin
#> 13: Sij h_prev.w_prev
#> 14: Sij h_prev.test_sensi
#> 15: Sij h_prev.farms_n
#> 16: Sij h_prev.animals_n
#> 17: Sij h_prev.test_origin
#> 18: Sij w_prev.test_sensi
#> 19: Sij w_prev.farms_n
#> 20: Sij w_prev.animals_n
#> 21: Sij w_prev.test_origin
#> 22: Sij test_sensi.farms_n
#> 23: Sij test_sensi.animals_n
#> 24: Sij test_sensi.test_origin
#> 25: Sij farms_n.animals_n
#> 26: Sij farms_n.test_origin
#> 27: Sij animals_n.test_origin
#> sensitivity parameters
#> <char> <char>
plot(sobol_sa)First-order indices (Si) measure each input’s main
effect, the share of output variance attributable to changing that
input alone, averaged over uncertainty in all other inputs. Larger
Si values indicate the model output is more sensitive to
that factor. Here, h_prev is the dominant driver of
variance, test_origin is the next most influential, and
w_prev also contributes materially; farms_n
and animals_n have smaller but non-zero effects, while
test_sensi is negligible. Negative estimates can occur due
to Monte Carlo error and should be interpreted as ~0 if the confidence
interval overlaps 0. If important factors have wide CIs or many
negatives appear, increase N (and/or bootstrap replicates) before
interpreting small effects.
Total-order indices (Ti) measure each input’s total
effect on output variance, including its main effect plus all
interaction effects with other inputs; Ti is therefore
always at least as large as Si for the same parameter.
Second-order indices (Sij) quantify interaction effects:
how much variance is explained only by varying two inputs
together beyond what their individual first-order effects
(Si) explain. Values near 0 indicate little interaction;
larger values indicate the combined effect of the pair is important
(e.g., h_prev.test_origin suggests a meaningful interaction
between h_prev and test_origin).
You can use other sensitivity methods with mcmodule as
long as you can build a table/matrix of input values to run the model
with, and collect the corresponding model output(s). For example,
sensitivity::fast99() can estimate first-order
variance-based indices with structured sampling and can be cheaper than
second-order Sobol, depending on the design and number of factors. If
you want a more familiar “risk-factor style” summary, regression/ANOVA
approaches can quantify how the output changes as inputs change. Tools
such as Shapley values (iml::shapley()) can help interpret
complex, nonlinear relationships, but they answer a slightly different
question and can be computationally heavy. The best method will depend
on your model, computational resources, and the specific questions you
want to answer about input importance.