Title: | Targets for JAGS Pipelines |
Description: | Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the 'jagstargets' R package leverages 'targets' and 'R2jags' to ease this burden. 'jagstargets' makes it super easy to set up scalable JAGS pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than 'targets' alone. For the underlying methodology, please refer to the documentation of 'targets' <doi:10.21105/joss.02959> and 'JAGS' (Plummer 2003) https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf. |
Version: | 1.2.2 |
License: | MIT + file LICENSE |
URL: | https://docs.ropensci.org/jagstargets/, https://github.com/ropensci/jagstargets |
BugReports: | https://github.com/ropensci/jagstargets/issues |
Depends: | R (≥ 3.5.0) |
Imports: | coda (≥ 0.19.4), fst (≥ 0.9.2), posterior (≥ 1.0.1), purrr (≥ 0.3.4), qs2, R2jags (≥ 0.6.1), rjags (≥ 4.10), rlang (≥ 0.4.10), secretbase (≥ 0.4.0), stats, targets (≥ 1.6.0), tarchetypes (≥ 0.8.0), tibble (≥ 3.0.1), tidyselect, tools, utils, withr (≥ 2.1.2) |
Suggests: | dplyr (≥ 1.0.2), fs (≥ 1.5.0), knitr (≥ 1.30), qs (≥ 0.23.2), R.utils (≥ 2.10.1), rmarkdown (≥ 2.3), testthat (≥ 3.0.0), tidyr (≥ 1.1.2), visNetwork (≥ 2.0.9) |
SystemRequirements: | JAGS 4.x.y (https://mcmc-jags.sourceforge.net) |
Encoding: | UTF-8 |
Language: | en-US |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2024-11-18 14:11:44 UTC; C240390 |
Author: | William Michael Landau |
Maintainer: | William Michael Landau <will.landau.oss@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2024-11-18 14:50:02 UTC |
jagstargets: Targets for JAGS Workflows
Description
Bayesian data analysis usually incurs long runtimes
and cumbersome custom code. A pipeline toolkit tailored to
Bayesian statisticians, the jagstargets
R package leverages
targets
and R2jags
to ease this burden.
jagstargets
makes it super easy to set up scalable
JAGS pipelines that automatically parallelize the computation
and skip expensive steps when the results are already up to date.
Minimal custom code is required, and there is no need to manually
configure branching, so usage is much easier than targets
alone.
See Also
https://docs.ropensci.org/jagstargets/, tar_jags()
One MCMC per model with multiple outputs
Description
Targets to run a JAGS model once with MCMC and save multiple outputs.
Usage
tar_jags(
name,
jags_files,
parameters.to.save,
data = list(),
summaries = list(),
summary_args = list(),
n.cluster = 1,
n.chains = 3,
n.iter = 2000,
n.burnin = as.integer(n.iter/2),
n.thin = 1,
jags.module = c("glm", "dic"),
inits = NULL,
RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"),
jags.seed = 1,
stdout = NULL,
stderr = NULL,
progress.bar = "text",
refresh = 0,
draws = TRUE,
summary = TRUE,
dic = TRUE,
tidy_eval = targets::tar_option_get("tidy_eval"),
packages = targets::tar_option_get("packages"),
library = targets::tar_option_get("library"),
format = "qs",
format_df = "fst_tbl",
repository = targets::tar_option_get("repository"),
error = targets::tar_option_get("error"),
memory = targets::tar_option_get("memory"),
garbage_collection = targets::tar_option_get("garbage_collection"),
deployment = targets::tar_option_get("deployment"),
priority = targets::tar_option_get("priority"),
resources = targets::tar_option_get("resources"),
storage = targets::tar_option_get("storage"),
retrieval = targets::tar_option_get("retrieval"),
cue = targets::tar_option_get("cue"),
description = targets::tar_option_get("description")
)
Arguments
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
Seeds to apply to JAGS, passed to
|
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
draws |
Logical, whether to create a target for posterior draws.
Saves draws as a compressed |
summary |
Logical, whether to create a target to store a small data frame of posterior summary statistics and convergence diagnostics. |
dic |
Logical, whether to create a target with deviance information criterion (DIC) results. |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the non-data-frame
targets such as the JAGS data and any JAGS fit objects.
Please choose an all-purpose
format such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Details
The MCMC targets use R2jags::jags() if n.cluster is 1 and R2jags::jags.parallel() otherwise.
Most arguments to tar_jags() are forwarded to these functions.
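The dispatch rule in the Details above, together with the default burn-in from the Usage signature, can be sketched in base R. This is an illustration only, not the package's internal code: the helper name sketch_mcmc_settings() is hypothetical, and nothing here calls R2jags or JAGS.

```r
# Hypothetical helper (not part of jagstargets): report which R2jags
# function would run the MCMC and the default burn-in for a given n.iter.
sketch_mcmc_settings <- function(n.cluster = 1, n.iter = 2000,
                                 n.burnin = as.integer(n.iter / 2)) {
  # n.cluster = 1 means a single process; anything else means parallel chains.
  backend <- if (n.cluster == 1) "R2jags::jags" else "R2jags::jags.parallel"
  list(backend = backend, n.iter = n.iter, n.burnin = n.burnin)
}

serial <- sketch_mcmc_settings()
parallel <- sketch_mcmc_settings(n.cluster = 3, n.iter = 4000)
str(serial)
str(parallel)
```

Because n.burnin defaults to half of n.iter, raising n.iter without setting n.burnin also lengthens the warmup.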
Value
tar_jags()
returns a list of target objects.
See the "Target objects" section for
background.
The target names use the name
argument as a prefix, and the individual
elements of jags_files
appear in the suffixes where applicable.
As an example,
tar_jags(name = x, jags_files = "y.jags", ...)
returns a list of targets::tar_target() objects:
- x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
- x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
- x_data: run the R expression in the data argument to produce a JAGS dataset for the model. Returns a JAGS data list.
- x_mcmc_y: run MCMC on the model and dataset. Returns an rjags object from R2jags with all the MCMC results.
- x_draws_y: extract posterior samples from x_mcmc_y. Returns a tidy data frame of MCMC draws. Omitted if draws = FALSE.
- x_summary_y: extract posterior summaries from x_mcmc_y. Returns a tidy data frame of posterior summaries. Omitted if summary = FALSE.
- x_dic: extract deviance information criterion (DIC) info from x_mcmc_y. Returns a tidy data frame of DIC info. Omitted if dic = FALSE.
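The naming pattern listed above can be mimicked with a few lines of base R. This is a sketch under assumptions, not the package's implementation: sketch_target_names() is a hypothetical helper, and the rule that the model suffix is the file name without its directory or extension is inferred from the "y.jags" example.

```r
# Hypothetical helper (not part of jagstargets) that reproduces the
# target names from the list above for a given prefix and model files.
sketch_target_names <- function(name, jags_files, draws = TRUE,
                                summary = TRUE, dic = TRUE) {
  # Assumed suffix rule: file name without directory or extension.
  model <- tools::file_path_sans_ext(basename(jags_files))
  # Steps that get one target per model; NULL entries drop out of c().
  per_model <- c("file", "lines", "mcmc",
                 if (draws) "draws", if (summary) "summary")
  out <- paste(name, "data", sep = "_")
  for (m in model) {
    out <- c(out, paste(name, per_model, m, sep = "_"))
  }
  if (dic) out <- c(out, paste(name, "dic", sep = "_"))
  out
}

all_names <- sketch_target_names("x", "y.jags")
no_draws <- sketch_target_names("x", "y.jags", draws = FALSE)
print(all_names)
print(no_draws)
```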
Target objects
Most jagstargets
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Examples
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(jagstargets)
      # Do not use a temp file for a real project
      # or else your targets will always rerun.
      tmp <- tempfile(pattern = "", fileext = ".jags")
      tar_jags_example_file(tmp)
      list(
        tar_jags(
          your_model,
          jags_files = tmp,
          data = tar_jags_example_data(),
          parameters.to.save = "beta",
          stdout = R.utils::nullfile(),
          stderr = R.utils::nullfile()
        )
      )
    }, ask = FALSE)
    targets::tar_make()
  })
}
Select a strategic piece of R2jags
output.
Description
Not a user-side function. Do not call directly. Exported for infrastructure reasons only.
Usage
tar_jags_df(
fit,
data,
output = c("draws", "summary", "dic"),
variables = NULL,
summaries = NULL,
summary_args = NULL,
transform = NULL
)
Arguments
fit |
|
data |
A list, the original JAGS dataset. |
output |
Character of length 1 denoting the type of output |
variables |
Character vector of model parameter names. The output posterior summaries are restricted to these variables. |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
transform |
Symbol or |
Value
A data frame of R2jags
output. Depends on the output
argument.
Simulate example JAGS data.
Description
An example dataset compatible with the model file
from tar_jags_example_file()
. The output has a .join_data
element so the true value of beta
from the simulation
is automatically appended to the beta
rows of the
summary output.
Usage
tar_jags_example_data(n = 10L)
Arguments
n |
Integer of length 1, number of data points. |
Format
A list with the following elements:
- n: integer, number of data points.
- x: numeric, covariate vector.
- y: numeric, response variable.
- true_beta: numeric of length 1, value of the regression coefficient beta used in simulation.
- .join_data: a list of simulated values to be appended as a .join_data column in the output of targets generated by functions such as tar_jags_rep_summary(). Contains the regression coefficient beta (numeric of length 1) and prior predictive data y (numeric vector).
Details
The tar_jags_example_data() function draws a JAGS
dataset from the prior predictive distribution of the
model from tar_jags_example_file(). First, the
regression coefficient beta is drawn from its standard
normal prior, and the covariate x is computed.
Then, conditional on the beta draws and the covariate,
the response vector y is drawn from its
Normal(x * beta, 1) likelihood.
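The simulation steps described above can be written out in base R. The sketch below is a hypothetical re-implementation for illustration, not the package source: the function name sketch_example_data() is made up, and the evenly spaced grid used for the covariate x is an assumption, since the text only says that x "is computed".

```r
# Hypothetical re-implementation of the prior predictive simulation.
sketch_example_data <- function(n = 10L) {
  # Draw the regression coefficient from its standard normal prior.
  beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  # Assumed covariate: an evenly spaced grid on [-1, 1].
  x <- seq(from = -1, to = 1, length.out = n)
  # Draw the response from its Normal(x * beta, 1) likelihood.
  y <- stats::rnorm(n = n, mean = x * beta, sd = 1)
  list(
    n = n,
    x = x,
    y = y,
    true_beta = beta,
    # Simulated truth to be joined to downstream output.
    .join_data = list(beta = beta, y = y)
  )
}

set.seed(1)
sim <- sketch_example_data(n = 10L)
str(sim)
```

Keeping the true beta in .join_data is what lets downstream summaries be compared against the value that generated the data.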
Value
List, dataset compatible with the model file from
tar_jags_example_file()
. The output has a .join_data
element so the true value of beta
from the simulation
is automatically appended to the beta
rows of the
summary output.
Examples
tar_jags_example_data()
Write an example JAGS model file.
Description
Overwrites the file at path
with a built-in example
JAGS model file.
Usage
tar_jags_example_file(path = tempfile(pattern = "", fileext = ".jags"))
Arguments
path |
Character of length 1, file path to write the model file. |
Value
NULL
(invisibly).
Examples
path <- tempfile(pattern = "", fileext = ".jags")
tar_jags_example_file(path = path)
writeLines(readLines(path))
Tidy output from multiple MCMCs per model.
Description
Internal function. Users should not invoke directly.
Usage
tar_jags_rep(
name,
jags_files,
parameters.to.save,
data = quote(list()),
batches = 1L,
reps = 1L,
output = c("summary", "draws", "dic"),
variables = NULL,
summaries = NULL,
summary_args = NULL,
transform = NULL,
combine = TRUE,
n.cluster = 1,
n.chains = 3,
n.iter = 2000,
n.burnin = as.integer(n.iter/2),
n.thin = 1,
jags.module = c("glm", "dic"),
inits = NULL,
RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"),
jags.seed = NULL,
stdout = NULL,
stderr = NULL,
progress.bar = "text",
refresh = 0,
tidy_eval = targets::tar_option_get("tidy_eval"),
packages = targets::tar_option_get("packages"),
library = targets::tar_option_get("library"),
format = "qs",
format_df = "fst_tbl",
repository = targets::tar_option_get("repository"),
error = targets::tar_option_get("error"),
memory = targets::tar_option_get("memory"),
garbage_collection = targets::tar_option_get("garbage_collection"),
deployment = targets::tar_option_get("deployment"),
priority = targets::tar_option_get("priority"),
resources = targets::tar_option_get("resources"),
storage = targets::tar_option_get("storage"),
retrieval = targets::tar_option_get("retrieval"),
cue = targets::tar_option_get("cue"),
description = targets::tar_option_get("description")
)
Arguments
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
output |
Character of length 1 denoting the type of output |
variables |
Character vector of model parameter names. The output posterior summaries are restricted to these variables. |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
transform |
Symbol or |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Value
tar_jags_rep(name = x, jags_files = "y.jags")
returns a list of targets::tar_target()
objects:
- x_file_y: reproducibly track the JAGS model file.
- x_lines_y: contents of the JAGS model file.
- x_data: dynamic branching target with simulated datasets.
- x_y: dynamic branching target with tidy data frames of MCMC summaries.
- x: combine all the model-specific summary targets into a single data frame with columns to distinguish among the models. Suppressed if combine is FALSE.
Seeds
Rep-specific random number generator seeds for the data and models
are automatically set based on the batch, rep,
parent target name, and tar_option_get("seed")
. This ensures
the rep-specific seeds do not change when you change the batching
configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20
reps each). Each data seed is in the .seed
list element of the output,
and each JAGS seed is in the .seed column of each JAGS model output.
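One way to see why rep-specific seeds can stay fixed under a different batching configuration is the base R sketch below. It is a toy illustration under stated assumptions, not the mechanism jagstargets actually uses: sketch_seed() and sketch_rep_seeds() are hypothetical helpers, the rolling hash is a stand-in for whatever hashing the package performs, and the seed is keyed on the overall position of the rep across batches.

```r
# Hypothetical helper: map (target name, overall rep index, base seed)
# to an integer seed with a simple polynomial rolling hash.
sketch_seed <- function(name, rep_index, base_seed = 0L) {
  chars <- utf8ToInt(paste(name, rep_index, base_seed, sep = "|"))
  hash <- 0
  for (value in chars) {
    # Keep the running value below the largest 32-bit signed integer.
    hash <- (hash * 31 + value) %% 2147483647
  }
  as.integer(hash)
}

# Hypothetical helper: seeds for every rep in every batch.
sketch_rep_seeds <- function(name, batches, reps, base_seed = 0L) {
  grid <- expand.grid(rep = seq_len(reps), batch = seq_len(batches))
  # Overall position of each rep, independent of how reps are batched.
  index <- (grid$batch - 1L) * reps + grid$rep
  vapply(index, function(i) sketch_seed(name, i, base_seed), integer(1))
}

seeds_4x2 <- sketch_rep_seeds("model", batches = 4, reps = 2)
seeds_2x4 <- sketch_rep_seeds("model", batches = 2, reps = 4)
print(identical(seeds_4x2, seeds_2x4))
```

In this sketch, 4 batches of 2 reps and 2 batches of 4 reps both cover overall rep positions 1 through 8, so the two seed vectors match.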
Generate a batch of data
Description
Not a user-side function. Do not invoke directly.
Usage
tar_jags_rep_data_batch(reps, batch, command)
Arguments
reps |
Positive integer of length 1, number of reps to run. |
batch |
Positive integer of length 1, index of the current batch. |
command |
R code to run to generate one dataset. |
Value
A list of JAGS datasets containing data and dataset IDs.
Examples
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
tar_jags_rep_data_batch(2, 1, tar_jags_example_data())
}
Tidy DIC output from multiple MCMCs per model
Description
Run multiple MCMCs on simulated datasets and return DIC and the effective number of parameters for each run.
Usage
tar_jags_rep_dic(
name,
jags_files,
parameters.to.save,
data = list(),
batches = 1L,
reps = 1L,
combine = TRUE,
n.cluster = 1,
n.chains = 3,
n.iter = 2000,
n.burnin = as.integer(n.iter/2),
n.thin = 1,
jags.module = c("glm", "dic"),
inits = NULL,
RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"),
jags.seed = NULL,
stdout = NULL,
stderr = NULL,
progress.bar = "text",
refresh = 0,
tidy_eval = targets::tar_option_get("tidy_eval"),
packages = targets::tar_option_get("packages"),
library = targets::tar_option_get("library"),
format = "qs",
format_df = "fst_tbl",
repository = targets::tar_option_get("repository"),
error = targets::tar_option_get("error"),
memory = targets::tar_option_get("memory"),
garbage_collection = targets::tar_option_get("garbage_collection"),
deployment = targets::tar_option_get("deployment"),
priority = targets::tar_option_get("priority"),
resources = targets::tar_option_get("resources"),
storage = targets::tar_option_get("storage"),
retrieval = targets::tar_option_get("retrieval"),
cue = targets::tar_option_get("cue"),
description = targets::tar_option_get("description")
)
Arguments
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Details
The MCMC targets use R2jags::jags()
if n.cluster
is 1
and
R2jags::jags.parallel()
otherwise. Most arguments to tar_jags()
are forwarded to these functions.
Value
tar_jags_rep_dic()
returns a list of target objects.
See the "Target objects" section for
background.
The target names use the name
argument as a prefix, and the individual
elements of jags_files
appear in the suffixes where applicable.
As an example, the specific target objects returned by
tar_jags_rep_dic(name = x, jags_files = "y.jags")
are as follows.
- x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
- x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
- x_data: use dynamic branching to generate multiple JAGS datasets from the R expression in the data argument. Each dynamic branch returns a batch of JAGS data lists.
- x_y: run JAGS on each dataset from x_data. Each dynamic branch returns a tidy data frame of DIC results for each batch of data.
- x: combine all the batches from x_y into a non-dynamic target. Suppressed if combine is FALSE. Returns a long tidy data frame with all DIC info from all the branches of x_y.
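The combining step performed by the x target can be pictured with a small base R sketch. Everything here is illustrative: the data frames hold made-up placeholder numbers rather than real JAGS output, the column names dic, pD, .batch, and .model are assumptions, and sketch_combine() is a hypothetical helper rather than a jagstargets function.

```r
# Made-up placeholder results standing in for two branches of x_y.
batch_results <- list(
  data.frame(dic = c(30.1, 28.4), pD = c(1.1, 0.9)),
  data.frame(dic = c(31.7, 29.0), pD = c(1.0, 1.2))
)

# Hypothetical helper: stack batch-level data frames into one long
# data frame, labelling each row with its batch and model.
sketch_combine <- function(batches, model) {
  for (i in seq_along(batches)) {
    batches[[i]]$.batch <- i
  }
  out <- do.call(rbind, batches)
  out$.model <- model
  rownames(out) <- NULL
  out
}

combined <- sketch_combine(batch_results, model = "y")
print(combined)
```

The combined data frame repeats what the branches already store, which is why the combine argument is described as convenient but duplicating data.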
Seeds
Rep-specific random number generator seeds for the data and models
are automatically set based on the batch, rep,
parent target name, and tar_option_get("seed")
. This ensures
the rep-specific seeds do not change when you change the batching
configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20
reps each). Each data seed is in the .seed
list element of the output,
and each JAGS seed is in the .seed column of each JAGS model output.
Target objects
Most jagstargets
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Examples
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(jagstargets)
      # Do not use a temp file for a real project
      # or else your targets will always rerun.
      tmp <- tempfile(pattern = "", fileext = ".jags")
      tar_jags_example_file(tmp)
      list(
        tar_jags_rep_dic(
          your_model,
          jags_files = tmp,
          data = tar_jags_example_data(),
          parameters.to.save = "beta",
          batches = 2,
          reps = 2,
          stdout = R.utils::nullfile(),
          stderr = R.utils::nullfile()
        )
      )
    }, ask = FALSE)
    targets::tar_make()
  })
}
Tidy posterior draws from multiple MCMCs per model
Description
Run multiple MCMCs on simulated datasets and return posterior samples for each run.
Usage
tar_jags_rep_draws(
name,
jags_files,
parameters.to.save,
data = list(),
batches = 1L,
reps = 1L,
transform = NULL,
combine = FALSE,
n.cluster = 1,
n.chains = 3,
n.iter = 2000,
n.burnin = as.integer(n.iter/2),
n.thin = 1,
jags.module = c("glm", "dic"),
inits = NULL,
RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"),
jags.seed = NULL,
stdout = NULL,
stderr = NULL,
progress.bar = "text",
refresh = 0,
tidy_eval = targets::tar_option_get("tidy_eval"),
packages = targets::tar_option_get("packages"),
library = targets::tar_option_get("library"),
format = "qs",
format_df = "fst_tbl",
repository = targets::tar_option_get("repository"),
error = targets::tar_option_get("error"),
memory = "transient",
garbage_collection = targets::tar_option_get("garbage_collection"),
deployment = targets::tar_option_get("deployment"),
priority = targets::tar_option_get("priority"),
resources = targets::tar_option_get("resources"),
storage = targets::tar_option_get("storage"),
retrieval = targets::tar_option_get("retrieval"),
cue = targets::tar_option_get("cue"),
description = targets::tar_option_get("description")
)
Arguments
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
transform |
Symbol or |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Details
The MCMC targets use R2jags::jags()
if n.cluster
is 1
and
R2jags::jags.parallel()
otherwise. Most arguments to tar_jags()
are forwarded to these functions.
Value
tar_jags_rep_draws()
returns a list of target objects.
See the "Target objects" section for
background.
The target names use the name
argument as a prefix, and the individual
elements of jags_files
appear in the suffixes where applicable.
As an example, the specific target objects returned by
tar_jags_rep_draws(name = x, jags_files = "y.jags")
are as follows.
- x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
- x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
- x_data: use dynamic branching to generate multiple JAGS datasets from the R expression in the data argument. Each dynamic branch returns a batch of JAGS data lists.
- x_y: run JAGS on each dataset from x_data. Each dynamic branch returns a tidy data frame of draws for each batch of data.
- x: combine all the batches from x_y into a non-dynamic target. Suppressed if combine is FALSE. Returns a long tidy data frame with all draws from all the branches of x_y.
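The naming scheme above can be reproduced with a short base R sketch. target_names() is a hypothetical helper, not a jagstargets function, and it assumes the suffix is the model file name without its directory or extension, which matches the "y.jags" example but is not confirmed for every edge case.

```r
# Hypothetical helper (not part of jagstargets): build the target names
# listed above from the `name` prefix and the JAGS model file paths.
target_names <- function(name, jags_files) {
  # Assumed suffix rule: file name without directory or extension,
  # e.g. "y.jags" -> "y".
  suffix <- tools::file_path_sans_ext(basename(jags_files))
  c(
    paste0(name, "_file_", suffix),   # tracks the model file
    paste0(name, "_lines_", suffix),  # lines of the model file
    paste0(name, "_data"),            # batches of simulated datasets
    paste0(name, "_", suffix),        # model output per batch
    name                              # combined output (if combine = TRUE)
  )
}

target_names("x", "y.jags")
```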
Seeds
Rep-specific random number generator seeds for the data and models are automatically set based on the batch, rep, parent target name, and tar_option_get("seed"). This ensures the rep-specific seeds do not change when you change the batching configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20 reps each). Each data seed is in the .seed list element of the output, and each JAGS seed is in the .seed column of each JAGS model output.
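To see why seeds can stay fixed when the batching changes, consider the toy sketch below. It is not the package's seed algorithm: rep_seed() and all_seeds() are hypothetical helpers, and the arithmetic is an arbitrary stand-in for whatever jagstargets does internally. The only point is that a seed computed from the target name, a base seed, and the overall position of a rep across all batches does not depend on how the reps are grouped.

```r
# Toy illustration only (hypothetical helpers, arbitrary arithmetic).
rep_seed <- function(name, batch, rep, reps_per_batch, base_seed = 0L) {
  # Overall position of this rep across all batches.
  index <- (batch - 1L) * reps_per_batch + rep
  # Crude numeric code for the target name.
  name_code <- sum(utf8ToInt(name))
  as.integer((base_seed + name_code * 1000003 + index * 7919) %% 2147483647)
}

all_seeds <- function(name, batches, reps) {
  # expand.grid() varies `rep` fastest, so rows follow the overall rep order.
  grid <- expand.grid(rep = seq_len(reps), batch = seq_len(batches))
  mapply(
    rep_seed,
    batch = grid$batch,
    rep = grid$rep,
    MoreArgs = list(name = name, reps_per_batch = reps)
  )
}

# 40 batches of 10 reps vs 20 batches of 20 reps: same 400 seeds.
identical(all_seeds("x_y", 40, 10), all_seeds("x_y", 20, 20))
```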
Target objects
Most jagstargets functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets), and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Examples
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(jagstargets)
      # Do not use a temp file for a real project
      # or else your targets will always rerun.
      tmp <- tempfile(pattern = "", fileext = ".jags")
      tar_jags_example_file(tmp)
      list(
        tar_jags_rep_draws(
          your_model,
          jags_files = tmp,
          data = tar_jags_example_data(),
          parameters.to.save = "beta",
          batches = 2,
          reps = 2,
          stdout = R.utils::nullfile(),
          stderr = R.utils::nullfile()
        )
      )
    }, ask = FALSE)
    targets::tar_make()
  })
}
Run a batch of iterations for a JAGS model and return only strategic output.
Description
Not a user-side function. Do not invoke directly.
Usage
tar_jags_rep_run(
  jags_lines,
  jags_name,
  jags_file,
  parameters.to.save,
  data,
  variables = NULL,
  summaries = NULL,
  summary_args = NULL,
  transform = NULL,
  reps,
  output,
  n.cluster = n.cluster,
  n.chains = n.chains,
  n.iter = n.iter,
  n.burnin = n.burnin,
  n.thin = n.thin,
  jags.module = jags.module,
  inits = inits,
  RNGname = RNGname,
  stdout = stdout,
  stderr = stderr,
  progress.bar = progress.bar,
  refresh = refresh
)
Arguments
jags_lines |
Character vector of lines from a JAGS file. |
jags_name |
Friendly suffix of the JAGS model target. |
jags_file |
Original path to the input JAGS file. |
parameters.to.save |
Model parameters to save, passed to
|
data |
A list, the original JAGS dataset. |
variables |
Character vector of model parameter names. The output posterior summaries are restricted to these variables. |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
transform |
Symbol or |
output |
Character of length 1 denoting the type of output |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
Value
A data frame of posterior summaries.
Tidy posterior summaries from multiple MCMCs per model
Description
Run multiple MCMCs on simulated datasets and return posterior summaries and the effective number of parameters for each run.
Usage
tar_jags_rep_summary(
  name,
  jags_files,
  parameters.to.save,
  data = list(),
  variables = NULL,
  summaries = NULL,
  summary_args = NULL,
  batches = 1L,
  reps = 1L,
  combine = TRUE,
  n.cluster = 1,
  n.chains = 3,
  n.iter = 2000,
  n.burnin = as.integer(n.iter/2),
  n.thin = 1,
  jags.module = c("glm", "dic"),
  inits = NULL,
  RNGname = c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper", "Mersenne-Twister"),
  jags.seed = NULL,
  stdout = NULL,
  stderr = NULL,
  progress.bar = "text",
  refresh = 0,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = "qs",
  format_df = "fst_tbl",
  repository = targets::tar_option_get("repository"),
  error = targets::tar_option_get("error"),
  memory = "transient",
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
Arguments
name |
Symbol, base name for the collection of targets. Serves as a prefix for target names. |
jags_files |
Character vector of JAGS model files. If you
supply multiple files, each model will run on the one shared dataset
generated by the code in |
parameters.to.save |
Model parameters to save, passed to
|
data |
Code to generate the |
variables |
Character vector of model parameter names. The output posterior summaries are restricted to these variables. |
summaries |
List of summary functions passed to |
summary_args |
List of summary function arguments passed to
|
batches |
Number of batches. Each batch runs a model |
reps |
Number of replications per batch. Ideally, each rep
should produce its own random dataset using the code
supplied to |
combine |
Logical, whether to create a target to combine all the model results into a single data frame downstream. Convenient, but duplicates data. |
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
inits |
Initial values of model parameters, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
The |
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1, storage format of the data frames
of posterior summaries and other data frames returned by targets.
We recommend efficient data frame formats
such as |
format_df |
Character of length 1, storage format of the data frame
targets such as posterior draws. We recommend efficient data frame formats
such as |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Details
The MCMC targets use R2jags::jags() if n.cluster is 1 and R2jags::jags.parallel() otherwise. Most arguments to tar_jags() are forwarded to these functions.
Value
tar_jags_rep_summary() returns a list of target objects. See the "Target objects" section for background.
The target names use the name argument as a prefix, and the individual elements of jags_files appear in the suffixes where applicable.
As an example, the specific target objects returned by tar_jags_rep_summary(name = x, jags_files = "y.jags") are as follows.
- x_file_y: reproducibly track the JAGS model file. Returns a character vector of length 1 with the path to the JAGS model file.
- x_lines_y: read the contents of the JAGS model file for safe transport to parallel workers. Returns a character vector of lines in the model file.
- x_data: use dynamic branching to generate multiple JAGS datasets from the R expression in the data argument. Each dynamic branch returns a batch of JAGS data lists.
- x_y: run JAGS on each dataset from x_data. Each dynamic branch returns a tidy data frame of summaries for each batch of data.
- x: combine all the batches from x_y into a non-dynamic target. Suppressed if combine is FALSE. Returns a long tidy data frame with all summaries from all the branches of x_y.
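Conceptually, the combined target stacks the per-batch data frames into one long data frame. The base R sketch below illustrates that idea with toy data. The data frames and their columns (.rep, variable, mean) are made up for illustration and are not actual jagstargets output, and the package may combine batches by other means.

```r
# Toy stand-ins for two branches of x_y; the values and column names
# are hypothetical.
batch_1 <- data.frame(.rep = c("a", "b"), variable = "beta", mean = c(0.9, 1.1))
batch_2 <- data.frame(.rep = c("c", "d"), variable = "beta", mean = c(1.0, 1.2))

# Row-bind the batches into one long data frame, one row per rep and variable.
combined <- do.call(rbind, list(batch_1, batch_2))
nrow(combined)
```

Because every row keeps its rep identifier, downstream code can still group the combined results by rep after the batches are merged.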
Seeds
Rep-specific random number generator seeds for the data and models are automatically set based on the batch, rep, parent target name, and tar_option_get("seed"). This ensures the rep-specific seeds do not change when you change the batching configuration (e.g. 40 batches of 10 reps each vs 20 batches of 20 reps each). Each data seed is in the .seed list element of the output, and each JAGS seed is in the .seed column of each JAGS model output.
Target objects
Most jagstargets functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets), and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Examples
if (identical(Sys.getenv("TAR_JAGS_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(jagstargets)
      # Do not use a temp file for a real project
      # or else your targets will always rerun.
      tmp <- tempfile(pattern = "", fileext = ".jags")
      tar_jags_example_file(tmp)
      list(
        tar_jags_rep_summary(
          your_model,
          jags_files = tmp,
          data = tar_jags_example_data(),
          parameters.to.save = "beta",
          batches = 2,
          reps = 2,
          stdout = R.utils::nullfile(),
          stderr = R.utils::nullfile()
        )
      )
    }, ask = FALSE)
    targets::tar_make()
  })
}
Run a JAGS model and return the whole output object.
Description
Not a user-side function. Do not invoke directly.
Usage
tar_jags_run(
  jags_lines,
  parameters.to.save,
  data,
  inits,
  n.cluster,
  n.chains,
  n.iter,
  n.burnin,
  n.thin,
  jags.module,
  RNGname,
  jags.seed,
  stdout,
  stderr,
  progress.bar,
  refresh
)
Arguments
jags_lines |
Character vector of lines from a JAGS model file. |
parameters.to.save |
Model parameters to save, passed to
|
inits |
Initial values of model parameters, passed to
|
n.cluster |
Number of parallel processes, passed to
|
n.chains |
Number of MCMC chains, passed to
|
n.iter |
Number of iterations (including warmup), passed to
|
n.burnin |
Number of warmup iterations, passed to
|
n.thin |
Thinning interval, passed to
|
jags.module |
Character vector of JAGS modules to load, passed to
|
RNGname |
Choice of random number generator, passed to
|
jags.seed |
Seeds to apply to JAGS, passed to
|
stdout |
Character of length 1, file path to write the stdout stream
of the model when it runs. Set to |
stderr |
Character of length 1, file path to write the stderr stream
of the model when it runs. Set to |
progress.bar |
Type of progress bar, passed to
|
refresh |
Frequency for refreshing the progress bar, passed to
|
Value
An R2jags output object.