---
title: "Mathematical Foundations of TemporalHazard"
subtitle: "Generalized decomposition, multiphase additive hazard, censoring, and time-varying covariates"
format:
  html:
    code-fold: false
    toc: true
    toc-depth: 3
    number-sections: true
vignette: >
  %\VignetteIndexEntry{Mathematical Foundations of TemporalHazard}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---

```{r}
#| include: false
has_pkg <- requireNamespace("TemporalHazard", quietly = TRUE)
has_ggplot <- requireNamespace("ggplot2", quietly = TRUE)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = has_pkg,
  fig.width = 7,
  fig.height = 4.5
)
```

```{r}
#| label: setup
library(TemporalHazard)
```

This vignette lays out the mathematical foundation for the models in
TemporalHazard. The other vignettes treat the package as a black box
you call `hazard()` on; this one opens the box. If you've used the
package and want to know *what* it's computing, *why* the
parameterization is what it is, and *how* the multiphase
decomposition relates to standard parametric survival families, this
is the place.

The four pieces, in order: the generalized temporal decomposition
family (Blackstone, Naftel, and Turner 1986) that every phase is
built from; the additive multiphase hazard model that composes
phases into a full hazard; the maximum-likelihood objective under
mixed censoring (right, left, interval, counting-process), which is
what the optimizer minimizes; and extensions for phase-specific and
time-varying covariates. The same decomposition framework carries
over to longitudinal mixed-effects settings (Rajeswaran et al.
2018), though this vignette stays in the survival regime.

For users migrating from the legacy SAS/C HAZARD program, each
section notes how the R parameterization relates to the original
parameter names. For the full translation table, see
`vignette("sas-to-r-migration")` or call `hzr_argument_mapping()`.


# Generalized Temporal Decomposition {#sec-decomposition}

## The parametric family

Every temporal phase in TemporalHazard is built from a single parametric
family, `decompos(t; t_half, nu, m)`, introduced by Blackstone, Naftel, and
Turner (1986) that produces three linked quantities from just three parameters:

- $G(t)$ — cumulative distribution function (CDF), bounded $[0, 1]$
- $g(t) = dG/dt$ — probability density function
- $h(t) = g(t) / (1 - G(t))$ — hazard function

The three parameters control the shape of the distribution:

| Parameter | Meaning | Constraint |
|-----------|---------|------------|
| `t_half`  | Half-life: time at which $G(t_{1/2}) = 0.5$ | $> 0$ |
| `nu`      | Time exponent — controls rate dynamics | any finite value |
| `m`       | Shape exponent — controls distributional form | any finite value |

::: {.callout-note}
## SAS/C parameter bridge

The original C code used different parameter names per phase:

- **Early phase (G1):** `DELTA`, `RHO`/`THALF`, `NU`, `M` $\to$ `t_half`, `nu`, `m`
- **Late phase (G3):** `TAU`, `GAMMA`, `ALPHA`, `ETA` $\to$ `t_half`, `nu`, `m`

Both collapse onto the same 3-parameter decomposition family.  The C `DELTA`
parameter controlled a time transformation $B(t) = (\exp(\delta t) - 1)/\delta$
that is absorbed into the shape.
:::

In R, `hzr_decompos()` computes all three quantities:

```{r}
#| label: basic-decompos
t_grid <- seq(0.01, 10, length.out = 200)

# Standard sigmoidal (Case 1: m > 0, nu > 0)
d <- hzr_decompos(t_grid, t_half = 3, nu = 2, m = 1)

# Verify the half-life property: G(t_half) = 0.5
d_half <- hzr_decompos(3, t_half = 3, nu = 2, m = 1)
cat("G(t_half) =", round(d_half$G, 6), "\n")

# Verify the identity: h(t) = g(t) / (1 - G(t))
h_check <- d$g / (1 - d$G)
cat("Max |h - g/(1-G)| =", max(abs(d$h - h_check)), "\n")
```


## The six valid cases

Different sign combinations of $\nu$ and $m$ pick out different
asymptotic behaviors of `decompos()`, and the package treats each as
a separate case for numerical stability. The combination $m < 0$
**and** $\nu < 0$ produces an integrand with no finite normalization
and is rejected at the input-validation step. The remaining six are
the catalog every phase in the package selects from:

```{r}
#| label: six-cases-table
#| echo: false
cases <- data.frame(
  Case = c("1", "1L", "2", "2L", "3", "3L"),
  m     = c("> 0", "= 0", "< 0", "< 0", "> 0", "= 0"),
  nu    = c("> 0", "> 0", "> 0", "= 0", "< 0", "< 0"),
  Behavior = c(
    "Standard sigmoidal",
    "Weibull-like (exponential limit as m -> 0)",
    "Heavy-tailed",
    "Exponential decay",
    "Bounded cumulative",
    "Bounded exponential"
  )
)
knitr::kable(cases, caption = "Six valid parameter sign combinations")
```

The following plot shows the CDF $G(t)$, density $g(t)$, and hazard $h(t)$
for each case, all with `t_half = 3`:

```{r}
#| label: fig-six-cases
#| fig-cap: "G(t), g(t), and h(t) for all six valid decomposition cases"
#| fig-height: 8
if (has_ggplot) {
  library(ggplot2)

  params <- list(
    "Case 1: m=1, nu=2"     = list(nu = 2,   m = 1),
    "Case 1L: m=0, nu=2"    = list(nu = 2,   m = 0),
    "Case 2: m=-0.5, nu=2"  = list(nu = 2,   m = -0.5),
    "Case 2L: m=-0.5, nu=0" = list(nu = 0,   m = -0.5),
    "Case 3: m=1, nu=-0.5"  = list(nu = -0.5, m = 1),
    "Case 3L: m=0, nu=-0.5" = list(nu = -0.5, m = 0)
  )

  t_grid <- seq(0.01, 10, length.out = 200)
  rows <- list()

  for (nm in names(params)) {
    p <- params[[nm]]
    d <- hzr_decompos(t_grid, t_half = 3, nu = p$nu, m = p$m)
    rows <- c(rows, list(data.frame(
      time = rep(t_grid, 3),
      value = c(d$G, d$g, d$h),
      quantity = rep(c("G(t) — CDF", "g(t) — density", "h(t) — hazard"),
                     each = length(t_grid)),
      case = nm
    )))
  }

  df <- do.call(rbind, rows)

  # Cap hazard display at reasonable values for readability
  df$value[df$quantity == "h(t) — hazard" & df$value > 5] <- NA

  ggplot(df, aes(x = time, y = value)) +
    geom_line(linewidth = 0.6) +
    facet_grid(quantity ~ case, scales = "free_y") +
    labs(x = "Time", y = NULL) +
    theme_minimal(base_size = 10) +
    theme(strip.text = element_text(size = 7))
}
```


## Derivation sketch

The construction begins with a rate function $\rho$ chosen so that
$G(t_{1/2}) = 0.5$ exactly.  For Case 1 ($m > 0, \nu > 0$):

$$
\rho = \nu \, t_{1/2} \left(\frac{2^m - 1}{m}\right)^{\!\nu}
$$

Then, with the dimensionless time $b(t) = \nu t / \rho$:

$$
G(t) = \left(1 + m \, b(t)^{-1/\nu}\right)^{-1/m}
$$

$$
g(t) = \left(1 + m \, b(t)^{-1/\nu}\right)^{-1/m - 1}
       \cdot b(t)^{-1/\nu - 1} / \rho
$$

The other five cases arise as limits ($m \to 0$, $\nu \to 0$) or sign
reflections of this base form.  The implementation in `hzr_decompos()`
dispatches to the appropriate branch after checking the signs of `nu` and `m`.


# Additive Multiphase Hazard Model {#sec-multiphase}

## Model specification

The total cumulative hazard decomposes additively across $J$ phases:

$$
H(t \mid \mathbf{x}) = \sum_{j=1}^{J} \mu_j(\mathbf{x}) \cdot \Phi_j(t;\, t_{1/2,j},\, \nu_j,\, m_j)
$$

where:

- $\mu_j(\mathbf{x}) = \exp(\alpha_j + \mathbf{x}_j \boldsymbol{\beta}_j)$ is the phase-specific log-linear scale with covariates
- $\Phi_j(t)$ is the temporal shape, determined by the phase type

The three phase types correspond to different uses of the decomposition output:

| Phase type | $\Phi_j(t)$ | Domain | Interpretation |
|------------|-------------|--------|----------------|
| `"cdf"`    | $G(t)$ | $[0, 1]$ | Early risk that resolves over time |
| `"hazard"` | $-\log(1 - G(t))$ | $[0, \infty)$ | Late or aging risk that accumulates |
| `"constant"` | $t$ | $[0, \infty)$ | Flat background rate (no shape parameters) |

::: {.callout-note}
## SAS/C bridge

The classic three-phase HAZARD model used:

- **G1** (early) $\to$ `hzr_phase("cdf", ...)`
- **G2** (constant) $\to$ `hzr_phase("constant")`
- **G3** (late) $\to$ `hzr_phase("hazard", ...)`

TemporalHazard generalizes this to $N$ phases of any type.
:::


## Derived quantities

From the cumulative hazard, the instantaneous hazard and survival are:

$$
h(t \mid \mathbf{x}) = \sum_{j=1}^{J} \mu_j(\mathbf{x}) \cdot \varphi_j(t)
\qquad\text{where } \varphi_j = d\Phi_j/dt
$$

$$
S(t \mid \mathbf{x}) = \exp\!\bigl(-H(t \mid \mathbf{x})\bigr)
$$

The derivative $\varphi_j(t)$ depends on phase type:

- `"cdf"`: $\varphi(t) = g(t)$ (the density)
- `"hazard"`: $\varphi(t) = h(t) = g(t)/(1 - G(t))$
- `"constant"`: $\varphi(t) = 1$


## Constructing phases in R

The math above is exposed in R through `hzr_phase()`, which builds
one phase at a time, and `hazard()`, which composes phases into a
full model. The same `(t_half, nu, m)` triple from `decompos()`
parameterizes the `"cdf"` and `"hazard"` shapes; the `"g3"` shape
exposes its own `(tau, gamma, alpha, eta)` parameters from the SAS
"late" library; and `"constant"` has no shape parameters at all. A
canonical three-phase model uses the `"cdf"` shape for the early
phase, `"constant"` for the background, and either a delayed
`"cdf"` or `"g3"` for the late phase.

```{r}
#| label: phase-construction
# Classic three-phase pattern
early <- hzr_phase("cdf",      t_half = 0.5, nu = 2, m = 0)
const <- hzr_phase("constant")
late  <- hzr_phase("hazard",   t_half = 5,   nu = 1, m = 0)

print(early)
print(const)
print(late)
```

The cumulative hazard contribution $\Phi(t)$ and instantaneous hazard
$\varphi(t)$ for each phase can be computed directly:

```{r}
#| label: fig-phase-shapes
#| fig-cap: "Phi(t) and phi(t) for each phase type"
#| fig-height: 5
if (has_ggplot) {
  t_grid <- seq(0.01, 12, length.out = 200)

  phi_early <- hzr_phase_cumhaz(t_grid, t_half = 0.5, nu = 2, m = 0,
                                 type = "cdf")
  phi_late  <- hzr_phase_cumhaz(t_grid, t_half = 5, nu = 1, m = 0,
                                 type = "hazard")
  phi_const <- hzr_phase_cumhaz(t_grid, type = "constant")

  dphi_early <- hzr_phase_hazard(t_grid, t_half = 0.5, nu = 2, m = 0,
                                  type = "cdf")
  dphi_late  <- hzr_phase_hazard(t_grid, t_half = 5, nu = 1, m = 0,
                                  type = "hazard")
  dphi_const <- hzr_phase_hazard(t_grid, type = "constant")

  df <- data.frame(
    time = rep(t_grid, 6),
    value = c(phi_early, phi_late, phi_const,
              dphi_early, dphi_late, dphi_const),
    phase = rep(rep(c("cdf (early)", "hazard (late)", "constant"),
                    each = length(t_grid)), 2),
    quantity = rep(c("Phi(t) — cumulative", "phi(t) — instantaneous"),
                   each = 3 * length(t_grid))
  )

  ggplot(df, aes(x = time, y = value, colour = phase)) +
    geom_line(linewidth = 0.8) +
    facet_wrap(~ quantity, scales = "free_y", ncol = 1) +
    scale_colour_manual(values = c(
      "cdf (early)" = "#0072B2",
      "hazard (late)" = "#D55E00",
      "constant" = "#009E73"
    )) +
    labs(x = "Time", y = NULL, colour = "Phase type") +
    theme_minimal(base_size = 11) +
    theme(legend.position = "bottom")
}
```


## Additive composition example

To build intuition, here is how three phases combine into a total hazard.
The key insight is that phases contribute additively to the **cumulative
hazard** $H(t)$, not to the survival directly.

```{r}
#| label: fig-additive-demo
#| fig-cap: "Three-phase additive cumulative hazard: total = early + constant + late"
#| fig-height: 4
if (has_ggplot) {
  t_grid <- seq(0.01, 15, length.out = 300)

  mu_early <- 0.3
  mu_const <- 0.05
  mu_late  <- 0.2

  H_early <- mu_early * hzr_phase_cumhaz(t_grid, t_half = 0.5, nu = 2,
                                           m = 0, type = "cdf")
  H_const <- mu_const * hzr_phase_cumhaz(t_grid, type = "constant")
  H_late  <- mu_late  * hzr_phase_cumhaz(t_grid, t_half = 5, nu = 1,
                                           m = 0, type = "hazard")
  H_total <- H_early + H_const + H_late

  df <- data.frame(
    time = rep(t_grid, 4),
    cumhaz = c(H_early, H_const, H_late, H_total),
    component = rep(c("Early (cdf)", "Constant", "Late (hazard)", "Total"),
                    each = length(t_grid))
  )
  df$component <- factor(df$component,
    levels = c("Total", "Early (cdf)", "Constant", "Late (hazard)"))

  ggplot(df, aes(x = time, y = cumhaz, colour = component,
                 linewidth = component)) +
    geom_line() +
    scale_colour_manual(values = c(
      "Total" = "black", "Early (cdf)" = "#0072B2",
      "Constant" = "#009E73", "Late (hazard)" = "#D55E00"
    )) +
    scale_linewidth_manual(values = c(
      "Total" = 1.2, "Early (cdf)" = 0.6,
      "Constant" = 0.6, "Late (hazard)" = 0.6
    )) +
    labs(x = "Time", y = "Cumulative hazard H(t)",
         colour = NULL, linewidth = NULL) +
    theme_minimal(base_size = 11) +
    theme(legend.position = "bottom")
}
```


# Maximum Likelihood Estimation {#sec-likelihood}

## Log-likelihood under mixed censoring

TemporalHazard supports four censoring types in a single dataset, coded
by the `status` indicator:

| Code | Type | Contribution to log-likelihood |
|------|------|-------------------------------|
| $\delta_i = 1$ | Exact event | $\log h(t_i \mid \mathbf{x}_i) - H(t_i \mid \mathbf{x}_i)$ |
| $\delta_i = 0$ | Right-censored | $-H(t_i \mid \mathbf{x}_i)$ |
| $\delta_i = -1$ | Left-censored | $\log\bigl(1 - \exp(-H(u_i \mid \mathbf{x}_i))\bigr)$ |
| $\delta_i = 2$ | Interval-censored | $-H(l_i \mid \mathbf{x}_i) + \log\bigl(1 - \exp(-(H(u_i) - H(l_i)))\bigr)$ |

where $l_i$ and $u_i$ are the lower and upper bounds of the censoring
interval.

The full log-likelihood is:

$$
\ell(\boldsymbol{\theta}) = \sum_{i:\,\delta_i=1}
  \Bigl[\log h(t_i \mid \mathbf{x}_i) - H(t_i \mid \mathbf{x}_i)\Bigr]
  - \sum_{i:\,\delta_i=0} H(t_i \mid \mathbf{x}_i)
$$
$$
  + \sum_{i:\,\delta_i=-1}
    \log\!\bigl(1 - e^{-H(u_i \mid \mathbf{x}_i)}\bigr)
  + \sum_{i:\,\delta_i=2}
    \Bigl[-H(l_i \mid \mathbf{x}_i)
    + \log\!\bigl(1 - e^{-(H(u_i) - H(l_i))}\bigr)\Bigr]
$$

The $\log(1 - e^{-x})$ terms are computed using the numerically stable
primitive `hzr_log1mexp()`, which avoids catastrophic cancellation when $x$
is near zero.

```{r}
#| label: log1mexp-demo
# Naive log(1 - exp(-x)) fails near zero
x_small <- 1e-15
cat("Naive:  ", log(1 - exp(-x_small)), "\n")
cat("Stable: ", hzr_log1mexp(x_small), "\n")
```


## Internal parameterization

During optimization, shape parameters are reparameterized for unconstrained
search.  For each phase $j$, the internal parameter sub-vector is:

| Parameter | Internal (estimation) | User-facing (reported) |
|-----------|----------------------|----------------------|
| Scale     | $\log \mu_j$ | $\mu_j = \exp(\log \mu_j)$ |
| Half-life | $\log t_{1/2,j}$ | $t_{1/2,j} = \exp(\log t_{1/2,j})$ |
| Time exponent | $\nu_j$ (unconstrained) | $\nu_j$ |
| Shape     | $m_j$ (unconstrained) | $m_j$ |
| Covariates | $\beta_{j,1}, \ldots, \beta_{j,p_j}$ | same |

Constant phases omit `log_t_half`, `nu`, and `m`, contributing only one
shape-free parameter (`log_mu`).  The full $\boldsymbol{\theta}$ is the
concatenation of all phase sub-vectors.


## Optimization strategy

The optimizer uses `stats::optim()` with BFGS on the unconstrained internal
scale, wrapped by `.hzr_optim_generic()`.  Key features:

- **Multi-start**: the optimizer runs from $k$ random perturbations around
  the user-supplied starting values (default $k = 5$, controlled by
  `control$n_starts`).  The run with the highest log-likelihood is kept.
- **Feasibility guard**: any parameter vector where $m < 0$ **and**
  $\nu < 0$ for the same phase returns $-\infty$ immediately.
- **Post-fit Hessian**: the numerical Hessian at the solution is inverted
  to produce the variance-covariance matrix.  Standard errors are
  $\sqrt{\text{diag}(\hat{V})}$.


# Covariates and Phase-Specific Formulas {#sec-covariates}

## Global covariates

By default, every phase shares the same covariate vector from the model
formula.  Each phase gets its own coefficient vector $\boldsymbol{\beta}_j$:

$$
\mu_j(\mathbf{x}) = \exp\!\bigl(\alpha_j + \mathbf{x}\,\boldsymbol{\beta}_j\bigr)
$$

So the same predictors can have different effects on early vs. late risk,
which is a core feature of multiphase models.

```{r}
#| label: global-covariates
#| eval: false
# Both phases use age and nyha from the global formula
hazard(
  survival::Surv(time, status) ~ age + nyha,
  data   = dat,
  dist   = "multiphase",
  phases = list(
    early = hzr_phase("cdf",    t_half = 0.5, nu = 2, m = 0),
    late  = hzr_phase("hazard", t_half = 5,   nu = 1, m = 0)
  ),
  fit = TRUE
)
```


## Phase-specific covariates

The global formula above lets every covariate act on every phase
through a shared shift $x^\top \beta$. That's the right structure
when a covariate plausibly affects the entire hazard trajectory. But
clinical reality often pulls the other way: surgical risk factors
(grade IV complications, intra-operative blood loss) belong in the
early phase but say nothing about late attrition; chronic
comorbidities (NYHA class progression, late-onset diabetes) belong
in the late phase but were irrelevant during the operative window.
Phase-specific covariates let you build that structure directly.

When a phase carries its own one-sided formula, the phase gets its
own design matrix from the data, and only those covariates appear
in that phase's $\boldsymbol{\beta}_j$. Other phases ignore them
entirely.

```{r}
#| label: phase-specific-covariates
#| eval: false
# Early risk depends on surgical factors; late risk on chronic conditions
hazard(
  survival::Surv(time, status) ~ age + nyha + shock,
  data   = dat,
  dist   = "multiphase",
  phases = list(
    early = hzr_phase("cdf",    t_half = 0.5, nu = 2, m = 0,
                      formula = ~ age + shock),
    late  = hzr_phase("hazard", t_half = 5,   nu = 1, m = 0,
                      formula = ~ age + nyha)
  ),
  fit = TRUE
)
```

When a phase has a `formula`, it gets its own design matrix built from the
data, and only those covariates appear in its $\boldsymbol{\beta}_j$.  The
parameter vector is shorter for phases with fewer covariates, which can
improve identifiability and convergence.


## Time-varying coefficients

For single-distribution models, TemporalHazard supports piecewise
time-varying coefficients via the `time_windows` argument.  This expands the
design matrix into window-specific blocks:

$$
\eta_i = \sum_{k=1}^{K} \mathbf{x}_i \boldsymbol{\beta}_k \cdot
\mathbf{1}(t_i \in W_k)
$$

where $W_1, \ldots, W_K$ are the time windows defined by the cut points.
Each predictor gets a separate coefficient in each window, allowing effects
to change over follow-up time.

```{r}
#| label: time-varying-demo
#| eval: false
# Two windows: [0, 2] and (2, infinity)
hazard(
  survival::Surv(time, status) ~ age + nyha,
  data         = dat,
  theta        = c(mu = 0.25, nu = 1.1,
                   age_w1 = 0, nyha_w1 = 0,
                   age_w2 = 0, nyha_w2 = 0),
  dist         = "weibull",
  time_windows = 2,
  fit          = TRUE
)
```

::: {.callout-tip}
## Multiphase vs. time-varying coefficients

For multiphase models, the phase structure itself captures time-varying
effects: early phases dominate at small $t$ and late phases at large $t$.
The `time_windows` mechanism is a complementary approach for
single-distribution models.
:::


# Identifiability and Practical Considerations {#sec-identifiability}

## Parameter identifiability

Multiphase models with many free parameters can have flat or multi-modal
likelihood surfaces.  Practical guidelines:

1. **Fix shape parameters when possible.**  If clinical knowledge suggests a
   specific temporal pattern (e.g. early mortality follows a Weibull shape
   with $m = 0$), fix `m` in the `hzr_phase()` starting values and inspect
   whether the optimizer moves it.

2. **Start from the SAS/C estimates.**  If legacy results are available,
   translate them using `hzr_argument_mapping()` and supply as starting values.

3. **Use multi-start optimization.**  The default `control$n_starts = 5`
   helps escape local optima, but complex models may benefit from more starts.

4. **Phase-specific covariates improve identifiability.**  Restricting each
   phase to clinically relevant predictors reduces the parameter count and
   sharpens the likelihood surface.


## Numerical stability

The decomposition engine applies several guards:

- Time is clamped above `.Machine$double.xmin` to avoid $0^{\text{negative}}$
- $1 - G(t)$ is clamped above `.Machine$double.xmin` before computing the
  hazard $h(t) = g(t)/(1 - G(t))$
- Left- and interval-censored contributions use `hzr_log1mexp()` to avoid
  $\log(0)$ when $H(t)$ is very small

Together these keep the gradients finite throughout the optimization, even in
regions of parameter space far from the optimum.


# Summary of Key Functions {#sec-api}

| Function | Purpose |
|----------|---------|
| `hzr_decompos(t, t_half, nu, m)` | Core decomposition: returns $G$, $g$, $h$ |
| `hzr_phase_cumhaz(t, ..., type)` | Phase cumulative hazard $\Phi(t)$ |
| `hzr_phase_hazard(t, ..., type)` | Phase instantaneous hazard $\varphi(t)$ |
| `hzr_phase(type, t_half, nu, m, formula)` | Construct a phase specification |
| `hazard(..., dist = "multiphase", phases = ...)` | Fit a multiphase model |
| `predict(fit, type, decompose = TRUE)` | Per-phase decomposed predictions |
| `hzr_argument_mapping()` | SAS/C $\to$ R parameter translation table |
| `hzr_log1mexp(x)` | Stable $\log(1 - e^{-x})$ |

For a worked clinical example, see `vignette("getting-started")`.  For
migration from SAS HAZARD, see `vignette("sas-to-r-migration")`.


# References {.unnumbered}

Blackstone EH, Naftel DC, Turner ME Jr. The decomposition of time-varying
hazard into phases, each incorporating a separate stream of concomitant
information. *J Am Stat Assoc.* 1986;81(395):615–624.
doi: [10.1080/01621459.1986.10478314](https://doi.org/10.1080/01621459.1986.10478314)

Rajeswaran J, Blackstone EH, Ehrlinger J, Li L, Ishwaran H, Parides MK.
Probability of atrial fibrillation after ablation: Using a parametric
nonlinear temporal decomposition mixed effects model. *Stat Methods Med Res.*
2018;27(1):126–141.
doi: [10.1177/0962280215623583](https://doi.org/10.1177/0962280215623583)