This release collects all changes since the last CRAN version (0.7.4), including the previously GitHub-only 0.7.5 and 0.7.6 development versions.
- make_newdata() output no longer contains internal PED columns
(tstart, intlen, interval, offset, ped_status). Output now contains
tend + id + user covariates (plus cause/transition for
competing risks / multi-state models). ped_info() output is
unchanged. intlen is reconstructed on demand by downstream
functions (add_cumu_hazard(), add_surv_prob(), add_cif(),
add_trans_prob()) and dropped from user-facing output.
- add_cif() now uses the exact closed-form integral of
the cumulative incidence function under piecewise-exponential hazards
instead of the previous left-Riemann approximation. CIF estimates from
existing user code change numerically; results are now invariant to the
time grid passed to make_newdata().
- Simulation-based confidence intervals (ci_type = "sim")
now use type-6 empirical quantiles instead of the
stats::quantile() default (type 7). Type-7 quantiles made
these intervals systematically too narrow for small nsim
(at the default nsim = 100 the enclosed central mass is
~93% rather than the nominal 95%); type-6 removes this inward bias
(#288). As a result, all simulation-based CI bounds change slightly
(intervals widen at both ends) relative to versions <= 0.7.4.
- scam::scam() (#286): the post-processing workflow
(add_hazard(), add_cumu_hazard(),
add_surv_prob(), add_term(),
add_cif(), add_trans_prob(),
get_cumu_coef(), get_cumu_eff(),
tidy_fixed(), tidy_smooth(),
gg_smooth(), …) now works for scam fits
exactly as for gam fits, including delta-method and
simulation-based confidence intervals. The calculations correctly use
the re-parametrized coefficients ($coefficients.t) and
their covariance ($Vp.t). pamm() gained
engine = "scam". See the new “Shape-constrained effects
(scam)” article.
- Interval-censored outcomes specified as Surv(L, R, type = "interval2") are detected automatically
by as_ped(). The new pamm_ic() (single event)
and pamm_ic_cr() (competing risks) fit a PAMM by repeatedly
drawing exact event times from the model-based conditional hazard
distribution on (L, R] and re-fitting the standard
right-censored pipeline. Inference pools the imputations:
add_hazard(), add_cumu_hazard(),
add_surv_prob() and add_cif() gain
pamm_ic methods that combine per-imputation posterior draws
(within- plus between-imputation variance). The iter
argument enables chained (refit-and-reimpute) imputation, recommended
for sparsely inspected data. add_inspections() turns exact
simulated times (e.g. from sim_pexp()) into
interval-censored panel data for testing and coverage studies.
print()/summary() of a pamm_ic
report the pooled (Rubin-combined) fit. See the new
“Interval-Censored Data” vignette.
- add_*() quantities and their
delta-method and simulation-based CIs are derived from two internal S3
primitives, get_hazard() and sim_hazard(), so
an alternative estimation backend only needs to provide methods for
those two. The new “Defining a new backend: gradient boosting with
xgboost” vignette demonstrates this end-to-end (and the Bayesian
vignette was reworked to use the same unified interface).
- gg_state_occupation() is now exported.
- gg_smooth() is now fully general across univariate
smooth terms: a bare variable name selects every 1d smooth over that
variable (main effect plus any by-variable or factor-smooth
interaction term), terms is optional (defaulting to all
univariate smooths), and 1d ti() as well as factor-smooth
interactions (bs = "fs", bs = "sz") are
supported. Factor-indexed smooths are drawn in a single facet with one
curve per factor level, identified by a new level column in
the get_terms() output. Random-effect smooths
(bs = "re", bs = "mrf") and
multivariate/tensor smooths are excluded (use gg_re() /
gg_tensor()).
- add_cif() now supports arbitrary time points in
make_newdata() (parity with
add_cumu_hazard()); missing breakpoints are inserted
internally so CIF estimates are independent of the chosen prediction
grid.
- add_surv_prob(), add_cif(),
add_trans_prob() and add_cumu_hazard() now
include plotting boundary rows at tend = 0 (or the selected
time_var). Boundary values are set to their known limits,
S(0) = 1, CIF(0) = 0, off-diagonal transition
probabilities P_rs(0) = 0, and
cumu_hazard = 0, with collapsed confidence-interval bounds
when requested. Boundary rows are added only for continuous-time models
(gam/scam/pamm).
- add_cif() /
add_trans_ci()); single-group results are unchanged.
- get_trans_prob() now supports non-integer (categorical)
state labels (e.g. "healthy->ill") in addition to
integer-coded transitions.
- expand_df() preserves the cause column
when make_newdata() is called with only tend
and cause, fixing a competing-risks edge case.
- predictSurvProb.pamm() now respects non-default
id column names and works when trafo_args are
not attached to the fitted object.
- add_trans_prob() and add_trans_ci() no
longer require the input data to be pre-sorted (#255, related to
#227).
- add_trans_prob() / add_trans_ci() /
get_trans_prob() now consistently thread
time_var and interval_length, fixing argument
forwarding for nonstandard column names.
- add_counterfactual_transitions() now fully honors
from_col, to_col, and
transition_col.
- gg_smooth() / get_terms() now select
smooth terms via the model’s mgcv smooth metadata instead
of unanchored grep(), fixing two errors reported in #283
(variable names matched by several smooths, and factor terms). Names
that match no smooth are skipped with a warning rather than
erroring.
- Added .glm/.pamm methods for the
internal warn_about_new_time_points() generic (previously
“no applicable method”).
- pamm_ic adders now warn when given under-grouped
newdata.
- trafo_args argument of pamm() is
deprecated; convert data with as_ped() before calling
pamm().
- id to global variables for dplyr compatibility (#260)
- add_trans_prob: better documentation, proper examples, attribute
attachment, and base R speedup
- pamm() when data does not contain an offset column
- broom to Suggests
- add_trans_prob help page with proper parameter descriptions and
working example
- geom_stepribbon
- methods from pamm. Can be specified via .... Fixes #200
- warn_about_new_time_points when original data not stored in model
object. Fixes #203
- split_data function that now accepts
Surv(start, stop, event) type inputs, e.g., to construct
left-truncated data.
- as_ped.ped now also works for transformations with
time-dependent covariates
- pamm,
which is a thin wrapper around mgcv::gam with some
arguments pre-set.
- predictSurvProb.pamm
- pec
- as_ped changed. The vertical bar | is no longer necessary to
indicate concurrent or cumulative effects
- Functions get_hazard and add_hazard also gain a reference argument,
which allows calculating (log-)hazard ratios.
- Introduces breaking changes to the add_term function.
The argument relative is replaced by reference, which
makes the calculation of relative (log-)hazards, i.e. hazard ratios, more
flexible. The argument se.fit is replaced by
ci.
- dplyr reverse dependency (see #101)
- make_newdata
- concurrent now has a lag = 0 argument, can be set to positive
integer values
- as_ped accepts multiple concurrent specials with different lag
specifications
- make-newdata.fped
- gg_laglead and gg_partial_ll did not calculate the lag-lead-window
correctly when applied to ped data
- make_newdata loses arguments expand and
n and gains ... where arbitrary covariate
specifications can be placed,
e.g. age=seq_range(age, n=20). Multiple such expressions can
be provided, and a data frame with one row for each combination of the
evaluated expressions will be returned. All variables not specified
will be set to their respective mean or modus values. For data of class
ped or fped, make_newdata will try
to specify time-dependent variables intelligently.
- te_var argument in concurrent and
cumulative was renamed to tz_var
- te arguments have been replaced by tz
(time points at which z was observed) in all functions to
avoid confusion with mgcv::te (e.g.,
gg_laglead)
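
As a rough illustration of the make_newdata interface described above, the following sketch builds a prediction grid from a plain data frame. It assumes that pammtools is installed and that the tumor example data set shipped with the package is available; the covariates age and sex are chosen only for illustration.

```r
library(pammtools)

# tumor: example data set shipped with pammtools (assumed to be available)
data("tumor", package = "pammtools")

# One row per combination of the evaluated expressions:
# 20 equidistant age values crossed with the distinct values of sex.
# Columns not mentioned here are filled in by make_newdata()
# (mean or modus values, as described in the entry above).
nd <- make_newdata(
  tumor,
  age = seq_range(age, n = 20),
  sex = unique(sex)
)

nrow(nd)  # number of age values times number of distinct sex values
head(nd)
```

The same call pattern should apply to objects of class ped or fped, where time-dependent variables are additionally handled by make_newdata itself.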
- Overall better support for cumulative effects
- Added convenience functions for work with cumulative effects, namely
gg_partial and gg_slice
- Added helper functions to calculate and visualize lag-lead windows:
get_laglead and gg_laglead
- Added convenience geoms for piece-wise constant
hazards (see examples in ?geom_hazard), cumulative hazards
and survival probabilities (usually
aes(x = time, y = surv_prob), but the data set doesn’t contain
an extra row for time = 0), thus
  - geom_stephazard adds row (x = 0, y = y[1]) to the data
    before plotting
  - geom_hazard adds row (x = 0, y = 0) before plotting
    (can also be used for cumulative hazard)
  - geom_surv adds row (x = 0, y = 1) before plotting
- All data transformation is now handled using as_ped
(see the data
transformation vignette)
- Data transformation now handles
- Added functionality to flexibly simulate data from PEXP including
cumulative effects, see ?sim_pexp
- Added functionality to calculate Aalen-model style cumulative
coefficients, see ?cumulative_coefficient
- Breaking change in split_data (as_ped
now main data trafo function):
  - max.end argument
  - max_time argument to introduce administrative
    censoring at max_time when no custom interval split points
    are provided
- tidyeval adaptations
- tidyeval
- pamm package to pammtools due to
naming conflicts with PAMM package on CRAN