This vignette documents the implementation of the independent Normal-Gamma extension for Gaussian regression, focusing on the internal workflow of:
rIndepNormalGammaReg (non-_std) - the
top-level director of traffic for the independent Normal-Gamma sampling
path; andEnvelopeOrchestrator - which composes the
coefficient-envelope construction (EnvelopeBuild) with the
dispersion-aware refinement (EnvelopeDispersionBuild).The discussion then deep-dives into the accept-reject
sampler used after the envelope is constructed:
rIndepNormalGammaReg_std (serial) /
rIndepNormalGammaReg_std_parallel (parallel).
Theory justification and the underlying accept-reject / envelope construction are described in Chapter A07 (the independent Normal-Gamma extension theory) and in (Nygren and Nygren 2006) for the base likelihood subgradient approach. Implementation of independent Normal–Gamma regression is discussed in (Griffin and Brown 2010).
For the fixed-dispersion coefficient-envelope implementation details, see Chapter A05.
Prior construction and the Gaussian calibration of
shape, rate, dispersion, and
coefficient covariances returned by Prior_Setup() are
derived in Chapter A12 (https://knygren.r-universe.dev/articles/glmbayes/Chapter-A12.html).
rIndepNormalGammaReg
(non-_std) - Internal WorkflowThe entry point in C++ is src/rIndepNormalGammaReg.cpp,
function rIndepNormalGammaReg(...). At a high level, it
performs the following stages:
EnvelopeCentering() to obtain:
dispersion2 (the dispersion anchor used for the
envelope), andRSS_Post2 (the expected posterior weighted RSS used to
update the Gamma precision parameters).optim() (BFGS) to obtain:
bstar, andA1.glmb_Standardize_Model() to obtain standardized inputs for
envelope building: bstar2, A,
x2_std, mu2_std, P2_std, and
transform matrices for back-transformation.EnvelopeOrchestrator() (Step D):
EnvelopeBuild,EnvelopeDispersionBuild, andEnvelopeSort.rIndepNormalGammaReg_std (serial) orrIndepNormalGammaReg_std_parallel (parallel), depending
on use_parallel and n.EnvelopeCentering() for
dispersion2 and RSS_Post2rIndepNormalGammaReg calls:
centering_out = EnvelopeCentering(y, x, mu, P, offset, wt, shape, rate, Gridtype, verbose)
dispersion2 = centering_out$dispersion
RSS_Post2 = centering_out$RSS_postIn implementation terms, EnvelopeCentering(): -
initializes dispersion2 from a weighted residual variance
computed from lm.wfit, - iteratively refines
dispersion2 using a fixed-point style loop, and - updates
the Gamma parameters using the expected posterior weighted RSS.
The key output for the dispersion-aware refinement is
RSS_Post2, which drives the precision-rate update inside
EnvelopeOrchestrator.
bstar) and Hessian
(A1)After centering, rIndepNormalGammaReg performs posterior
mode optimization using R’s optim() with: - BFGS, -
supplied functions f2 and f3 (negative
log-posterior and its gradient), - hessian = TRUE to obtain
a Hessian estimate A1.
The mode bstar and curvature A1 are then
used to standardize the model into a form suitable for envelope
construction.
glmb_Standardize_Model)glmb_Standardize_Model() transforms the model so that
the prior mean is zero and the posterior curvature is represented in a
standardized coordinate system.
In implementation terms, the code extracts: - bstar2
(standardized mode), - A (standardized Hessian / precision
contribution), - x2_std, mu2_std,
P2_std (standardized design and prior components), -
L2Inv, L3Inv (matrices used later to
back-transform sampled coefficients).
For the full theoretical and implementation details of standardization, see Chapter A05.
EnvelopeOrchestrator() - envelope +
dispersion refinementrIndepNormalGammaReg delegates the construction of the
coefficient envelope and the dispersion-aware extension to
EnvelopeOrchestrator.
It passes: - standardized mode and curvature objects
(bstar2, A), - standardized design/prior terms
(x2_std, mu2_std, P2_std), -
offsets/weights (alpha, wt), - envelope grid
controls (n, Gridtype, n_envopt),
- Gamma prior hyperparameters (shape, rate), -
and centering outputs (RSS_Post2, plus bounds / truncation
controls).
The orchestrator returns: - Env (sorted envelope object
including the PLSD component), - gamma_list (tilt
parameters and dispersion bounds), - UB_list (the various
constants used by the sampler accept-reject test), - and additional
diagnostics including low / upp.
rIndepNormalGammaReg_std /
_parallelAfter envelope construction, the code selects: - serial sampler
rIndepNormalGammaReg_std(...), or - parallel sampler
rIndepNormalGammaReg_std_parallel(...), based on
use_parallel and n.
This is where the accept-reject loop actually executes and where
iters_out is recorded.
The standardized samples beta_out are mapped back to the
original parameterization using: - the inverse standardization
transforms (L2Inv, L3Inv), - and the prior
mean shift (mu).
The returned structure mirrors the expected R-level contract: -
samples (coefficients and dispersion), - plus runtime / iteration
diagnostics (iters_out, weight_out).
EnvelopeOrchestrator - Composition
of Envelope Stepssrc/EnvelopeOrchestrator.cpp provides a single
orchestrator that composes:
EnvelopeBuild() - fixed-dispersion coefficient
envelope, andEnvelopeDispersionBuild() - dispersion-aware refinement
for the independent Normal-Gamma extension, followed by:EnvelopeSort() on the R side.Inside EnvelopeOrchestrator, the Gamma posterior
parameters for the dispersion precision are computed from: - the prior
hyperparameters (shape, rate), - and the
centering output RSS_Post2.
The implementation computes: - shape2 = shape + n_w/2, -
rate3 = rate + RSS_Post2/2, - and the dispersion anchor:
d1_star = rate3 / (shape2 - 1).
This anchor is used to rescale weights
(wt2 = wt / d1_star) when building the coefficient
envelope.
EnvelopeBuild() call
(coefficient-envelope)The orchestrator calls EnvelopeBuild(...) in C++ with: -
standardized mode (bstar2), - standardized curvature /
precision object (A), - standardized design/prior pieces
(x2, mu2, P2), - offsets
(alpha) and weight scaling (wt2), -
family/link fixed to Gaussian/identity.
Implementation detail worth noting: the orchestrator uses internal policy around grid size and sorting to avoid redundant work when followed by the dispersion-aware build.
EnvelopeDispersionBuild() -
dispersion-aware extensionThis subsection focuses on EnvelopeDispersionBuild() as
the extension-specific computational workhorse for
dispersion.
RSS_Post2EnvelopeDispersionBuild receives: - the
coefficient-envelope object Env from
EnvelopeBuild, - prior hyperparameters (shape,
rate), - the standardized model pieces (P,
y, x2, alpha, mu,
wt), - the number of observations, - and critically: -
RSS_post = the centering output RSS_Post2, -
RSS_ML (currently NA in this path).
At a conceptual level, RSS_post determines where the
dispersion-axis Gamma proposal is anchored and how the dispersion
refinement changes the proposal shape/rate parameters.
low,
upp)The implementation uses: - max_disp_perc (default 0.99)
as a central truncation mass level, - and optional
disp_lower / disp_upper overrides, to produce
the final dispersion truncation interval: [low, upp] on the
dispersion scale.
This interval becomes the domain over which all envelope components must dominate.
The function computes an anchor dispersion (often the mode of the surrogate Gamma distribution on the dispersion scale) and uses it to evaluate: - face slopes / gradient-related quantities used in the dispersion-axis refinement.
This anchor is used to extrapolate dispersion-dependent face constants to both ends of the dispersion interval.
UB_listThe dispersion-aware machinery constructs the constants required by
the accept-reject sampler: - global residual-minimum quantity
RSS_Min, - per-face UB2min, - along with the
various log/linear constants and diagnostic maxima used in the
acceptance bound.
These populate UB_list, which is later consumed by
rIndepNormalGammaReg_std to compute: - UB1,
UB2, UB3A, UB3B, - and the final
accept/reject test statistic.
gamma_list)Finally, EnvelopeDispersionBuild updates the Gamma
proposal parameters through the dispersion-axis correction logic. It
returns: - shape3 and rate2-type parameters
for the precision proposal, - plus the resulting dispersion bounds
(disp_lower, disp_upper), and a collection of
diagnostics fields.
The main return object is a list with: - Env_out (the
updated envelope with dispersion-aware PLSD / mixture structure), -
UB_list (accept-reject constants and UB diagnostics), -
gamma_list (Gamma proposal tilt params), - and additional
diagnostics (e.g., anchor value, computed maxima, UB minima).
This entire structure is consumed by the sampler in
rIndepNormalGammaReg_std.
EnvelopeSort() (component reordering)EnvelopeOrchestrator then calls the R function
EnvelopeSort(...) to reorder components based on
probability mass in a way that improves simulation runtime.
The sorted envelope and associated UB constants are carried forward into the accept-reject simulation.
rIndepNormalGammaReg_std -
Serial Accept-Reject LoopAfter EnvelopeOrchestrator() returns the sorted envelope
together with the dispersion proposal parameters and the accept-reject
constants, rIndepNormalGammaReg_std generates iid posterior
draws using an envelope-based rejection sampler.
Conceptually, each accepted draw consists of two pieces:
beta proposed from an
envelope-restricted multivariate normal (face-wise), anddispersion (via a truncated
inverse-Gamma proposal).The accept-reject decision is then made by checking that a bound
constructed from UB_list and envelope-derived tangency
quantities dominates the target density.
Inv_f3_precompute_disp)Before the per-draw loop, the sampler builds a cache once:
This cache is reused inside the accept-reject loop so that repeated evaluations at dispersion-dependent tangency points avoid rebuilding expensive matrices.
For each draw index i, the sampler repeats a
while loop until the proposal is accepted:
J from the discrete distribution defined
by PLSD.beta on the
restricted interval
j = 1,...,p, sample from the
restricted normal using the envelope bounds loglt[J, ] and
logrt[J, ].-cbars(J, j).dispersion from the truncated proposal using:
rinvgamma_ct_safe(shape3, rate2, disp_upper, disp_lower).dispersion, recompute the
dispersion-adjusted tangency point(s) by calling:
Inv_f3_with_disp(cache, dispersion, cbars_small).LL_New2),
andbeta (stored as LL_Test).
The likelihood evaluations use weights adjusted as
wt2 = wt / dispersion.With the proposal (beta, dispersion) and the chosen face
J, the sampler computes the acceptance bound via four
non-negative surplus terms:
UB1: tangent-plane style surplus using
LL_New2 and the dispersion-adjusted anchoring
(cbars and tangency points).UB2: RSS-based surplus derived from
rss_face_at_disp(...) and the global/per-face RSS minima
RSS_Min and UB2min[J].UB3A: face-specific surplus term involving
lg_prob_factor[J] and the function
g1_face_at_disp(...).UB3B: dispersion-axis correction term involving the
log-tilt components and precomputed maxima.Implementation-wise, it forms:
test1 = LL_Test - UB1
test = test1 - (UB2 + UB3A + UB3B)
test = test - log_U2 (where U2 ~ Uniform(0,1))
accept if test >= 0
Equivalently, since log_U2 = log(U2):
accept iff log(U2) <= (LL_Test - UB1) - (UB2 + UB3A + UB3B).
The implementation also enforces sign constraints expected from a valid dominating envelope:
test1 <= 0,UB2 >= 0 (with extra tolerance/diagnostics when
UB2 is numerically tiny),UB3A >= 0 and UB3B >= 0.If any sign constraint is violated, the code throws a runtime error with context to help diagnose invalid envelope constants.
Once accepted:
The fields below are the ones from the envelope/UB structures that
directly drive the sampler: - PLSD: selects the face index
J; - loglt / logrt and
cbars: define the restricted normal proposal for
beta (and the face-centering); - shape3,
rate2, disp_lower, disp_upper:
define the dispersion proposal; - RSS_Min,
UB2min, lg_prob_factor, lmc1,
lmc2, and related precomputed maxima: define
UB1-UB3B in the acceptance test.
beta_out[i, ] stores the accepted coefficients,disp_out[i] stores the accepted dispersion draw,iters_out[i] records how many rejection attempts were
needed for that accepted draw.EnvelopeOrchestrator Outputs to the
SamplerThis section is a “developer dataflow map” connecting
EnvelopeOrchestrator / EnvelopeDispersionBuild
outputs to the serial accept-reject sampler.
EnvThe sampler reads from Env: - PLSD (mixture
component selection weights), - loglt / logrt
(restricted normal coordinate bounds), - logP /
LLconst and related constants used internally, -
cbars (face gradients used to center restricted
normals).
gamma_list /
UB_listThe sampler reads from:
gamma_list:
shape3 / rate2 parameters,disp_lower / disp_upper truncation
parameters on the dispersion proposal.UB_list:
RSS_Min,UB2min,UB3A and
UB3B,This is the key interface that ensures the dispersion-aware envelope dominates the target on the allowed truncation domain.
Key arguments affecting the internal workflow: - n
(number of iid draws) - affects how often sorting / envelope sizing is
amortized, - Gridtype and n_envopt - controls
envelope grid complexity, - use_parallel and
use_opencl - controls whether the sampling and/or envelope
evaluation is parallelized / accelerated, - max_disp_perc
and optional disp_lower / disp_upper -
controls truncation interval tightness.
iters_outThe vector iters_out captures the number of
accept-reject attempts per accepted draw. Large values indicate a loose
envelope (or expensive bounds), while values near 1 suggest a tight
envelope and efficient acceptance.
For runnable examples that exercise the envelope construction and
sampler path, see: - Chapter A07 (independent Normal-Gamma extension
theory), - Chapter A05 (fixed-dispersion envelope build), - and the
inst/examples/ directory for end-to-end usage patterns.
EnvelopeBuild) and standardization
(glmb_Standardize_Model).