Chapter 05: Foundations of GLMs – Families, Links, and Log-Concave Likelihoods

Kjell Nygren

2026-04-30

1. Conceptual Overview

Generalized linear models (GLMs) rest on two foundational ideas:
(1) exponential family likelihoods, which provide a unified mathematical structure for a wide range of data-generating processes, and
(2) link functions, which connect the mean of the response to a linear predictor.

This section reviews the core concepts behind exponential families, explains why canonical links play such an important role, and highlights why log-concavity is central to both classical and Bayesian estimation. Standard references for GLM structure include (McCullagh and Nelder 1989; Nelder and Wedderburn 1972); the S implementation of GLMs is described in (Hastie and Pregibon 1992). For Bayesian GLMs with normal priors, the envelope-based i.i.d. posterior sampling in glmbayes builds on (Nygren and Nygren 2006).

1.1 Exponential Families: A Unifying Framework

Many common statistical models belong to the exponential family, a class of distributions that can be written in the weighted form

\[ f(y \mid \theta, \phi, w) = \exp\left\{ \sum_{i=1}^{n} w_i \left[ \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right] \right\}. \]

Here:

  • y_i are the observations and w_i their prior weights;
  • θ_i is the canonical (natural) parameter for observation i;
  • b(θ_i) is the cumulant function, whose derivatives yield the mean and variance;
  • a(φ) is the dispersion function of the scale parameter φ;
  • c(y_i, φ) is a normalization term that does not involve θ_i.

This formulation includes the Gaussian, Poisson, Binomial, Gamma, and many others.
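For example, the Poisson distribution with mean μ_i (and w_i = 1) fits this template with canonical parameter θ_i = log μ_i:

\[ f(y_i \mid \mu_i) = \exp\left\{ y_i \log \mu_i - \mu_i - \log y_i! \right\}, \qquad b(\theta_i) = e^{\theta_i}, \quad a(\phi) = 1, \quad c(y_i, \phi) = -\log y_i!. \]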
The exponential-family form is not merely aesthetic: it guarantees several structural properties that GLMs rely on:

  • the mean satisfies μ_i = b′(θ_i), so the cumulant function determines the mean;
  • the variance satisfies Var(y_i) = a(φ) b″(θ_i) / w_i, giving each family its characteristic variance function;
  • the log-likelihood is concave in the canonical parameters θ_i;
  • the sufficient statistics are linear in the data, which keeps estimation tractable.

These properties make exponential-family models computationally stable and theoretically elegant, especially when combined with linear predictors.
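These variance functions are visible directly in R's family objects: the variance component of each family implements b″(θ) expressed as a function of the mean, as this short sketch shows.

# variance functions V(mu) of base-R family objects: Var(y) = a(phi) V(mu) / w
poisson()$variance(2)        # V(mu) = mu          -> 2
binomial()$variance(0.25)    # V(mu) = mu (1 - mu) -> 0.1875
Gamma()$variance(3)          # V(mu) = mu^2        -> 9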

1.3 Why Log-Concavity Matters

A function f is log-concave if log f is concave.
Most exponential-family likelihoods with canonical links are log-concave in the linear predictor, and hence in the coefficients β as well, because concavity is preserved under the affine map β ↦ Xβ.
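To make this concrete: with a canonical link the linear predictor equals the canonical parameter (η_i = θ_i = x_i′β), and differentiating the log-likelihood of Section 1.1 twice gives

\[ \nabla_{\beta}^{2}\, \ell(\beta) = -\, X^{\top} W X, \qquad W = \operatorname{diag}\!\left( \frac{w_i\, b''(\theta_i)}{a(\phi)} \right), \]

which is negative semidefinite because each b″(θ_i) is proportional to a variance; concavity of ℓ in β follows immediately.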

Log-concavity has several important implications:

(a) Existence of gradients and subgradients

For concave functions, the gradient exists almost everywhere; at any remaining points a supergradient exists (equivalently, the convex negative log-likelihood always admits a subgradient).
This is crucial for:

  • optimization algorithms
  • envelope construction
  • sampling methods based on tangencies
  • the likelihood-subgradient densities introduced in (Nygren and Nygren 2006)

Because GLM likelihoods are log-concave in many common cases, subgradient-based methods are guaranteed to be well defined in those settings.

(b) Any local maximum is a global maximum

Concavity implies:

  • no spurious local optima
  • stable convergence of Newton, Fisher scoring, and IRLS
  • predictable behavior even in high dimensions

This is one of the reasons GLMs are so widely used: the optimization landscape is benign.

(c) Validity of envelope construction methods

The envelope construction approach of (Nygren and Nygren 2006) relies on:

  • log-concavity of the likelihood
  • existence of subgradients for the negative log-likelihood
  • the ability to form tight tangent-based upper bounds

For GLMs with canonical links, these conditions are naturally satisfied.
This makes GLMs an ideal setting for likelihood-subgradient densities, mixture envelopes, and accept–reject sampling strategies.
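The key inequality behind these constructions is the tangent-plane bound: if the log-likelihood ℓ is concave and g is its gradient (or a supergradient) at a tangent point β̃, then

\[ \ell(\beta) \;\le\; \ell(\tilde{\beta}) + g^{\top}(\beta - \tilde{\beta}) \qquad \text{for all } \beta, \]

so the likelihood itself is dominated by the exponential of a linear function. Multiplying that bound by a normal prior produces a shifted normal density, which is what makes tangent-based envelopes easy to sample from; Section 5.2 sketches the resulting sampler.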

(d) Simplified Bayesian computation

Log-concave likelihoods interact especially well with:

  • normal priors
  • normal-gamma priors
  • Laplace approximations
  • adaptive rejection sampling
  • convex optimization methods
  • envelope-based accept–reject sampling

Posterior modes are unique, posterior tails behave predictably, and envelope-based samplers remain efficient even in moderate dimensions.


This conceptual foundation sets the stage for the rest of the chapter:

  • Section 2 contrasts classical and Bayesian GLM workflows.
  • Section 3 details the families and links supported in glmb.
  • Section 4 shows how to specify families and links in practice.
  • Section 5 returns to log-concavity and its role in estimation and sampling.

4. Bayesian GLMs with glmb()

The function glmb() is a Bayesian extension of the classical glm() function.
Its interface mirrors glm() as closely as possible: users specify a model using a formula, choose a likelihood family, and then supply a prior distribution through the pfamily argument.
This design preserves the familiar GLM workflow while enabling full Bayesian inference.

4.1 Relationship to Classical GLMs

The setup for glmb() follows the same structure as glm():

  • the model is specified through a formula;
  • the likelihood is chosen through a family object;
  • the remaining data-handling arguments mirror their glm() counterparts.

This compatibility ensures that standard generics—summary(), predict(), residuals(), extractAIC(), and others—work naturally with glmb objects.

4.2 The pfamily Argument: Specifying Priors

The key addition in glmb() is the required pfamily argument, which specifies the prior distribution for the regression coefficients.
The pfamily system parallels how glm() uses family: just as a family object bundles a likelihood with its link and variance function, a pfamily object bundles a prior's functional form with its parameters.

The default prior is a multivariate normal:

pfamily = dNormal(mu, Sigma)

The helper function Prior_Setup() constructs sensible defaults for mu and Sigma, using a reparameterized form of Zellner’s g-prior.
Users may also fully customize the prior.
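As a sketch of such customization (using the model from Section 4.4 and assuming only the Prior_Setup() interface shown there, which returns mu and Sigma components), one might weaken the default prior by inflating its covariance:

# illustrative only: scale up the default prior covariance to weaken the prior
ps <- Prior_Setup(counts ~ outcome + treatment,
                  family = poisson(link = "log"))
mu <- ps$mu
V  <- 4 * ps$Sigma     # doubles every prior standard deviation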

Supported prior families include the multivariate normal prior dNormal() shown above; normal–gamma style priors for models with an unknown dispersion parameter (see Section 1.3) are also part of the design, and the package documentation lists the full set.

4.3 Supported Likelihood Families

glmb() currently supports the most commonly used GLM families, including the Gaussian, binomial, Poisson, and Gamma families introduced in Section 1.1.

4.4 A Direct Illustration: Calling glmb() with Formulas, Families, and Priors

Just as a classical GLM is defined by a formula and a likelihood family:

glm(counts ~ outcome + treatment,
    family = poisson(link = "log"))

a Bayesian GLM adds one additional component: the prior family.
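In the calls above and below, counts, outcome, and treatment can be taken from the small three-group count-data example on R's ?glm help page (the name glmb.D93 used below follows the glm.D93 object defined there):

counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)   # three outcome levels
treatment <- gl(3, 3)      # three treatment groups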

A typical workflow uses Prior_Setup() to construct prior parameters:

ps <- Prior_Setup(counts ~ outcome + treatment,
                  family = poisson(link = "log"))
mu <- ps$mu
V  <- ps$Sigma

The corresponding Bayesian call mirrors the classical one:

glmb.D93 <- glmb(counts ~ outcome + treatment,
                 family  = poisson(link = "log"),
                 pfamily = dNormal(mu = mu, Sigma = V))

This call shows the three components glmb() receives:

  • a model formula (counts ~ outcome + treatment);
  • a likelihood family (poisson(link = "log"));
  • a prior family (dNormal(mu = mu, Sigma = V)).

The result is a set of independent posterior draws for coefficients, fitted values, linear predictors, and deviance, along with posterior summaries such as the posterior mode and DIC.

4.5 Posterior Sampling

For any supported combination of likelihood family, link, and prior family, glmb() generates independent draws from the posterior distribution—no MCMC chains are required.

By default, a single call returns a fixed number of independent posterior draws; because the draws are i.i.d., the sample size can be increased at will without any of the burn-in or mixing concerns that attend MCMC output.

4.6 Returned Object

A glmb object contains:

  • matrices of posterior draws for the coefficients, linear predictors, fitted values, and deviance;
  • posterior summaries such as the posterior mode and the DIC;
  • sampler diagnostics such as iters (see Section 5.2);
  • the components required by the inherited "glm" and "lm" methods.

Because glmb inherits from "glm" and "lm", most classical methods apply directly.
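For example, the generics listed in Section 4.1 can be applied directly to the fitted object from Section 4.4 (output shapes reflect the posterior sample rather than a single fit):

summary(glmb.D93)       # posterior summaries of the coefficients
predict(glmb.D93)       # linear predictors
residuals(glmb.D93)     # residuals
extractAIC(glmb.D93)    # classical-style model comparison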

5. Log‑Concavity, Envelopes, and Posterior Computation

The Bayesian methods implemented in glmb() rely on the exponential‑family structure described in Sections 1–3.
This section explains, at a high level, why the posterior distribution is well‑behaved for generalized linear models and how glmb() exploits this structure to generate independent posterior draws.

5.1 Why Log‑Concavity Matters

For the likelihood families supported by glmb(), the log‑likelihood is concave in the canonical parameter.
When combined with a log‑concave prior (such as the multivariate normal used by default), the posterior density is also log‑concave.
This ensures:

  • the posterior mode is unique;
  • mode finding by convex optimization is fast and reliable;
  • tangent-based envelopes are valid upper bounds on the posterior density;
  • the accept–reject sampler of Section 5.2 remains efficient.

Canonical links (e.g., logit for binomial, log for Poisson) preserve concavity and are therefore especially convenient.

5.2 Envelope Construction

For non‑Gaussian models, glmb() uses an accept–reject sampler based on likelihood‑subgradient envelopes.
The idea is to build a tight, convex upper bound on the negative log‑likelihood using tangent points.
This envelope:

  • dominates the posterior density everywhere, by concavity;
  • is straightforward to sample from directly;
  • yields exact, independent posterior draws once candidates are accepted.

The Gridtype and n_envopt arguments control how many tangent points are used, trading off envelope tightness against construction cost.
The component iters in the returned object reports how many candidate draws were generated before acceptance.
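The following is a minimal, self-contained sketch of the idea using a single tangent point, written in plain R. It is not the glmbayes implementation (which uses multiple tangent points and the Gridtype/n_envopt controls above; all variable names here are invented for illustration), but it shows why a tangent bound on a concave log-likelihood, combined with a normal prior, yields an exact sampler of independent posterior draws:

# Sketch: single-tangent likelihood-subgradient sampler for Poisson regression
# with a N(m, V) prior on the coefficients. Illustrative only.
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n))                       # design matrix
y <- rpois(n, exp(X %*% c(0.5, 0.3)))         # simulated Poisson counts

m    <- c(0, 0)                               # prior mean
V    <- diag(2)                               # prior covariance
Vinv <- solve(V)

loglik <- function(b) sum(y * (X %*% b) - exp(X %*% b))  # up to constants

# Stage 1: find the posterior mode (used here as the tangent point)
neg_post <- function(b)
  -loglik(b) + 0.5 * drop(t(b - m) %*% Vinv %*% (b - m))
b_tilde <- optim(m, neg_post)$par

# Tangent-plane bound: loglik(b) <= loglik(b_tilde) + g'(b - b_tilde),
# so exp(loglik) times the N(m, V) prior is dominated by a N(m + V g, V).
g      <- drop(t(X) %*% (y - exp(X %*% b_tilde)))        # gradient at b_tilde
mu_env <- drop(m + V %*% g)
R      <- chol(V)                             # V = t(R) %*% R

# Stage 2: accept-reject; accepted candidates are exact posterior draws.
# A single tangent gives a loose envelope (low acceptance); a grid of
# tangent points, as in glmbayes, tightens it considerably.
draws    <- matrix(NA_real_, 1000, 2)
accepted <- 0
while (accepted < nrow(draws)) {
  cand    <- mu_env + drop(t(R) %*% rnorm(2))
  log_acc <- loglik(cand) - (loglik(b_tilde) + sum(g * (cand - b_tilde)))
  if (log(runif(1)) < log_acc) {
    accepted <- accepted + 1
    draws[accepted, ] <- cand
  }
}
colMeans(draws)                               # posterior means of coefficients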

5.3 Posterior Computation Strategy

Posterior sampling in glmb() proceeds in two stages:

  1. Mode finding
    A classical GLM fit provides the starting point, and the posterior mode is obtained by maximizing the sum of the log-likelihood and the log-prior.

  2. Independent sampling

    • For Gaussian models with conjugate priors, draws come from closed‑form posterior distributions.
    • For all other supported families, the envelope‑based accept–reject sampler generates independent posterior draws.

Because the sampler produces independent draws, there are no chains, no burn‑in, and no convergence diagnostics.
This makes posterior summaries straightforward and computationally efficient.

5.4 Summary

The computational methods in glmb() leverage the structure of exponential‑family likelihoods and log‑concave priors to produce fast, reliable Bayesian inference for generalized linear models.
These methods ensure that the Bayesian extension behaves predictably across all supported families and links, while maintaining a workflow that closely parallels the classical glm() function.

References

Hastie, T. J., and D. Pregibon. 1992. “Generalized Linear Models.” Chap. 6 in Statistical Models in S, edited by J. M. Chambers and T. J. Hastie. Wadsworth & Brooks/Cole.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. 2nd ed. Chapman & Hall.
Nelder, J. A., and R. W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society. Series A (General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Nygren, K. N., and L. M. Nygren. 2006. “Likelihood Subgradient Densities.” Journal of the American Statistical Association 101 (475): 1144–56. https://doi.org/10.1198/016214506000000357.