Chapter A06: Accept–Reject Sampling for Dispersion in Gamma Regression

Kjell Nygren

2026-04-30

1. Introduction

This vignette explains the derivation of the posterior for the dispersion parameter in a Gamma GLM with log link, and the construction of a proposal distribution and bounding function used for rejection sampling (Robert and Casella 2004; Devroye 1986) in rGamma_reg.

We consider a Gamma GLM with log link, where the mean is

\[ \mu_i = \exp(x_i^\top \beta + \alpha_i), \]

and the dispersion parameter is \(\phi > 0\).
It is convenient to work with the precision

\[ v = \frac{1}{\phi}. \]

Throughout this chapter, we treat the regression coefficients \(\beta\) and offsets \(\alpha_i\) as fixed and known.

1.2 Likelihood

We use the standard shape-rate Gamma parameterization (McCullagh and Nelder 1989; Agresti 2015):

\[ Y_i \mid v \sim \mathrm{Gamma}(\text{shape}=v,\ \text{rate}=v/\mu_i), \]

so that

\[ \mathbb{E}(Y_i)=\mu_i, \qquad \mathrm{Var}(Y_i)=\frac{\mu_i^2}{v}. \]
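As a quick sanity check of this parameterization, the sketch below simulates responses and compares the sample moments with \(\mu\) and \(\mu^2/v\). It uses only Python's standard library; the helper `draw_response` and the values of `mu` and `v` are illustrative assumptions, not part of `rGamma_reg`.

```python
import random
import statistics

def draw_response(mu, v, rng):
    """Draw Y ~ Gamma(shape=v, rate=v/mu)."""
    # random.gammavariate takes (shape, scale), and scale = 1 / rate = mu / v.
    return rng.gammavariate(v, mu / v)

rng = random.Random(1)          # fixed seed so the run is repeatable
mu, v = 2.0, 4.0                # illustrative mean and precision
draws = [draw_response(mu, v, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
print("mean:", sample_mean, "theory:", mu)
print("var: ", sample_var, "theory:", mu ** 2 / v)
```

The only subtlety is the shape/scale versus shape/rate convention: a rate of \(v/\mu_i\) corresponds to a scale of \(\mu_i/v\).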

The log‑likelihood contribution of observation \(i\) is

\[ \ell_i(v) = v\log v - v\log \mu_i + (v-1)\log y_i - \log\Gamma(v) - \frac{v}{\mu_i} y_i. \]

With weights \(w_i\), the total log‑likelihood is

\[ \ell(v) = \sum_{i=1}^n w_i \left[ v\log v - v\log \mu_i + (v-1)\log y_i - \log\Gamma(v) - \frac{v}{\mu_i} y_i \right]. \]
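The weighted log-likelihood translates directly into code. The following is a minimal sketch, assuming plain Python lists for the data; the function name `loglik` and the small data set are hypothetical and chosen only for illustration.

```python
import math

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood as a function of the precision v."""
    total = 0.0
    for yi, mi, wi in zip(y, mu, w):
        total += wi * (v * math.log(v) - v * math.log(mi)
                       + (v - 1.0) * math.log(yi)
                       - math.lgamma(v) - v * yi / mi)
    return total

# Illustrative data: responses, fitted means, and unit weights.
y = [1.2, 0.8, 2.5, 1.9, 0.6]
mu = [1.0, 1.0, 2.0, 2.0, 1.0]
w = [1.0] * 5

for v in (1.0, 2.0, 4.0, 8.0):
    print(v, loglik(v, y, mu, w))
```

Because \(\beta\) and the offsets are treated as known, `mu` is a fixed input and the function depends on the single scalar `v`.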

1.3 Prior

We place a Gamma prior with shape \(a_0\) and rate \(b_0\) on the precision:

\[ v \sim \mathrm{Gamma}(a_0, b_0), \qquad \log p(v) = a_0 \log b_0 \;-\; \log\Gamma(a_0) \;+\; (a_0 - 1)\log v \;-\; b_0 v. \]

1.4 Posterior Log‑Density

The posterior log‑density is

\[ f(v) = \ell(v) + \log p(v) \;-\; \log (C_{1}) \]

where \(C_{1} = \int_0^\infty e^{\ell(v)}\,p(v)\,dv\) is the normalizing constant of the posterior density.
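To make the role of \(C_1\) concrete, the sketch below evaluates the unnormalized log-posterior and estimates \(\log C_1\) with a trapezoid rule on a finite grid. The grid limits, the function names, and the example data are assumptions made for illustration; the sampler itself never needs \(C_1\).

```python
import math

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v."""
    return sum(wi * (v * math.log(v) - v * math.log(mi)
                     + (v - 1.0) * math.log(yi)
                     - math.lgamma(v) - v * yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def log_prior(v, a0, b0):
    """Log-density of the Gamma(shape=a0, rate=b0) prior."""
    return (a0 * math.log(b0) - math.lgamma(a0)
            + (a0 - 1.0) * math.log(v) - b0 * v)

def log_norm_const(y, mu, w, a0, b0, lo=1e-6, hi=60.0, n=20000):
    """Trapezoid estimate of log C1 on [lo, hi] (assumed to hold nearly all mass)."""
    step = (hi - lo) / n
    logs = []
    for k in range(n + 1):
        v = lo + k * step
        logs.append(loglik(v, y, mu, w) + log_prior(v, a0, b0))
    peak = max(logs)                      # subtract the peak to avoid underflow
    total = sum(math.exp(t - peak) for t in logs)
    total -= 0.5 * (math.exp(logs[0] - peak) + math.exp(logs[-1] - peak))
    return peak + math.log(total * step)

def log_posterior(v, y, mu, w, a0, b0, log_c1):
    """Normalized posterior log-density f(v)."""
    return loglik(v, y, mu, w) + log_prior(v, a0, b0) - log_c1

y = [1.2, 0.8, 2.5, 1.9, 0.6]
mu = [1.0, 1.0, 2.0, 2.0, 1.0]
w = [1.0] * 5
a0, b0 = 2.0, 1.0
log_c1 = log_norm_const(y, mu, w, a0, b0)
print("log C1:", log_c1)
print("f(4):", log_posterior(4.0, y, mu, w, a0, b0, log_c1))
```

With no observations the posterior equals the prior, so the estimate of \(\log C_1\) should be close to zero; this is a convenient check on the quadrature.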

2. The sampler

To build an efficient envelope for the Gamma family with a Gamma prior on the precision, the proposal must be centered close to the center of the posterior. To that end, we introduce the following proposal distribution and bounding function, and establish the identity that links them to the posterior (Robert and Casella 2004).

2.1 Proposal Distribution

Let \(\bar{c}(v_{\text{tangent}})\) denote the derivative of \(-\ell(v)\) evaluated at a tangency point \(v_{\text{tangent}}\) (chosen in Section 2.4): \[ \bar{c}(v_{\text{tangent}}) = -\ell'(v_{\text{tangent}}), \qquad \ell'(v) = \sum_i w_i\left[\log v + 1 - \psi(v) + \log\frac{y_i}{\mu_i} - \frac{y_i}{\mu_i}\right], \] where \(\psi\) is the digamma function.

We then define a Gamma proposal distribution by

\[ v \sim \mathrm{Gamma}_{\text{prop}}\!\left(a_0 + \tfrac{1}{2}\sum_i w_i,\; b_0 + \bar{c}(v_{\text{tangent}})\right), \]

that is, with shape \(a_0 + \tfrac{1}{2}\sum_i w_i\) and rate \(b_0 + \bar{c}(v_{\text{tangent}}) = b_0 - \ell'(v_{\text{tangent}})\), which must be positive. The log of the proposal density takes the form

\[ \begin{aligned} \log p_{prop}(v) & = (a_0+ \tfrac{1}{2}\sum_i w_i )\log [ b_0 + \bar{c}(v_{\text{tangent}})] \;-\; \log\Gamma(a_0+ \tfrac{1}{2}\sum_i w_i ) \\[4pt] & \;+\; ((a_0+ \tfrac{1}{2}\sum_i w_i ) - 1)\log v \;-\; [ b_0 + \bar{c}(v_{\text{tangent}})] v. \end{aligned} \]

In the implementation, this proposal is used in truncated form on the interval \([v_{\min}, v_{\max}]\), with \(v_{\min}, v_{\max}\) derived from a curvature-based Gamma surrogate and optional dispersion bounds.
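The proposal parameters can be computed as in the sketch below. The rate is written directly as \(b_0 - \ell'(v_{\text{tangent}})\), which is the value the decomposition in Claim 1 requires. Python's standard library has no digamma function, so the sketch assumes a textbook recurrence plus asymptotic series; all function names, the data, and the choice `v_tangent = 4.0` are illustrative assumptions.

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x upward, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x          # psi(x) = psi(x + 1) - 1 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def dloglik(v, y, mu, w):
    """Derivative of the weighted log-likelihood with respect to v."""
    base = math.log(v) + 1.0 - digamma(v)
    return sum(wi * (base + math.log(yi / mi) - yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def proposal_params(v_tangent, y, mu, w, a0, b0):
    """Shape and rate of the Gamma proposal built at v_tangent."""
    shape = a0 + 0.5 * sum(w)
    rate = b0 - dloglik(v_tangent, y, mu, w)
    if rate <= 0.0:
        raise ValueError("proposal rate must be positive; move v_tangent")
    return shape, rate

y = [1.2, 0.8, 2.5, 1.9, 0.6]
mu = [1.0, 1.0, 2.0, 2.0, 1.0]
w = [1.0] * 5
a0, b0 = 2.0, 1.0
shape, rate = proposal_params(4.0, y, mu, w, a0, b0)
print("shape:", shape, "rate:", rate)
```

The explicit positivity check matters because a tangency point where \(\ell'\) exceeds \(b_0\) would not give a proper Gamma proposal.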

2.2 Bounding function

Let \(v_{\min} > 0\) be a lower bound for the precision \(v\). We can then define a bounding function \(\log h(\cdot)\) by

\[ \log h(v) = \ell(v) - \bigl[\ell(v_{\text{tangent}}) - \bar{c}(v_{\text{tangent}})(v - v_{\text{tangent}})\bigr] - \frac{1}{2}\Bigl(\sum_i w_i\Bigr)\log\!\left(\frac{v}{v_{\min}}\right). \]

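For nonnegative weights, \(\ell(\cdot)\) is concave because \(\ell''(v) = \sum_i w_i\,[1/v - \psi'(v)] \le 0\) (the trigamma function satisfies \(\psi'(v) > 1/v\) for \(v > 0\)). Hence \(\ell\) lies on or below its tangent line at \(v_{\text{tangent}}\), so the difference between \(\ell(v)\) and the bracketed tangent term is nonpositive. Since also \(\log(v/v_{\min}) \ge 0\) for \(v \ge v_{\min}\), it follows that \(\log h(v) \le 0\) for all \(v \ge v_{\min}\).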
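The bound can be checked numerically on a grid. In the sketch below, `c_bar` is passed in by the caller and is obtained here from a central finite difference of the log-likelihood, a numerical stand-in assumed only for this illustration; the data, `v_tangent`, and `v_min` are likewise illustrative.

```python
import math

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v."""
    return sum(wi * (v * math.log(v) - v * math.log(mi)
                     + (v - 1.0) * math.log(yi)
                     - math.lgamma(v) - v * yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def log_h(v, v_tangent, c_bar, v_min, y, mu, w):
    """Log bounding function: likelihood minus tangent line minus log-ratio term."""
    tangent = loglik(v_tangent, y, mu, w) - c_bar * (v - v_tangent)
    return loglik(v, y, mu, w) - tangent - 0.5 * sum(w) * math.log(v / v_min)

y = [1.2, 0.8, 2.5, 1.9, 0.6]
mu = [1.0, 1.0, 2.0, 2.0, 1.0]
w = [1.0] * 5
v_tangent, v_min = 4.0, 1.5

# c_bar = -ell'(v_tangent), approximated by a central difference.
eps = 1e-5
c_bar = -(loglik(v_tangent + eps, y, mu, w)
          - loglik(v_tangent - eps, y, mu, w)) / (2.0 * eps)

grid = [v_min + 0.05 * k for k in range(400)]
worst = max(log_h(v, v_tangent, c_bar, v_min, y, mu, w) for v in grid)
print("largest log h on the grid:", worst)
```

At the tangency point itself the first two terms cancel, so \(\log h(v_{\text{tangent}}) = -\tfrac{1}{2}\bigl(\sum_i w_i\bigr)\log(v_{\text{tangent}}/v_{\min})\); the acceptance probability there shrinks as \(v_{\min}\) moves away from \(v_{\text{tangent}}\).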

2.3 Equivalence Claim

Throughout this section, all densities are understood on the truncated domain \(v \in [v_{\min}, v_{\max}]\), with normalizing constants absorbed into \(C_1\) and \(C_2\).

Claim 1:

The posterior log-density \(f(\cdot)\) satisfies

\[ f(v)=\log p_{prop}(v)+\log h(v) + C_{2} \]

where \[\begin{aligned} C_{2} & = - \log (C_{1}) \\[4pt] & - \Bigl[(a_0+ \tfrac{1}{2}\sum_i w_i ) \log\!\left(b_0 + \bar{c}(v_{\text{tangent}})\right) \;-\; \log\Gamma(a_0+ \tfrac{1}{2}\sum_i w_i )\Bigr] \\[4pt] & + [a_0 \log b_0 \;-\; \log\Gamma(a_0) ] \\[4pt] & +[\ell(v_{\text{tangent}})+\bar{c}(v_{\text{tangent}})v_{\text{tangent}} ]\\[4pt] & - \Bigl[\tfrac{1}{2}\sum_i w_i\Bigr] \log v_{\min}. \end{aligned} \]

Proof:

Because \(\bar{c}(v_{\text{tangent}}) = -\ell'(v_{\text{tangent}})\), the tangent line inside \(\log h\) contributes \(-\bar{c}(v_{\text{tangent}})\,v\) to the terms linear in \(v\), so the proposal rate is \(b_0 + \bar{c}(v_{\text{tangent}}) = b_0 - \ell'(v_{\text{tangent}})\). Adding and subtracting \(\tfrac{1}{2}\sum_i w_i \log v - \bar{c}(v_{\text{tangent}})\,v\) and the constant terms gives

\[ \begin{aligned} f(v) & = \log p(v) +\ell(v) - \log (C_{1}) \\[4pt] & = [a_0 \log b_0 \;-\; \log\Gamma(a_0)] \;+\; (a_0 - 1)\log v \;-\; b_0 v \;+\; \ell(v) \;-\; \log (C_{1}) \\[4pt] & = [a_0 \log b_0 \;-\; \log\Gamma(a_0)] \;+\; ((a_0+ \tfrac{1}{2}\sum_i w_i ) - 1)\log v \;-\; [ b_0 + \bar{c}(v_{\text{tangent}})] v \\[4pt] & \quad + \Bigl(\ell(v) - \Bigl[\tfrac{1}{2}\sum_i w_i\Bigr] \log v + \bar{c}(v_{\text{tangent}}) v \Bigr) \;-\; \log (C_{1}) \\[4pt] & = \Bigl( [(a_0+ \tfrac{1}{2}\sum_i w_i )\log [ b_0 + \bar{c}(v_{\text{tangent}})] \;-\; \log\Gamma(a_0+ \tfrac{1}{2}\sum_i w_i )] \\[4pt] & \qquad + ((a_0+ \tfrac{1}{2}\sum_i w_i ) - 1)\log v \;-\; [ b_0 + \bar{c}(v_{\text{tangent}})]v \Bigr) \\[4pt] & \quad + \Bigl(\ell(v)-[\ell(v_{\text{tangent}})-\bar{c}(v_{\text{tangent}})(v - v_{\text{tangent}})] - \frac{1}{2}\Bigl(\sum_i w_i\Bigr)\log\!\left(\frac{v}{v_{\min}}\right) \Bigr) \\[4pt] & \quad + \Bigl( - \log (C_{1}) - [(a_0+ \tfrac{1}{2}\sum_i w_i )\log [ b_0 + \bar{c}(v_{\text{tangent}})] \;-\; \log\Gamma(a_0+ \tfrac{1}{2}\sum_i w_i )] \\[4pt] & \qquad + [a_0 \log b_0 \;-\; \log\Gamma(a_0) ] +[\ell(v_{\text{tangent}})+\bar{c}(v_{\text{tangent}})v_{\text{tangent}} ] - \Bigl[\tfrac{1}{2}\sum_i w_i\Bigr] \log v_{\min}\Bigr) \\[4pt] & = \log p_{prop}(v) + \log h(v) + C_{2}. \end{aligned} \]
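A numerical check is a quick way to confirm the sign conventions in this identity. The sketch below evaluates \(\ell(v) + \log p(v) - \log p_{prop}(v) - \log h(v)\) at several points; by Claim 1 this equals \(C_2 + \log C_1\) and so should not vary with \(v\). The proposal rate is taken as \(b_0 + \bar{c}(v_{\text{tangent}}) = b_0 - \ell'(v_{\text{tangent}})\). The data, the tangency point, and the finite-difference estimate of `c_bar` are assumptions made for illustration.

```python
import math

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v."""
    return sum(wi * (v * math.log(v) - v * math.log(mi)
                     + (v - 1.0) * math.log(yi)
                     - math.lgamma(v) - v * yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def log_gamma_pdf(v, shape, rate):
    """Log-density of Gamma(shape, rate); used for the prior and the proposal."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(v) - rate * v)

y = [1.2, 0.8, 2.5, 1.9, 0.6]
mu = [1.0, 1.0, 2.0, 2.0, 1.0]
w = [1.0] * 5
a0, b0 = 2.0, 1.0
v_tangent, v_min = 4.0, 1.5
half_w = 0.5 * sum(w)

eps = 1e-5
c_bar = -(loglik(v_tangent + eps, y, mu, w)
          - loglik(v_tangent - eps, y, mu, w)) / (2.0 * eps)
shape, rate = a0 + half_w, b0 + c_bar
ell_t = loglik(v_tangent, y, mu, w)

def log_h(v):
    tangent = ell_t - c_bar * (v - v_tangent)
    return loglik(v, y, mu, w) - tangent - half_w * math.log(v / v_min)

def gap(v):
    """Unnormalized log-posterior minus log proposal minus log h."""
    return (loglik(v, y, mu, w) + log_gamma_pdf(v, a0, b0)
            - log_gamma_pdf(v, shape, rate) - log_h(v))

# Closed form of C2 + log C1 from the claim.
c2_plus_log_c1 = (-(shape * math.log(rate) - math.lgamma(shape))
                  + (a0 * math.log(b0) - math.lgamma(a0))
                  + (ell_t + c_bar * v_tangent)
                  - half_w * math.log(v_min))

gaps = [gap(v) for v in (1.5, 2.0, 4.0, 7.5, 11.0)]
print(gaps)
print(c2_plus_log_c1)
```

The identity is purely algebraic, so it holds for any slope used consistently in both the proposal rate and the tangent line; only the bound \(\log h(v) \le 0\) requires the slope to be the true derivative at \(v_{\text{tangent}}\).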

2.4 Implementation Outline

To implement the sampler, we proceed as follows:

  1. Find the posterior mode \(v_\star\) using a Gamma‑surrogate fixed‑point iteration.
  2. Approximate the posterior mean \(v_{\text{mean}}\) using curvature at \(v_\star\) and a Gamma surrogate \(\mathrm{Gamma}(\bar{\alpha}, \bar{\beta})\).
  3. Set the tangency point to \[ v_{\text{tangent}} = v_{\text{mean}}. \]
  4. Build the proposal and bounding function using \(v_{\text{tangent}}\) as above.
  5. Sample as follows:
      1. Generate a candidate \(v_{\text{cand}}\) from the (possibly truncated) proposal \(\mathrm{Gamma}(\text{shape}_\text{prop}, \text{rate}_\text{prop})\).
      2. Draw \(u \sim \mathrm{Uniform}(0,1)\).
      3. Accept \(v_{\text{cand}}\) if \[ \log h(v_{\text{cand}}) - \log u > 0; \] otherwise reject it and return to the first sub-step.

The accepted draws for \(v\) are then transformed to dispersion by \(\phi = 1/v\).
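The sampling step can be sketched as follows. This is not the `rGamma_reg` implementation: the tangency point and the bounds \(v_{\min}, v_{\max}\) are taken as given inputs (steps 1 to 3 are not reproduced), truncation is handled by simply redrawing candidates that fall outside the interval, and the proposal rate is computed as \(b_0 - \ell'(v_{\text{tangent}})\). All names and numbers are illustrative assumptions.

```python
import math
import random

def digamma(x):
    """Digamma for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v."""
    return sum(wi * (v * math.log(v) - v * math.log(mi)
                     + (v - 1.0) * math.log(yi)
                     - math.lgamma(v) - v * yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def dloglik(v, y, mu, w):
    """Derivative of the weighted log-likelihood with respect to v."""
    base = math.log(v) + 1.0 - digamma(v)
    return sum(wi * (base + math.log(yi / mi) - yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def sample_precision(n_draws, y, mu, w, a0, b0, v_tangent, v_min, v_max, rng):
    """Accept-reject draws of v from the posterior restricted to [v_min, v_max]."""
    half_w = 0.5 * sum(w)
    slope = dloglik(v_tangent, y, mu, w)      # ell'(v_tangent) = -c_bar
    shape, rate = a0 + half_w, b0 - slope
    if rate <= 0.0:
        raise ValueError("proposal rate must be positive")
    ell_t = loglik(v_tangent, y, mu, w)
    draws = []
    while len(draws) < n_draws:
        cand = rng.gammavariate(shape, 1.0 / rate)   # (shape, scale)
        if not (v_min <= cand <= v_max):
            continue                                 # truncation by redraw
        tangent = ell_t + slope * (cand - v_tangent)
        log_h = loglik(cand, y, mu, w) - tangent - half_w * math.log(cand / v_min)
        u = 1.0 - rng.random()                       # in (0, 1], so log(u) is finite
        if log_h - math.log(u) > 0.0:
            draws.append(cand)
    return draws

y = [1.2, 0.8, 2.5, 1.9, 0.6]
mu = [1.0, 1.0, 2.0, 2.0, 1.0]
w = [1.0] * 5
a0, b0 = 2.0, 1.0
v_tangent, v_min, v_max = 4.0, 1.5, 12.0

draws = sample_precision(200, y, mu, w, a0, b0, v_tangent, v_min, v_max,
                         random.Random(42))
phi = [1.0 / v for v in draws]                       # dispersion draws
print("mean precision:", sum(draws) / len(draws))
print("mean dispersion:", sum(phi) / len(phi))
```

Redrawing out-of-range candidates is the simplest way to sample a truncated Gamma; an inverse-CDF approach would avoid wasted draws when the interval carries little proposal mass.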

References

Agresti, Alan. 2015. Foundations of Linear and Generalized Linear Models. Cambridge University Press.
Devroye, Luc. 1986. Non-Uniform Random Variate Generation. Springer.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. Chapman and Hall.
Robert, Christian P., and George Casella. 2004. Monte Carlo Statistical Methods. 2nd ed. Springer.