This vignette explains the derivation of the posterior for the
dispersion parameter in a Gamma GLM with log link, and the construction
of a proposal distribution and bounding function used for rejection
sampling (Robert and Casella 2004; Devroye 1986) in rGamma_reg.
We consider a Gamma GLM with log link, where the mean is
\[ \mu_i = \exp(x_i^\top \beta + \alpha_i), \]
and the dispersion parameter is \(\phi > 0\).
It is convenient to work with the precision
\[ v = \frac{1}{\phi}. \]
Throughout this vignette, we treat the regression coefficients \(\beta\) and offsets \(\alpha_i\) as fixed and known.
We use the standard Gamma parameterization (McCullagh and Nelder 1989; Agresti 2015):
\[ Y_i \mid v \sim \mathrm{Gamma}(\text{shape}=v,\ \text{rate}=v/\mu_i), \]
so that
\[ \mathbb{E}(Y_i)=\mu_i, \qquad \mathrm{Var}(Y_i)=\frac{\mu_i^2}{v}. \]
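As a short worked check, these moments follow from the usual shape and rate formulas for the Gamma distribution (mean equal to shape/rate, variance equal to shape/rate\(^2\)):
\[ \mathbb{E}(Y_i) = \frac{v}{v/\mu_i} = \mu_i, \qquad \mathrm{Var}(Y_i) = \frac{v}{(v/\mu_i)^2} = \frac{\mu_i^2}{v} = \phi\,\mu_i^2, \]
so \(\phi = 1/v\) plays the role of the usual GLM dispersion multiplying the variance function \(\mu_i^2\).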
The log‑likelihood contribution of observation \(i\) is
\[ \ell_i(v) = v\log v - v\log \mu_i + (v-1)\log y_i - \log\Gamma(v) - \frac{v}{\mu_i} y_i. \]
With weights \(w_i\), the total log‑likelihood is
\[ \ell(v) = \sum_{i=1}^n w_i \left[ v\log v - v\log \mu_i + (v-1)\log y_i - \log\Gamma(v) - \frac{v}{\mu_i} y_i \right]. \]
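The weighted log-likelihood above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `gamma_loglik` and `gamma_logpdf` and the toy data are assumptions made here, not part of rGamma_reg. The cross-check compares the expanded formula with a direct sum of Gamma log-densities with shape \(v\) and rate \(v/\mu_i\).

```python
import math

def gamma_loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v (shape=v, rate=v/mu_i)."""
    total = 0.0
    for yi, mi, wi in zip(y, mu, w):
        total += wi * (v * math.log(v) - v * math.log(mi)
                       + (v - 1.0) * math.log(yi)
                       - math.lgamma(v) - v * yi / mi)
    return total

def gamma_logpdf(y, shape, rate):
    """Log-density of Gamma(shape, rate) at y; used only as a cross-check."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(y) - rate * y)

# Hypothetical toy data (not taken from the vignette).
y = [0.8, 1.5, 2.3, 0.9]
mu = [1.0, 1.2, 2.0, 1.1]
w = [1.0, 1.0, 2.0, 0.5]
v = 3.0

# Direct evaluation: weighted sum of Gamma(shape=v, rate=v/mu_i) log-densities.
direct = sum(wi * gamma_logpdf(yi, v, v / mi) for yi, mi, wi in zip(y, mu, w))
print(abs(gamma_loglik(v, y, mu, w) - direct) < 1e-10)
```

The two evaluations should agree up to floating-point rounding, which is a quick way to confirm that the expanded form matches the stated parameterization.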
We place a Gamma prior on the precision:
\[ v \sim \mathrm{Gamma}(a_0, b_0), \qquad \log p(v) = a_0 \log b_0 \;-\; \log\Gamma(a_0) \;+\; (a_0 - 1)\log v \;-\; b_0 v. \]
The posterior log‑density is
\[ f(v) = \ell(v) + \log p(v) \;-\; \log (C_{1}) \]
where \(C_{1}\) is a normalizing constant for the posterior density.
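As a worked expansion, collecting the terms of \(f(v)\) by their dependence on \(v\) gives the following form. The shorthand \(W\) and \(S\) is introduced here only for illustration and is not used elsewhere in the vignette:
\[ f(v) = (a_0 - 1)\log v - b_0 v + W\bigl[v\log v - \log\Gamma(v)\bigr] - S\,v + \text{const}, \qquad W = \sum_{i=1}^n w_i, \quad S = \sum_{i=1}^n w_i\left[\frac{y_i}{\mu_i} - \log\frac{y_i}{\mu_i}\right], \]
where the constant collects \(a_0\log b_0 - \log\Gamma(a_0) - \sum_i w_i \log y_i - \log(C_1)\). The data therefore enter the posterior for \(v\) only through \(W\) and \(S\).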
To build an efficient envelope for the Gamma likelihood with a Gamma prior on the precision, the proposal should be centered close to the bulk of the posterior. To this end, we introduce a proposal distribution and an associated bounding function, and establish the decomposition on which the rejection sampler rests (Robert and Casella 2004).
Let \(v_{\text{tangent}} > 0\) be a chosen tangency point, and let \(\bar{c}(v_{\text{tangent}})\) denote the derivative of \(-\ell(v)\) evaluated at that point: \[ \bar{c}(v_{\text{tangent}}) = -\ell'(v_{\text{tangent}}). \]
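For reference, differentiating the weighted log-likelihood term by term gives a closed form for this quantity. Here \(\psi\) denotes the digamma function, a symbol introduced for this worked equation:
\[ \ell'(v) = \sum_{i=1}^n w_i\left[\log v + 1 - \psi(v) - \log\mu_i + \log y_i - \frac{y_i}{\mu_i}\right], \qquad \bar{c}(v_{\text{tangent}}) = \sum_{i=1}^n w_i\left[\psi(v_{\text{tangent}}) - \log v_{\text{tangent}} - 1 + \log\frac{\mu_i}{y_i} + \frac{y_i}{\mu_i}\right]. \]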
We then define a gamma proposal distribution by:
\[ v \sim \mathrm{Gamma}_{\text{prop}}\!\left(a_0 + \tfrac{1}{2}\sum_i w_i,\; b_0 + \bar{c}(v_{\text{tangent}})\right), \]
where the rate \(b_0 + \bar{c}(v_{\text{tangent}}) = b_0 - \ell'(v_{\text{tangent}})\) must be positive, and the log of the proposal density takes the form
\[ \begin{aligned} \log p_{\text{prop}}(v) & = \Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr)\log\bigl[b_0 + \bar{c}(v_{\text{tangent}})\bigr] \;-\; \log\Gamma\Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr) \\[4pt] & \;+\; \Bigl(a_0+ \tfrac{1}{2}\sum_i w_i - 1\Bigr)\log v \;-\; \bigl[b_0 + \bar{c}(v_{\text{tangent}})\bigr] v. \end{aligned} \]
In the implementation, this proposal is used in truncated form on the interval \([v_{\min}, v_{\max}]\), with \(v_{\min}, v_{\max}\) derived from a curvature-based Gamma surrogate and optional dispersion bounds.
Let \(v_{\min}\) be a lower bound for the precision \(v\). We can then define a bounding function \(\log h(\cdot)\) by
\[ \log h(v) = \ell(v) - \bigl[\ell(v_{\text{tangent}}) - \bar{c}(v_{\text{tangent}})(v - v_{\text{tangent}})\bigr] - \frac{1}{2}\Bigl(\sum_i w_i\Bigr)\log\!\left(\frac{v}{v_{\min}}\right). \]
The bracketed term is the tangent line to \(\ell\) at \(v_{\text{tangent}}\). For nonnegative weights, \(\ell\) is concave, since \(\ell''(v) = \sum_i w_i\,[1/v - \psi'(v)] \le 0\), where \(\psi'\) is the trigamma function and \(\psi'(v) > 1/v\) for \(v > 0\). It follows from the concavity of \(\ell(\cdot)\) and the fact that \(\log(v/v_{\min}) \ge 0\) for \(v \ge v_{\min}\) that \(\log h(v) \le 0\) for all \(v \ge v_{\min}\).
Throughout this section, all densities are understood on the truncated domain \(v \in [v_{\min}, v_{\max}]\), with normalizing constants absorbed into \(C_1\) and \(C_2\).
Claim 1:
The posterior log-density \(f(\cdot)\) satisfies
\[ f(v)=\log p_{\text{prop}}(v)+\log h(v) + C_{2}, \]
where
\[ \begin{aligned} C_{2} & = - \log (C_{1}) \\[4pt] & - \Bigl[\Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr) \log\bigl(b_0 + \bar{c}(v_{\text{tangent}})\bigr) \;-\; \log\Gamma\Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr)\Bigr] \\[4pt] & + \bigl[a_0 \log b_0 \;-\; \log\Gamma(a_0)\bigr] \\[4pt] & + \bigl[\ell(v_{\text{tangent}})+\bar{c}(v_{\text{tangent}})\,v_{\text{tangent}}\bigr] \\[4pt] & - \tfrac{1}{2}\Bigl(\sum_i w_i\Bigr) \log v_{\min}. \end{aligned} \]
Proof:
We add and subtract \(\tfrac{1}{2}\bigl(\sum_i w_i\bigr)\log v - \bar{c}(v_{\text{tangent}})\,v\), and then add and subtract the constants needed to complete the proposal log-density and the bounding function:
\[ \begin{aligned} f(v) & = \log p(v) +\ell(v) - \log (C_{1}) \\[4pt] & = \Bigl( \bigl[a_0 \log b_0 \;-\; \log\Gamma(a_0)\bigr] \;+\; (a_0 - 1)\log v \;-\; b_0 v \;+\; \tfrac{1}{2}\Bigl(\sum_i w_i\Bigr)\log v \;-\; \bar{c}(v_{\text{tangent}})\,v \Bigr) \\[4pt] & + \Bigl( \ell(v) \;-\; \tfrac{1}{2}\Bigl(\sum_i w_i\Bigr)\log v \;+\; \bar{c}(v_{\text{tangent}})\,v \Bigr) \;-\; \log (C_{1}) \\[4pt] & = \Bigl( \Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr)\log\bigl[b_0 + \bar{c}(v_{\text{tangent}})\bigr] \;-\; \log\Gamma\Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr) \\[4pt] & \quad + \Bigl(a_0+ \tfrac{1}{2}\sum_i w_i - 1\Bigr)\log v \;-\; \bigl[b_0 + \bar{c}(v_{\text{tangent}})\bigr] v \Bigr) \\[4pt] & + \Bigl( \ell(v) - \bigl[\ell(v_{\text{tangent}}) - \bar{c}(v_{\text{tangent}})(v - v_{\text{tangent}})\bigr] - \tfrac{1}{2}\Bigl(\sum_i w_i\Bigr)\log\!\left(\frac{v}{v_{\min}}\right) \Bigr) \\[4pt] & + \Bigl( - \log (C_{1}) - \Bigl[\Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr)\log\bigl[b_0 + \bar{c}(v_{\text{tangent}})\bigr] \;-\; \log\Gamma\Bigl(a_0+ \tfrac{1}{2}\sum_i w_i\Bigr)\Bigr] \\[4pt] & \quad + \bigl[a_0 \log b_0 \;-\; \log\Gamma(a_0)\bigr] + \bigl[\ell(v_{\text{tangent}})+\bar{c}(v_{\text{tangent}})\,v_{\text{tangent}}\bigr] - \tfrac{1}{2}\Bigl(\sum_i w_i\Bigr) \log v_{\min} \Bigr) \\[4pt] & = \log p_{\text{prop}}(v) + \log h(v) + C_{2}. \end{aligned} \]
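The decomposition in Claim 1 can be checked numerically. The sketch below is not the rGamma_reg implementation: the helper names, the toy data, and the hand-picked values of `v_tan`, `v_min`, and `v_max` are all assumptions for illustration. To sidestep sign conventions, the proposal rate is written directly as \(b_0 - \ell'(v_{\text{tangent}})\), the value for which the terms linear in \(v\) cancel. If the claim holds, \(f(v) + \log(C_1) - \log p_{\text{prop}}(v) - \log h(v)\) is the same constant at every grid point, and \(\log h\) never exceeds zero on \([v_{\min}, v_{\max}]\).

```python
import math

def digamma(x):
    """Digamma via upward recurrence plus an asymptotic series (illustrative helper)."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - series

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v."""
    return sum(wi * (v * math.log(v) - v * math.log(mi) + (v - 1.0) * math.log(yi)
                     - math.lgamma(v) - v * yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def loglik_grad(v, y, mu, w):
    """Derivative of the weighted log-likelihood with respect to v."""
    return sum(wi * (math.log(v) + 1.0 - digamma(v)
                     - math.log(mi) + math.log(yi) - yi / mi)
               for yi, mi, wi in zip(y, mu, w))

# Hypothetical toy inputs (not from the vignette).
y = [0.8, 1.5, 2.3, 0.9, 1.7]
mu = [1.0, 1.2, 2.0, 1.1, 1.5]
w = [1.0] * 5
a0, b0 = 2.0, 1.0
v_tan, v_min, v_max = 4.0, 1.5, 30.0   # chosen by hand for this sketch

slope = loglik_grad(v_tan, y, mu, w)   # l'(v_tan)
shape = a0 + 0.5 * sum(w)
rate = b0 - slope                      # proposal rate; must be positive
ell_tan = loglik(v_tan, y, mu, w)

def log_prior(v):
    return a0 * math.log(b0) - math.lgamma(a0) + (a0 - 1.0) * math.log(v) - b0 * v

def log_prop(v):
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(v) - rate * v)

def log_h(v):
    tangent = ell_tan + slope * (v - v_tan)   # tangent line to l at v_tan
    return loglik(v, y, mu, w) - tangent - 0.5 * sum(w) * math.log(v / v_min)

grid = [v_min + k * (v_max - v_min) / 200 for k in range(201)]
# Unnormalized posterior minus proposal minus bound: should be constant in v.
diffs = [log_prior(v) + loglik(v, y, mu, w) - log_prop(v) - log_h(v) for v in grid]
spread = max(diffs) - min(diffs)
max_log_h = max(log_h(v) for v in grid)
print(spread < 1e-8, max_log_h <= 0.0)
```

A near-zero spread confirms that the \(v\)-dependent terms cancel as in the proof, and a nonpositive maximum of \(\log h\) confirms the bound on the grid.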
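To implement the sampler, we draw a candidate \(v\) from the truncated proposal and accept it with probability \(h(v)\), repeating until the required number of draws is obtained.
The accepted draws for \(v\) are then transformed to dispersion by \(\phi = 1/v\).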
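A minimal rejection-sampling sketch along these lines is shown below. It is an illustration under stated assumptions rather than the rGamma_reg code: the function `sample_precision`, the helper functions, and the toy inputs are hypothetical, the tangency point and truncation bounds are fixed by hand rather than derived from a curvature-based surrogate, and truncation is handled by simply discarding out-of-range candidates. The proposal rate is written as \(b_0 - \ell'(v_{\text{tangent}})\).

```python
import math
import random

def digamma(x):
    """Digamma via upward recurrence plus an asymptotic series (illustrative helper)."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - series

def loglik(v, y, mu, w):
    """Weighted Gamma log-likelihood in the precision v."""
    return sum(wi * (v * math.log(v) - v * math.log(mi) + (v - 1.0) * math.log(yi)
                     - math.lgamma(v) - v * yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def loglik_grad(v, y, mu, w):
    """Derivative of the weighted log-likelihood with respect to v."""
    return sum(wi * (math.log(v) + 1.0 - digamma(v)
                     - math.log(mi) + math.log(yi) - yi / mi)
               for yi, mi, wi in zip(y, mu, w))

def sample_precision(n, y, mu, w, a0, b0, v_tan, v_min, v_max, rng,
                     max_tries=1_000_000):
    """Draw n precisions from the posterior truncated to [v_min, v_max]."""
    total_w = sum(w)
    slope = loglik_grad(v_tan, y, mu, w)      # l'(v_tan)
    shape = a0 + 0.5 * total_w
    rate = b0 - slope
    if rate <= 0.0:
        raise ValueError("proposal rate must be positive; try a larger v_tan")
    ell_tan = loglik(v_tan, y, mu, w)
    draws, tries = [], 0
    while len(draws) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("acceptance rate too low for this sketch")
        v = rng.gammavariate(shape, 1.0 / rate)   # second argument is the scale
        if not (v_min <= v <= v_max):
            continue                              # enforce the truncation
        log_h = (loglik(v, y, mu, w) - (ell_tan + slope * (v - v_tan))
                 - 0.5 * total_w * math.log(v / v_min))
        # Accept with probability h(v); 1 - random() lies in (0, 1], so log is finite.
        if math.log(1.0 - rng.random()) <= log_h:
            draws.append(v)
    return draws

# Hypothetical toy inputs (not from the vignette).
y = [0.8, 1.5, 2.3, 0.9, 1.7]
mu = [1.0, 1.2, 2.0, 1.1, 1.5]
w = [1.0] * 5
rng = random.Random(0)
v_draws = sample_precision(200, y, mu, w, a0=2.0, b0=1.0,
                           v_tan=4.0, v_min=1.5, v_max=30.0, rng=rng)
phi_draws = [1.0 / v for v in v_draws]            # transform precision to dispersion
print(len(phi_draws), sum(v_draws) / len(v_draws))
```

Because \(\log h\) includes the term \(-\tfrac{1}{2}\sum_i w_i \log(v/v_{\min})\), the acceptance rate in this sketch depends strongly on how close \(v_{\min}\) sits to the bulk of the posterior, which is one reason the choice of bounds matters in practice.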