The glmb() family of functions provides fully Bayesian analogues to glm() and its standard methods (McCullagh and Nelder 1989; Venables and Ripley 2002). This vignette demonstrates how to set up and call glmb(), compares its syntax and output to glm(), and highlights the additional arguments required for prior specification. A specialized lmb() function for Bayesian linear regression is bundled with the package and is explored in Chapter 2. For a Bayesian perspective on regression, see Gelman et al. (2013). Separate vignettes cover the complementary methods for both lmb() and glmb().
We begin by reproducing the randomized controlled trial example of Dobson (1990), which is the standard demonstration for glm(). Comparing glmb() output on this familiar dataset makes the parallels clear. The data frame contains treatment groups, outcomes, and observed counts:
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
#> treatment outcome counts
#> 1 1 1 18
#> 2 1 2 17
#> 3 1 3 15
#> 4 2 1 20
#> 5 2 2 10
#> 6 2 3 20
#> 7 3 1 25
#> 8 3 2 13
#> 9 3 3 12

The example code for the glm() function specifies a Poisson regression model for these data. The code chunks below show how the glmb() function can be used in much the same fashion, but with some extra requirements. In particular, we need to provide a prior distribution (here a multivariate normal prior) and the associated constants (here a prior mean vector and a prior variance-covariance matrix) for the coefficients of interest. Optionally, we can also tell the function how many draws to make from the posterior distribution (the default used here generates 1000 iid draws).
The base glm() call takes two main arguments: a model formula and a family specifying the error distribution and link function.
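The classical Poisson fit used for comparison throughout this vignette (the same call echoed later by print(glm.D93)) can be reproduced as follows, using the counts, outcome, and treatment objects created above:

```r
# Poisson regression by maximum likelihood; the model formula and the
# family are the two main arguments
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
```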
We cover families, link functions, and their glmb() implementations in detail in another vignette.
Before invoking glmb(), we must define a prior. The package supplies a helper, Prior_Setup(), which returns appropriately sized default prior parameters and variable names.
Use Prior_Setup() to generate default prior hyperparameters and extract the design matrix variable names:
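A minimal sketch of that step is shown below. The exact structure of the object returned by Prior_Setup() (here assumed to be a list with components mu and Sigma) should be checked against the package documentation:

```r
# Generate default prior hyperparameters sized to the design matrix
# (the component names mu and Sigma are an assumption in this sketch)
ps <- Prior_Setup(counts ~ outcome + treatment)
mu <- ps$mu      # prior mean vector, one entry per coefficient
V  <- ps$Sigma   # prior variance-covariance matrix
```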
With the prior defined, the glmb() call mirrors glm(), augmented by the pfamily argument:
# Step 3: Call the glmb function
glmb.D93 <- glmb(counts ~ outcome + treatment, family = poisson(), pfamily = dNormal(mu = mu, Sigma = V))

Because Prior_Setup() aligns the dimensions of mu and Sigma with the design matrix, you can safely modify them to impose custom shrinkage on selected coefficients.
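As an illustration of such an adjustment, the sketch below tightens the prior variances on the two treatment effects before refitting. It assumes mu and V came from Prior_Setup() as above and that V carries the coefficient names as dimnames:

```r
# Shrink the treatment effects toward their prior mean of 0 by
# reducing their prior variances (assumes named rows/columns in V)
V["treatment2", "treatment2"] <- 0.25
V["treatment3", "treatment3"] <- 0.25

glmb.D93.shrunk <- glmb(counts ~ outcome + treatment,
                        family = poisson(),
                        pfamily = dNormal(mu = mu, Sigma = V))
```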
Both glm() and glmb() provide concise print() methods. Below we compare their default printed summaries.
## Printed view of the output from the glm function
print(glm.D93)
#>
#> Call: glm(formula = counts ~ outcome + treatment, family = poisson())
#>
#> Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.045e+00 -4.543e-01 -2.930e-01 1.218e-15 8.438e-16
#>
#> Degrees of Freedom: 8 Total (i.e. Null); 4 Residual
#> Null Deviance: 10.58
#> Residual Deviance: 5.129 AIC: 56.76

The Coefficients table reports the maximum‐likelihood estimates for
the intercept and each predictor on the log‐scale, summarizing how each
covariate shifts the expected log‐count from its baseline. Later
vignettes will show how these MLEs compare to the Bayesian posterior
means produced by glmb().
The Degrees of Freedom line gives the total df for the null (intercept‐only) model and the residual df for the full model (observations minus estimated parameters). These dfs underpin chi‐square tests on deviance reductions and the correct calibration of standard errors.
Residual deviance quantifies the lack of fit of the fitted model relative to a perfect (saturated) model, with smaller values indicating better fit. In later sections we’ll use both null and residual deviance to assess model adequacy and perform formal likelihood‐ratio tests.
The Akaike Information Criterion (AIC) combines model fit with a penalty for complexity; for a GLM it equals minus twice the maximized log‐likelihood plus twice the number of estimated parameters. Models with lower AIC are preferred, guiding model selection in both classical GLM workflows and their Bayesian analogues.
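That identity can be verified directly on the fitted object from above:

```r
# Reconstruct AIC by hand from the log-likelihood of glm.D93
ll <- logLik(glm.D93)                # maximized log-likelihood
p  <- attr(ll, "df")                 # number of estimated parameters (5)
aic_by_hand <- -2 * as.numeric(ll) + 2 * p
all.equal(aic_by_hand, AIC(glm.D93)) # TRUE
```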
## Printed view of the output from the glmb function
print(glmb.D93)
#>
#> Call: glmb(formula = counts ~ outcome + treatment, family = poisson(),
#> pfamily = dNormal(mu = mu, Sigma = V))
#>
#> Posterior Mean Coefficients:
#> (Intercept) outcome2 outcome3 treatment2 treatment3
#> 3.021937 -0.464503 -0.296493 0.014353 0.006985
#>
#> Effective Number of Parameters: 5.242052
#> Expected Residual Deviance: 10.43431
#> DIC: 57.30854

The Posterior Mean Coefficients table reports the Bayesian point estimates for each parameter under the specified prior, summarizing how each covariate shifts the expected log‐count after combining data and prior information. Later vignettes will explore how these posterior means compare to classical MLEs and how prior choice induces shrinkage.
The Effective Number of Parameters quantifies model complexity by measuring how much the posterior distribution adapts to the data, often shrinking the apparent degrees of freedom under informative priors. We’ll contrast this with the nominal parameter count and show its role in penalizing overfitting.
Expected Residual Deviance is the posterior mean of the deviance, reflecting the model’s average lack of fit under its own predictive distribution. In subsequent vignettes we’ll use this quantity for Bayesian goodness‐of‐fit checks and compare it to classical residual deviance.
The Deviance Information Criterion (DIC) combines expected residual deviance with a penalty for effective complexity, balancing fit and parsimony in a single metric. Models with lower DIC are preferred, and later sections will delve into its computation and limitations.
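Written out, with \(\bar D\) the posterior mean of the deviance and \(D(\bar\theta)\) the deviance evaluated at the posterior mean of the parameters, the two quantities printed above combine as:

\[
p_D = \bar{D} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \bar{D} + p_D = D(\bar{\theta}) + 2\,p_D
\]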
The section references in the table below follow Agresti (2015).
| Vignette Section | Textbook Chapter & Section | Notes (provisional) |
|---|---|---|
| 3.1.1 Calling the glm function | 7.1 Poisson GLMs for Counts and Rates | Illustrates the canonical log-link for count data, Example 7.1 on log-rate modeling and interpretation via rate ratios. |
| 3.2.1 Setting the prior | 10.2 Bayesian Linear Models | Discusses conjugate normal priors and the g-prior; shows how \(V \propto (X^\top X)^{-1}\) standardizes shrinkage across coefficients (Example 10.2). |
| 3.2.2 Calling the glmb function | 10.3 Bayesian Generalized Linear Models | Introduces Bayesian GLM framework, Laplace approximation for marginal likelihood, and basic MCMC/penalized-likelihood connections (Example 11.3). |
| 4.1.1 Coefficients | 7.1 Poisson GLMs for Counts and Rates | Details how to interpret coefficients on the log scale as multiplicative effects on the mean; see Example 8.4’s discussion of log-rate differences. |
| 4.1.2 Null and Residual Degrees of Freedom | 4.4 Deviance of a GLM, Model Comparison, and Model Checking | Defines null vs. residual deviance, links to chi-square tests for nested models, and outlines df calculations for goodness-of-fit testing. |
| 4.1.3 Residual Deviance | 4.4 Deviance of a GLM, Model Comparison, and Model Checking | Covers deviance residuals, Pearson residuals, and formal goodness-of-fit metrics based on the deviance. |
| 4.1.4 AIC | 4.6 Selecting Explanatory Variables for a GLM | Presents AIC derivation \(D + 2p\), compares to BIC, and discusses selection of parsimonious models via information criteria. |
| 4.2.1 Posterior Mean Coefficients | 10.2 Bayesian Linear Models | Shows how posterior means serve as point estimates; illustrates shrinkage toward zero under informative priors (Example 11.5). |
| 4.2.2 Effective Number of Parameters | 10.3 Bayesian Generalized Linear Models | Defines effective parameter count \(p_D = \bar D - D(\bar\theta)\); discusses its role in penalizing model complexity. |
| 4.2.3 Expected Residual Deviance | 10.3 Bayesian Generalized Linear Models | Explains computation of \(\bar D\), the posterior-mean deviance, and its use in posterior predictive checks (see that chapter). |
| 4.2.4 DIC | 10.3 Bayesian Generalized Linear Models | Derives DIC \(= \bar D + p_D\); compares DIC to AIC/BIC and highlights limitations when priors are highly informative. |