Chapter A03: Methods available in glmbayes

Kjell Nygren

2026-04-30

The glmb function and the related method functions that handle its output are designed to be Bayesian versions of the glm function and many of its method functions. This vignette shows how the basic setup and calling of the two functions compare, and then walks through how the method functions for glmb can be called to generate outputs similar to those from the glm functions. The GLM background is standard (McCullagh and Nelder 1989; Venables and Ripley 2002); the glmb interface parallels glm as described in Hastie and Pregibon (1992).

Dobson Randomized Controlled Trial Data

To understand how the outputs of the glmb function mirror those of the glm function, it is useful to take a look at the first portion of the example that is provided with the glm function. The data are randomized controlled trial counts from Dobson (1990). Here is a view of the data:

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
#>   treatment outcome counts
#> 1         1       1     18
#> 2         1       2     17
#> 3         1       3     15
#> 4         2       1     20
#> 5         2       2     10
#> 6         2       3     20
#> 7         3       1     25
#> 8         3       2     13
#> 9         3       3     12

Calling the two functions

The example code for the glm function specifies a Poisson regression model for these data. The code chunks below show how the glmb function can be used in much the same fashion, but with some extra requirements. In particular, we need to provide a prior distribution (in this case a multivariate normal prior) and related constants (in this case, a prior mean and a prior variance-covariance matrix) for the coefficients of interest. Optionally, we can also tell the function how many draws to make from the posterior distribution (the default used here generates 1000 iid draws).

Here is the call to the classical glm function:

## Call to glm
glm.D93 <- glm(counts ~ outcome + treatment, 
              family = poisson())

To use the glmb function, we first call the function Prior_Setup to initialize the prior and to get the variable names needed. We use the default prior from the Prior_Setup function (discussed further in a separate vignette).

## Using glmb
## Step 1: Set up Default Prior 
ps=Prior_Setup(counts ~ outcome + treatment,family=poisson())
#> Using default pwt = 0.01 (low-d default).
mu=ps$mu
V=ps$Sigma

We now call the glmb function and include the default prior distribution in addition to the two required arguments for the glm function.

## Step 2: Call the glmb function
glmb.D93<-glmb(counts ~ outcome + treatment, family=poisson(), pfamily=dNormal(mu=mu,Sigma=V))

In the above, it is worth noting the additional step we went through when specifying the prior. We used a call to the function Prior_Setup to get the correct dimensions for the prior mean and variance-covariance matrix and to initialize these constants. The Prior_Setup function also provides the variable names in the design matrix (which also correspond to the names of the coefficients eventually estimated), so we can make informed changes to the prior if desired.
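Because Prior_Setup returns named objects, such informed changes are easy to express. As a sketch (not the package default), the chunk below builds an alternative prior by hand for the five coefficients: the intercept prior mean is set to the log of the average count (about 2.8134, which matches the null mode shown in the summary later in this vignette), while the prior variance of 2 on each coefficient is a purely hypothetical choice.

```r
## A hand-built alternative prior (a sketch, not the package default).
## The variance of 2 per coefficient is a hypothetical choice.
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)  # Dobson counts from above
coef_names <- c("(Intercept)", "outcome2", "outcome3",
                "treatment2", "treatment3")
mu2 <- setNames(rep(0, 5), coef_names)
mu2["(Intercept)"] <- log(mean(counts))   # log of average count, ~2.8134
V2 <- diag(2, 5)                          # diagonal prior variance-covariance
dimnames(V2) <- list(coef_names, coef_names)
round(mu2[["(Intercept)"]], 4)
#> [1] 2.8134
```

These objects could then be passed to glmb via pfamily = dNormal(mu = mu2, Sigma = V2) in place of the defaults.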

The next couple of vignettes will discuss the prior specification in more detail. Here we instead focus on reviewing the output from the function and how it can be used.

Printing the output

Taking a look at the basic printed output, we can see that the two closely mirror each other with the glmb posterior means replacing the glm maximum likelihood estimates.

## Printed view of the output from the glm function 
print(glm.D93)
#> 
#> Call:  glm(formula = counts ~ outcome + treatment, family = poisson())
#> 
#> Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>   3.045e+00   -4.543e-01   -2.930e-01    1.218e-15    8.438e-16  
#> 
#> Degrees of Freedom: 8 Total (i.e. Null);  4 Residual
#> Null Deviance:       10.58 
#> Residual Deviance: 5.129     AIC: 56.76
## Printed view of the output from the glmb function 
print(glmb.D93)
#> 
#> Call:  glmb(formula = counts ~ outcome + treatment, family = poisson(), 
#>     pfamily = dNormal(mu = mu, Sigma = V))
#> 
#> Posterior Mean Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>    3.045375    -0.469607    -0.303245    -0.019815    -0.007108  
#> 
#> Effective Number of Parameters: 4.918062 
#> Expected Residual Deviance: 10.09891 
#> DIC: 56.64915

In addition to the posterior means, the glmb printed output also returns three pieces of information that are similar to (but not quite the same as) the classical output. The “Effective Number of Parameters” should in general be close to (but not exactly equal to) the number of parameters estimated, while the Expected Residual Deviance will in general be higher than the corresponding maximum likelihood estimate of the Residual Deviance (since the latter is chosen to minimize it). Finally, the DIC is essentially a Bayesian version of the AIC. These measures are discussed in greater detail in a later vignette.
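As a quick sanity check on how these pieces fit together, the printed DIC can be reconstructed from the standard relationship DIC = Dbar + pD, where Dbar is minus twice the mean posterior log-likelihood (reported by the logLik method later in this vignette) and pD is the effective number of parameters. The sketch below plugs in the numbers from this example:

```r
## DIC = Dbar + pD, reconstructed from the printed output above:
## mean(logLik(glmb.D93)) is -25.86555 and pD is 4.918062.
mean_logLik <- -25.86555
pD <- 4.918062
Dbar <- -2 * mean_logLik   # expected deviance on the -2*logLik scale
DIC <- Dbar + pD
round(DIC, 3)
#> [1] 56.649
```

This agrees with the DIC of 56.64915 in the printed output.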

Methods Available

In addition to the basic print output, the glmb function returns an object of class “glmb”, for which a number of generic functions (or methods) are available. The class “glmb” inherits from “glm” and “lm”, so many functions for those classes work directly on “glmb” objects. For instances where the inherited methods fail or could produce incorrect results, we have implemented methods specifically for the glmb class. The methods for classes lm, glm, and glmb are listed below; we will use many of them later in this vignette.

## Methods for class "lm"
methods(class="lm")
#>  [1] add1           addterm        alias          anova          boxcox        
#>  [6] case.names     coerce         confint        cooks.distance deviance      
#> [11] dfbeta         dfbetas        dffits         drop1          dropterm      
#> [16] dummy.coef     effects        extractAIC     family         formula       
#> [21] hatvalues      influence      initialize     kappa          labels        
#> [26] logLik         logtrans       model.frame    model.matrix   nobs          
#> [31] plot           predict        print          proj           qr            
#> [36] residuals      rstandard      rstudent       show           simulate      
#> [41] slotsFromS3    summary        variable.names vcov          
#> see '?methods' for accessing help and source code
## Methods for class "glm"
methods(class="glm")
#>  [1] add1           addterm        anova          coerce         confint       
#>  [6] cooks.distance deviance       dfbetas        dffits         drop1         
#> [11] dropterm       effects        extractAIC     family         formula       
#> [16] gamma.shape    influence      initialize     logLik         model.frame   
#> [21] nobs           predict        print          profile        residuals     
#> [26] rstandard      rstudent       show           sigma          slotsFromS3   
#> [31] summary        vcov           weights       
#> see '?methods' for accessing help and source code
## Methods for class "glmb"
methods(class="glmb")
#>  [1] anova          case.names     confint        cooks.distance dfbetas       
#>  [6] dummy.coef     extractAIC     influence      logLik         plot          
#> [11] predict        print          residuals      rstandard      rstudent      
#> [16] simulate       summary        variable.names vcov          
#> see '?methods' for accessing help and source code

The summary functions

Let’s take a closer look at the outputs of the summary functions.

glm summary output

In turn, we see the Call and the estimated coefficients, followed by some additional model-related information (the dispersion parameter, the null and residual deviances, the AIC, and the number of Fisher scoring iterations).

## summary output for the "glm" class
summary(glm.D93)
#> 
#> Call:
#> glm(formula = counts ~ outcome + treatment, family = poisson())
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
#> outcome2    -4.543e-01  2.022e-01  -2.247   0.0246 *  
#> outcome3    -2.930e-01  1.927e-01  -1.520   0.1285    
#> treatment2   1.217e-15  2.000e-01   0.000   1.0000    
#> treatment3   8.438e-16  2.000e-01   0.000   1.0000    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 10.5814  on 8  degrees of freedom
#> Residual deviance:  5.1291  on 4  degrees of freedom
#> AIC: 56.761
#> 
#> Number of Fisher Scoring iterations: 4

glmb summary output

The summary for the glmb function follows a similar structure but, above the table of posterior means for the estimated Bayesian coefficients, adds a table with information related to the prior and the maximum likelihood estimates. The output below the main coefficient table is also modified to contain similar (but slightly different) pieces of information, the details of which are discussed elsewhere.

## summary output for the "glmb" class
summary(glmb.D93)
#> Call
#> glmb(formula = counts ~ outcome + treatment, family = poisson(), 
#>     pfamily = dNormal(mu = mu, Sigma = V))
#> 
#> Expected Residuals:
#>        Min         1Q     Median         3Q        Max 
#> -0.9276586 -0.6953010 -0.1542997  0.8542582  1.1469176 
#> 
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#> 
#>              Null Mode Prior Mean   Prior.sd  Max Like. Like.sd
#> (Intercept)  2.813e+00  2.813e+00  1.700e+00  3.045e+00   0.171
#> outcome2     0.000e+00  0.000e+00  2.012e+00 -4.543e-01   0.202
#> outcome3     0.000e+00  0.000e+00  1.918e+00 -2.930e-01   0.193
#> treatment2   0.000e+00  0.000e+00  1.990e+00  1.217e-15   0.200
#> treatment3   0.000e+00  0.000e+00  1.990e+00  8.438e-16   0.200
#> 
#> Bayesian Estimates Based on 1000 iid draws
#> 
#>              Post.Mode  Post.Mean    Post.Sd   MC Error Pr(Null_tail) SE(tail)
#> (Intercept)  3.042e+00  3.045e+00  1.629e-01  5.151e-03     8.300e-02    0.009
#> outcome2    -4.497e-01 -4.696e-01  2.000e-01  6.325e-03     9.000e-03    0.003
#> outcome3    -2.901e-01 -3.032e-01  1.857e-01  5.871e-03     4.900e-02    0.007
#> treatment2  -1.395e-05 -1.982e-02  2.048e-01  6.476e-03     4.390e-01    0.016
#> treatment3   1.209e-06 -7.108e-03  2.013e-01  6.365e-03     4.990e-01    0.016
#>             Pr(Prior_tail)   
#> (Intercept)          0.083 . 
#> outcome2             0.009 **
#> outcome3             0.049 * 
#> treatment2           0.439   
#> treatment3           0.499   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>   Directional Tail Summaries:
#> 
#>                Metric vs Null vs Prior
#>  Mahalanobis Distance  2.4364   2.4364
#>      Tail Probability  0.0100   0.0100
#>   [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#> 
#> 
#> Distribution Percentiles
#> 
#>                  1.0%      2.5%      5.0%    Median     95.0%     97.5%  99.0%
#> (Intercept)  2.662145  2.721820  2.775757  3.045359  3.308822  3.338802  3.399
#> outcome2    -0.910834 -0.852322 -0.793186 -0.474863 -0.146508 -0.074720 -0.027
#> outcome3    -0.761660 -0.666781 -0.602717 -0.295056 -0.006935  0.057228  0.139
#> treatment2  -0.488061 -0.415362 -0.343864 -0.029525  0.327688  0.390714  0.457
#> treatment3  -0.484902 -0.414396 -0.352729  0.001169  0.324980  0.376591  0.440
#> 
#> Effective Number of Parameters: 4.918062 
#> Expected Residual Deviance: 10.09891 
#> DIC: 56.64915 
#> 
#> Expected Mean dispersion: 1 
#> Sq.root of Expected Mean dispersion: 1 
#> 
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.827

Model Fit, Predictions, Deviance Residuals, Covariance Matrices, and Confidence/Credible Intervals

Let’s next take a look at the outputs from these additional methods to see how they compare. Note that the Bayesian versions contain random draws from the underlying posterior distributions, so the column means of the draws are used in most of these comparisons.
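As a toy illustration (hypothetical numbers, not package output): methods of this kind for glmb return a matrix with one row per posterior draw and one column per observation, so the summaries are taken column-wise with colMeans.

```r
## Hypothetical matrix of fitted values: 3 posterior draws (rows)
## for 2 observations (columns)
draws <- rbind(c(20.9, 13.1),
               c(21.4, 13.6),
               c(21.0, 13.3))
colMeans(draws)   # posterior mean fitted value per observation
#> [1] 21.10000 13.33333
```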

Fitted values

## fitted outputs for the glm function
fitted(glm.D93)
#>        1        2        3        4        5        6        7        8 
#> 21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 21.00000 13.33333 
#>        9 
#> 15.66667
## mean of fitted outputs for the glmb function
## works without a "glmb" class specific generic function
colMeans(fitted(glmb.D93))
#>        1        2        3        4        5        6        7        8 
#> 21.29688 13.38418 15.77647 20.90986 13.14965 15.48002 21.17498 13.31044 
#>        9 
#> 15.68789

Predictions (linear predictors)

## predictions for the glm function
predict(glm.D93)
#>        1        2        3        4        5        6        7        8 
#> 3.044522 2.590267 2.751535 3.044522 2.590267 2.751535 3.044522 2.590267 
#>        9 
#> 2.751535
## predictions for the glmb function
colMeans(glmb.D93$linear.predictors) # column means of the stored draws
#>        1        2        3        4        5        6        7        8 
#> 3.045375 2.575768 2.742129 3.025559 2.555953 2.722314 3.038267 2.568660 
#>        9 
#> 2.735021
colMeans(predict(glmb.D93)) 
#>        1        2        3        4        5        6        7        8 
#> 3.045375 2.575768 2.742129 3.025559 2.555953 2.722314 3.038267 2.568660 
#>        9 
#> 2.735021

Residuals

## residuals for the glm function
residuals(glm.D93)
#>           1           2           3           4           5           6 
#> -0.67124923  0.96272360 -0.16964662 -0.21998507 -0.95552353  1.04938637 
#>           7           8           9 
#>  0.84715368 -0.09167147 -0.96656372
## residuals for the glmb function
colMeans(residuals(glmb.D93))
#>           1           2           3           4           5           6 
#> -0.69530096  0.99567823 -0.15429968 -0.15633497 -0.86121772  1.14691762 
#>           7           8           9 
#>  0.85425816 -0.03755228 -0.92765863

vcov

## vcov for the glm function
vcov(glm.D93)
#>             (Intercept)      outcome2      outcome3    treatment2    treatment3
#> (Intercept)  0.02920635 -1.587302e-02 -1.587302e-02 -2.000000e-02 -2.000000e-02
#> outcome2    -0.01587302  4.087301e-02  1.587302e-02 -7.889764e-18 -7.435533e-18
#> outcome3    -0.01587302  1.587302e-02  3.714961e-02 -5.991636e-18 -6.847584e-18
#> treatment2  -0.02000000 -7.889764e-18 -5.991636e-18  4.000000e-02  2.000000e-02
#> treatment3  -0.02000000 -7.435533e-18 -6.847584e-18  2.000000e-02  4.000000e-02
## vcov for the glmb function
vcov(glmb.D93)
#>             (Intercept)      outcome2      outcome3    treatment2    treatment3
#> (Intercept)  0.02653307 -0.0150206964 -0.0141197005 -0.0194907847 -0.0186977072
#> outcome2    -0.01502070  0.0400022842  0.0143093362  0.0009934275  0.0003620475
#> outcome3    -0.01411970  0.0143093362  0.0344740844 -0.0004603668  0.0001177139
#> treatment2  -0.01949078  0.0009934275 -0.0004603668  0.0419343750  0.0200409058
#> treatment3  -0.01869771  0.0003620475  0.0001177139  0.0200409058  0.0405073918

confint

Confidence intervals for the glm fit; the corresponding glmb output is a credible interval based on the posterior draws.

## confint for the glm function
confint(glm.D93)
#> Waiting for profiling to be done...
#>                  2.5 %      97.5 %
#> (Intercept)  2.6958215  3.36655581
#> outcome2    -0.8577018 -0.06255840
#> outcome3    -0.6753696  0.08244089
#> treatment2  -0.3932548  0.39325483
#> treatment3  -0.3932548  0.39325483
## confint for the glmb function
confint(glmb.D93)
#>                   2.5%       97.5%
#> (Intercept)  2.7218196  3.33880223
#> outcome2    -0.8523221 -0.07471989
#> outcome3    -0.6667814  0.05722808
#> treatment2  -0.4153618  0.39071396
#> treatment3  -0.4143964  0.37659097

AIC/DIC, Deviance, and the Log-Likelihood

These model statistics are useful when comparing different model specifications. The Bayesian versions will be discussed in greater detail in a separate vignette.

AIC/DIC

## AIC for the glm function (equivalent degrees of freedom and the AIC)
extractAIC(glm.D93)
#> [1]  5.00000 56.76132
## DIC for the glmb function (estimated effective number of parameters and the DIC)
extractAIC(glmb.D93)
#>        pD       DIC 
#>  4.918062 56.649153

Deviance

## Deviance for the glm function
deviance(glm.D93)
#> [1] 5.129141
## Deviance for the glmb function
## works without a "glmb" class specific generic function
mean(deviance(glmb.D93))
#> [1] 10.09891

Log-Likelihoods

## Log-likelihood for the glm function
logLik(glm.D93)
#> 'log Lik.' -23.38066 (df=5)
## Log-likelihood for the glmb function
mean(logLik(glmb.D93))
#> [1] -25.86555

Model Frame, formula, family, nobs, and show

These are mostly useful for understanding various aspects of the model.

Model Frames

## Model Frame for the glm function
model.frame(glm.D93)
#>   counts outcome treatment
#> 1     18       1         1
#> 2     17       2         1
#> 3     15       3         1
#> 4     20       1         2
#> 5     10       2         2
#> 6     20       3         2
#> 7     25       1         3
#> 8     13       2         3
#> 9     12       3         3
## Model Frame for the glmb function (via the embedded glm object)
model.frame(glmb.D93$glm)
#>   counts outcome treatment
#> 1     18       1         1
#> 2     17       2         1
#> 3     15       3         1
#> 4     20       1         2
#> 5     10       2         2
#> 6     20       3         2
#> 7     25       1         3
#> 8     13       2         3
#> 9     12       3         3

formula

## formula for the glm function
formula(glm.D93)
#> counts ~ outcome + treatment
## formula for the glmb function
formula(glmb.D93)
#> counts ~ outcome + treatment

family

## family for the glm function
family(glm.D93)
#> 
#> Family: poisson 
#> Link function: log
## family for the glmb function (via the embedded glm object)
family(glmb.D93$glm)
#> 
#> Family: poisson 
#> Link function: log

nobs

## nobs for the glm function
nobs(glm.D93)
#> [1] 9
## nobs for the glmb function
nobs(glmb.D93)
#> [1] 9

show

The show method returns the same output as the print method.

References

Dobson, A. J. 1990. An Introduction to Generalized Linear Models. Chapman & Hall.
Hastie, T. J., and D. Pregibon. 1992. “Generalized Linear Models.” Chap. 6 in Statistical Models in S, edited by J. M. Chambers and T. J. Hastie. Wadsworth & Brooks/Cole.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. Chapman & Hall.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Springer.