Binomial generalized linear models (GLMs) are used when the response represents binary outcomes (success/failure) or proportions (successes out of trials). They are among the most widely used GLMs in applied statistics, powering models for dose-response studies, disease incidence, conversion rates, and many other binary-outcome applications.
Binomial regression is a standard generalized linear model (Nelder and Wedderburn 1972; McCullagh and Nelder 1989; Agresti 2015).
In classical statistics, these models are fit with glm() and family = binomial(). In glmbayes, the Bayesian analogues are glmb() and rglmb().
This chapter introduces:

- the binomial likelihood and its exponential-family structure,
- the logit, probit, and cloglog link functions,
- Bayesian fitting with glmb() under normal priors,
- model comparison via the Deviance Information Criterion (DIC).
We build on the foundations from Chapters 05 and 06, especially the role of link functions, log‑concavity, and prior specification.
Binomial data arise in several equivalent representations:

- a binary 0/1 response for each observational unit,
- a two-column matrix of successes and failures, e.g. cbind(Menarche, Total - Menarche),
- a proportion of successes with the number of trials supplied as weights.
In all cases, the underlying sampling model is
\[ Y_i \sim \text{Binomial}(n_i, \mu_i), \qquad 0 < \mu_i < 1, \]
where:

- \(n_i\) is the number of trials,
- \(\mu_i\) is the per-trial success probability.
A binomial GLM links the mean \(\mu_i\) to a linear predictor through
\[ \eta_i = x_i^\top \beta, \qquad \mu_i = g^{-1}(\eta_i), \]
where \(g(\cdot)\) is the chosen link function (logit, probit, cloglog, etc.).
Using weights \(w_i = n_i\) and writing \(y_i\) for the observed proportion of successes, the log-likelihood (up to constants) becomes
\[ \ell(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big]. \]
This form is used by both glm() and the Bayesian
functions glmb() and rglmb().
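The weighted form above can be checked numerically against R's dbinom(): with \(w_i = n_i\) and \(y_i\) the observed proportion of successes, it reproduces the full binomial log-likelihood up to the binomial-coefficient term, which is constant in \(\beta\). A base-R sketch with illustrative numbers:

```r
# Illustrative data: k successes out of n trials, with success probs mu
n  <- c(10, 25, 40)          # trials per observation
k  <- c(3, 12, 35)           # observed successes
mu <- c(0.3, 0.5, 0.9)       # fitted probabilities g^{-1}(eta)

w <- n                       # weights = number of trials
y <- k / n                   # observed proportions

# Weighted log-likelihood (up to constants)
ll_weighted <- sum(w * (y * log(mu) + (1 - y) * log(1 - mu)))

# Full binomial log-likelihood; differs only by the constant lchoose term
ll_full <- sum(dbinom(k, n, mu, log = TRUE))
ll_weighted - (ll_full - sum(lchoose(n, k)))  # ~ 0
```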
The binomial likelihood belongs to the exponential family (McCullagh and Nelder 1989; Agresti 2015).
For a model with linear predictor
\[
\eta_i = x_i^\top \beta,
\]
and mean
\[
\mu_i = g^{-1}(\eta_i),
\]
the contribution of observation \(i\) to the log-likelihood can be written as
\[
\ell_i(\beta) = w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big],
\]
where \(w_i\) is the number of trials (or a user-supplied weight). This representation does not require the link to be canonical.
The binomial variance function is \[
V(\mu) = \mu(1-\mu),
\] so that for a count \(Y_i \sim \text{Binomial}(n_i, \mu_i)\) we have \(\mathrm{Var}(Y_i) = n_i\,\mu_i(1-\mu_i)\), regardless of the link function.
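The count-scale variance can be verified directly from the binomial pmf, without simulation (the numbers are illustrative):

```r
# Verify Var(Y) = n * mu * (1 - mu) by summing over the binomial pmf
n  <- 20
mu <- 0.35
ks <- 0:n
m1 <- sum(ks * dbinom(ks, n, mu))            # E[Y] = n * mu
v  <- sum((ks - m1)^2 * dbinom(ks, n, mu))   # Var(Y)
c(mean = m1, variance = v, formula = n * mu * (1 - mu))
```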
The link function determines how the mean response relates to the linear predictor: \[ g(\mu_i) = \eta_i. \]
The binomial() family in R supports several link
functions:
| Link Function | Formula | Notes |
|---|---|---|
| logit (canonical) | \(\eta = \log(\mu/(1-\mu))\) | Most common; canonical link; induces log‑concavity in \(\eta\) |
| probit | \(\eta = \Phi^{-1}(\mu)\) | Based on the standard normal CDF; induces log‑concavity in \(\eta\) |
| cloglog | \(\eta = \log[-\log(1-\mu)]\) | Asymmetric; useful for rare events; induces log‑concavity in \(\eta\) |
| cauchit | \(\eta = \tan[\pi(\mu - 1/2)]\) | Heavy‑tailed; does not preserve log‑concavity in general |
| identity | \(\eta = \mu\) | Must ensure \(0 < \mu < 1\); does not preserve log‑concavity in general |
In this chapter we focus on the three most commonly used links: logit, probit, and cloglog.
Each link produces a different transformation \(g^{-1}(\eta)\) and therefore a different expression for the log‑likelihood and its derivatives. For the logit, probit, and cloglog links, the resulting log‑likelihood is known to be log‑concave in the linear predictor \(\eta\), which is crucial for stable envelope construction and accept–reject sampling in glmbayes.
The explicit formulas for each link are provided at the beginning of their respective sections.
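The inverse links from the table can be written directly in base R and checked against the binomial() family object, which is what glm() uses internally:

```r
eta <- seq(-3, 3, by = 0.5)

inv_logit   <- function(eta) 1 / (1 + exp(-eta))   # = plogis(eta)
inv_probit  <- function(eta) pnorm(eta)            # Phi(eta)
inv_cloglog <- function(eta) 1 - exp(-exp(eta))

stopifnot(
  all.equal(inv_logit(eta),   binomial(link = "logit")$linkinv(eta)),
  all.equal(inv_probit(eta),  binomial(link = "probit")$linkinv(eta)),
  all.equal(inv_cloglog(eta), binomial(link = "cloglog")$linkinv(eta))
)
```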
The general Bayesian call is:
glmb(
formula,
family = binomial(link = "logit" | "probit" | "cloglog"),
pfamily = dNormal(mu = mu, Sigma = V),
data = ...
)

As in earlier chapters, the recommended workflow is to fit the classical model with glm(), use the result to guide a normal prior, and then call glmb() for iid posterior draws.
This produces a fitted object containing iid posterior draws together with classical-style summaries.
You may override these defaults for more informative priors (see Chapter 10).
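The prior objects mu and V used in the calls in this chapter are not shown; the sketch below reconstructs a plausible version of the workflow. The zero prior mean and dispersed covariance are illustrative assumptions, not the package defaults, and the glmb() call is commented out since it requires the glmbayes package (the data come from MASS):

```r
# Recommended workflow (sketch; prior settings here are illustrative)
data(menarche, package = "MASS")
menarche$Age2 <- menarche$Age - 13

# 1. Classical fit, used as a reference point
fit.glm <- glm(cbind(Menarche, Total - Menarche) ~ Age2,
               family = binomial(link = "logit"), data = menarche)

# 2. A weakly informative normal prior (assumed values)
p  <- length(coef(fit.glm))
mu <- rep(0, p)        # prior mean
V  <- diag(10, p)      # prior covariance

# 3. Bayesian fit with iid posterior draws (requires glmbayes)
# glmb.logit <- glmb(cbind(Menarche, Total - Menarche) ~ Age2,
#                    family = binomial(link = "logit"),
#                    pfamily = dNormal(mu = mu, Sigma = V),
#                    data = menarche, n = 1000)
coef(fit.glm)
```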
For the logit link: \[ \eta_i = \log\!\left(\frac{\mu_i}{1-\mu_i}\right), \qquad \mu_i = \frac{1}{1+e^{-\eta_i}}. \]
\[ \ell_{\text{logit}}(\beta) = \sum_{i=1}^n w_i\Big[ y_i\,\eta_i - \log(1+e^{\eta_i}) \Big]. \]
\[ \log p(\beta) = -\tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}. \]
\[ \log p(\beta \mid y) = \sum_{i=1}^n w_i\Big[ y_i\,\eta_i - \log(1+e^{\eta_i}) \Big] - \tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}. \]
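The simplified logit form follows by substituting \(\mu = 1/(1+e^{-\eta})\) into the general binomial form; the algebra can be verified numerically with illustrative values:

```r
eta <- c(-2, -0.5, 0, 1.5)
y   <- c(0.1, 0.4, 0.5, 0.9)   # observed proportions
mu  <- 1 / (1 + exp(-eta))

# General binomial form vs the simplified logit form
general    <- y * log(mu) + (1 - y) * log(1 - mu)
simplified <- y * eta - log(1 + exp(eta))
max(abs(general - simplified))  # ~ 0
```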
The logit link is the canonical choice for binomial GLMs (McCullagh and Nelder 1989; Agresti 2015):
\[ \eta = \log\left(\frac{\mu}{1-\mu}\right) \]
It is symmetric and interpretable in terms of log‑odds.
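Both properties can be illustrated in base R. The slope value 1.62 below is taken from the posterior summary shown later in this section and is used only to illustrate the odds-ratio reading of a logit coefficient:

```r
# Symmetry of the logistic curve: g^{-1}(-eta) = 1 - g^{-1}(eta)
eta <- 1.3
stopifnot(abs(plogis(-eta) - (1 - plogis(eta))) < 1e-12)

# Log-odds interpretation: a slope of ~1.62 on Age2 multiplies the
# odds of menarche by exp(1.62), roughly a five-fold increase per year
exp(1.62)
```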
We use the Menarche dataset introduced in Chapter 06:
data(menarche,package="MASS")
Age2 <- menarche$Age - 13
Menarche_Model_Data <- data.frame(
Age = menarche$Age,
Total = menarche$Total,
Menarche = menarche$Menarche,
Age2 = Age2
)
Menarche_Model_Data
#> Age Total Menarche Age2
#> 1 9.21 376 0 -3.79
#> 2 10.21 200 0 -2.79
#> 3 10.58 93 0 -2.42
#> 4 10.83 120 2 -2.17
#> 5 11.08 90 2 -1.92
#> 6 11.33 88 5 -1.67
#> 7 11.58 105 10 -1.42
#> 8 11.83 111 17 -1.17
#> 9 12.08 100 16 -0.92
#> 10 12.33 93 29 -0.67
#> 11 12.58 100 39 -0.42
#> 12 12.83 108 51 -0.17
#> 13 13.08 99 47 0.08
#> 14 13.33 106 67 0.33
#> 15 13.58 105 81 0.58
#> 16 13.83 117 88 0.83
#> 17 14.08 98 79 1.08
#> 18 14.33 97 90 1.33
#> 19 14.58 120 113 1.58
#> 20 14.83 102 95 1.83
#> 21 15.08 122 117 2.08
#> 22 15.33 111 107 2.33
#> 23 15.58 94 92 2.58
#> 24 15.83 114 112 2.83
#> 25 17.58 1049 1049 4.58
summary(glmb.logit)
#> Call
#> glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(link = "logit"),
#> pfamily = dNormal(mu = mu, Sigma = V), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -2.0816329 -1.0288277 -0.4470266 0.7146186 1.3219517
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) 0.36015 0.36015 0.62794 -0.01081 0.063
#> Age2 0.00000 0.00000 0.58658 1.63197 0.059
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.007133 -0.008952 0.060739 0.001921 0.000000 0
#> Age2 1.615870 1.619447 0.057918 0.001832 0.000000 0
#> Pr(Prior_tail)
#> (Intercept) <2e-16 ***
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 28.4625 28.4625
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.149079 -0.126726 -0.112577 -0.007696 0.085953 0.103664 0.126
#> Age2 1.497308 1.510325 1.524856 1.617836 1.716736 1.735859 1.758
#>
#> Effective Number of Parameters: 1.915889
#> Expected Residual Deviance: 28.66534
#> DIC: 114.633
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.274

This produces a detailed posterior summary for the logit model.
The slope parameter typically shows strong evidence of increasing probability of menarche with age.
The probit link is a common alternative to the logit when a latent normal formulation is convenient (McCullagh and Nelder 1989; Agresti 2015).
For the probit link: \[ \eta_i = \Phi^{-1}(\mu_i), \qquad \mu_i = \Phi(\eta_i), \] where \(\Phi\) is the standard normal CDF.
\[ \ell_{\text{probit}}(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log\Phi(\eta_i) + (1-y_i)\log\big(1-\Phi(\eta_i)\big) \Big]. \]
\[ \log p(\beta) = -\tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}. \]
\[ \log p(\beta \mid y) = \sum_{i=1}^n w_i\Big[ y_i \log\Phi(\eta_i) + (1-y_i)\log\big(1-\Phi(\eta_i)\big) \Big] - \tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}. \]
It is similar to the logit link but has slightly thinner tails and a more Gaussian interpretation.
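The similarity of the two links can be quantified: a classic approximation matches the logistic CDF to \(\Phi\) with a scale factor of about 1.70, and the two curves then differ by less than 0.01 everywhere. A base-R check:

```r
# Logistic approximation to the normal CDF with scale factor 1.702
x <- seq(-6, 6, by = 0.01)
max(abs(plogis(1.702 * x) - pnorm(x)))  # < 0.01
```

This is also why probit slopes are roughly logit slopes divided by 1.7 or so, as seen when comparing the two fits in this chapter.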
summary(glmb.probit)
#> Call
#> glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(link = "probit"),
#> pfamily = dNormal(mu = mu2, Sigma = V2), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -1.6317919 -0.9534072 -0.4880500 0.4469835 1.7716036
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) 0.22517 0.22517 0.34838 -0.01724 0.035
#> Age2 0.00000 0.00000 0.29405 0.90782 0.030
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.0146442 -0.0152602 0.0353322 0.0011173 0.0000000 0
#> Age2 0.8988270 0.9005246 0.0294212 0.0009304 0.0000000 0
#> Pr(Prior_tail)
#> (Intercept) <2e-16 ***
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 32.9623 32.9623
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.09072 -0.08365 -0.07379 -0.01621 0.04341 0.05405 0.075
#> Age2 0.83584 0.84476 0.85239 0.90077 0.95045 0.95751 0.967
#>
#> Effective Number of Parameters: 2.02748
#> Expected Residual Deviance: 24.97825
#> DIC: 111.0575
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.244

The probit model typically yields smaller coefficients than the logit model (the two links put the linear predictor on different scales) but very similar fitted probabilities.
The DIC often favors probit slightly for smooth S‑shaped curves (as seen in Chapter 04).
The complementary log–log link is often used for asymmetric response curves and hazard‑type interpretations (McCullagh and Nelder 1989; Agresti 2015).
For the cloglog link: \[ \eta_i = \log\!\big[-\log(1-\mu_i)\big], \qquad \mu_i = 1 - \exp\!\big(-e^{\eta_i}\big). \]
\[ \ell_{\text{cloglog}}(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log\!\big(1 - e^{-e^{\eta_i}}\big) + (1-y_i)\big(-e^{\eta_i}\big) \Big]. \]
\[ \log p(\beta) = -\tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}. \]
\[ \log p(\beta \mid y) = \sum_{i=1}^n w_i\Big[ y_i \log\!\big(1 - e^{-e^{\eta_i}}\big) + (1-y_i)\big(-e^{\eta_i}\big) \Big] - \tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}. \]
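The inverse-link formula and its asymmetry are easy to check numerically in base R:

```r
inv_cloglog <- function(eta) 1 - exp(-exp(eta))

# Asymmetric: mu approaches 1 faster than it approaches 0
inv_cloglog(c(-2, 2))                    # 0.127 vs 0.999

# Unlike logit/probit, g^{-1}(-eta) != 1 - g^{-1}(eta)
c(inv_cloglog(-1), 1 - inv_cloglog(1))   # 0.308 vs 0.066
```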
The cloglog link is asymmetric: \(\mu\) approaches 1 much faster than it approaches 0 as \(\eta\) moves away from zero. It is useful when events are rare or when the model has a hazard-type (discrete-time survival) interpretation.
glmb.cloglog <- glmb(
cbind(Menarche, Total - Menarche) ~ Age2,
family = binomial(link = "cloglog"),
pfamily = dNormal(mu = mu3, Sigma = V3),
data = Menarche_Model_Data,
n = 1000
)
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
summary(glmb.cloglog)
#> Call
#> glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(link = "cloglog"),
#> pfamily = dNormal(mu = mu3, Sigma = V3), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -3.983059 -2.547182 -1.115548 1.196190 3.398801
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) -0.1173 -0.1173 0.4094 -0.5960 0.041
#> Age2 0.0000 0.0000 0.3117 0.9530 0.031
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.5910908 -0.5924754 0.0428354 0.0013546 0.0000000 0
#> Age2 0.9451221 0.9455164 0.0293267 0.0009274 0.0000000 0
#> Pr(Prior_tail)
#> (Intercept) <2e-16 ***
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 33.1169 33.1169
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.6881 -0.6772 -0.6613 -0.5937 -0.5207 -0.5114 -0.495
#> Age2 0.8746 0.8850 0.8961 0.9465 0.9919 1.0031 1.017
#>
#> Effective Number of Parameters: 2.016892
#> Expected Residual Deviance: 120.9082
#> DIC: 206.9769
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.252

The cloglog model often fits poorly for symmetric S-shaped curves (as shown in Chapter 04), but it is valuable for rare-event data and hazard-type models.
The Deviance Information Criterion (DIC) provides a Bayesian analogue to AIC (Spiegelhalter et al. 2002):
DIC_comp<-rbind(
extractAIC(glmb.logit),
extractAIC(glmb.probit),
extractAIC(glmb.cloglog))
rownames(DIC_comp)<-c("logit","probit","cloglog")
DIC_comp
#> pD DIC
#> logit 1.915889 114.6330
#> probit 2.027480 111.0575
#> cloglog 2.016892 206.9769

Typical patterns: the logit and probit models achieve similar DIC values (here probit is slightly better), while the cloglog model is heavily penalized for its poor fit to this symmetric response curve.
The effective number of parameters (pD) is part of the same framework (Spiegelhalter et al. 2002) and helps diagnose model complexity.
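The quantities reported above can be computed from posterior deviance draws: pD is the mean posterior deviance minus the deviance at the posterior mean, and DIC adds pD to the mean deviance. A toy sketch for a normal-mean model with known variance (all names and values here are illustrative, not glmbayes internals):

```r
set.seed(1)
# Toy data and posterior draws for a normal mean with known sd = 1;
# under a flat prior the posterior is approximately N(ybar, 1/n)
yobs  <- rnorm(50, mean = 2)
draws <- rnorm(4000, mean = mean(yobs), sd = 1 / sqrt(length(yobs)))

dev <- function(theta) -2 * sum(dnorm(yobs, theta, 1, log = TRUE))
D   <- sapply(draws, dev)

pD  <- mean(D) - dev(mean(draws))   # effective number of parameters (~1 here)
DIC <- mean(D) + pD
c(pD = pD, DIC = DIC)
```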
Binomial GLMs are a core component of the glmbayes package. Their log‑concave likelihoods make them ideal for the envelope‑based accept‑reject sampler, and the familiar link functions allow analysts to choose models that match the scientific context (McCullagh and Nelder 1989; Gelman et al. 2013).
This chapter demonstrated:
In the next chapter, we extend these ideas to Poisson models, which share many structural similarities but introduce new considerations for count data.