---
title: "Chapter 07: Models for the Binomial Family"
output: rmarkdown::html_vignette
bibliography: REFERENCES.bib
reference-section-title: References
vignette: >
  %\VignetteIndexEntry{Chapter 07: Models for the Binomial Family}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(glmbayes)
```

## 1. Introductory Discussion

Binomial generalized linear models (GLMs) are used when the response represents **binary outcomes** (success/failure) or **proportions** (successes out of trials). They are among the most widely used GLMs in applied statistics, powering models for:

- clinical trial response rates
- yes/no survey outcomes
- conversion rates in marketing
- event occurrence indicators
- grouped binomial data (successes out of *n*)

Binomial regression is a standard generalized linear model [@NelderWedderburn1972; @McCullagh1989; @Agresti2015]. In classical statistics, these models are fit using:

```r
glm(..., family = binomial(link = ...))
```

In **glmbayes**, the Bayesian analogue is:

```r
glmb(..., family = binomial(link = ...), pfamily = dNormal(mu, Sigma))
```

This chapter introduces:

1. the structure of binomial GLMs
2. the available link functions (logit, probit, cloglog)
3. how to specify these models in **glmbayes**
4. worked examples for each link function

We build on the foundations from Chapters 05 and 06, especially the role of link functions, log-concavity, and prior specification.

## 2. Binomial Likelihood and Weighted Formulation

Binomial data arise in several equivalent representations:

- **Binary outcomes:** \(y_i \in \{0,1\}\)
- **Proportions:** \(y_i = \text{successes}_i / n_i\) with weights \(w_i = n_i\)
- **Two-column matrix:** `cbind(successes, failures)`

In all cases, the underlying sampling model is
\[
Y_i \sim \text{Binomial}(n_i, \mu_i), \qquad 0 < \mu_i < 1,
\]
where:

- \(n_i\) is the number of trials,
- \(\mu_i\) is the success probability for a single trial.

### 2.1 Linear predictor and mean structure

A binomial GLM links the mean \(\mu_i\) to a linear predictor through
\[
\eta_i = x_i^\top \beta, \qquad \mu_i = g^{-1}(\eta_i),
\]
where \(g(\cdot)\) is the chosen link function (logit, probit, cloglog, etc.).

### 2.2 Weighted binomial log-likelihood

Writing \(y_i\) for the observed proportion of successes and using weights \(w_i = n_i\), the log-likelihood (up to constants) becomes
\[
\ell(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big].
\]
This form is used by both `glm()` and the Bayesian functions `glmb()` and `rglmb()`.

### 2.3 Exponential-family representation

The binomial likelihood belongs to the exponential family [@McCullagh1989; @Agresti2015]. For a model with linear predictor \(\eta_i = x_i^\top \beta\) and mean \(\mu_i = g^{-1}(\eta_i)\), the contribution of observation \(i\) to the log-likelihood can be written as
\[
\ell_i(\beta) = w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big],
\]
where \(w_i\) is the number of trials (or a user-supplied weight). This representation does **not** require the link to be canonical. The binomial variance function is
\[
V(\mu_i) = \mu_i(1-\mu_i),
\]
regardless of the link function; the variance of the observed proportion \(y_i\) is \(\mu_i(1-\mu_i)/w_i\).

### 2.4 Link functions

The link function determines how the mean response relates to the linear predictor:
\[
g(\mu_i) = \eta_i.
\]

The `binomial()` family in R supports several link functions:

| Link Function | Formula | Notes |
|---------------|---------|-------|
| **logit** (canonical) | \( \eta = \log(\mu/(1-\mu)) \) | Most common; canonical link; induces log-concavity in \(\eta\) |
| **probit** | \( \eta = \Phi^{-1}(\mu) \) | Based on the standard normal CDF; induces log-concavity in \(\eta\) |
| **cloglog** | \( \eta = \log[-\log(1-\mu)] \) | Asymmetric; useful for rare events; induces log-concavity in \(\eta\) |
| **cauchit** | \( \eta = \tan[\pi(\mu - 1/2)] \) | Heavy-tailed; does **not** preserve log-concavity in general |
| **identity** | \( \eta = \mu \) | Must ensure \(0 < \mu < 1\); does **not** preserve log-concavity in general |

In this chapter we focus on the three most commonly used links:

- **logit** (Section 4)
- **probit** (Section 5)
- **cloglog** (Section 6)

Each link produces a different transformation \(g^{-1}(\eta)\) and therefore a different expression for the log-likelihood and its derivatives. For the logit, probit, and cloglog links, the log-likelihood is log-concave in the linear predictor \(\eta\), which is crucial for stable envelope construction and accept-reject sampling in **glmbayes**. The explicit formulas for each link are given at the beginning of their respective sections.

## 3. Specifying Binomial Models in glmbayes

The general Bayesian call is:

```r
glmb(
  formula,
  family = binomial(link = "logit" | "probit" | "cloglog"),
  pfamily = dNormal(mu = mu, Sigma = V),
  data = ...
)
```

### 3.1 Prior Specification

As in earlier chapters, the recommended workflow is:

```r
ps <- Prior_Setup(formula, family = binomial(link = "logit"), data = ...)
mu <- ps$mu
V <- ps$Sigma
```

This produces:

- a Zellner-type g-prior [@GriffinBrown2010]
- prior means aligned with the null model
- prior variances scaled relative to the likelihood curvature

You may override these defaults for more informative priors (see Chapter 10).

---

## 4. Logit Link: Likelihood, Prior, Posterior

The **logit** link is the canonical choice for binomial GLMs [@McCullagh1989; @Agresti2015]. It is symmetric and interpretable in terms of **log-odds**:
\[
\eta_i = \log\!\left(\frac{\mu_i}{1-\mu_i}\right), \qquad \mu_i = \frac{1}{1+e^{-\eta_i}}.
\]

### Weighted Log-Likelihood

\[
\ell_{\text{logit}}(\beta) = \sum_{i=1}^n w_i\Big[ y_i\,\eta_i - \log(1+e^{\eta_i}) \Big].
\]

### Normal Prior

\[
\log p(\beta) = -\tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}.
\]

### Log-Posterior

\[
\log p(\beta \mid y) = \sum_{i=1}^n w_i\Big[ y_i\,\eta_i - \log(1+e^{\eta_i}) \Big]
- \tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}.
\]

### 4.1 Example Data

We use the **Menarche** dataset introduced in Chapter 06:

```{r Menarche}
data(menarche, package = "MASS")
Age2 <- menarche$Age - 13
Menarche_Model_Data <- data.frame(
  Age = menarche$Age,
  Total = menarche$Total,
  Menarche = menarche$Menarche,
  Age2 = Age2
)
Menarche_Model_Data
```

### 4.2 Prior Setup

```{r Menarche_Prior_Logit}
ps <- Prior_Setup(
  cbind(Menarche, Total - Menarche) ~ Age2,
  family = binomial(link = "logit"),
  data = Menarche_Model_Data
)
mu <- ps$mu
V <- ps$Sigma
```

### 4.3 Fit the Model

```{r Menarche_Model_Logit}
glmb.logit <- glmb(
  cbind(Menarche, Total - Menarche) ~ Age2,
  family = binomial(link = "logit"),
  pfamily = dNormal(mu = mu, Sigma = V),
  data = Menarche_Model_Data,
  n = 1000
)
```

### 4.4 Summary

```{r Menarche_Summary_Logit}
summary(glmb.logit)
```

This produces:

- posterior means and modes
- posterior standard deviations
- tail probabilities
- credible intervals
- DIC and effective parameter count [@Spiegelhalter2002]

The slope parameter typically shows strong evidence of an increasing probability of menarche with age.

---

## 5. Probit Link: Likelihood, Prior, Posterior

The probit link is a common alternative to the logit when a latent normal formulation is convenient [@McCullagh1989; @Agresti2015]. It behaves similarly to the logit link but has slightly thinner tails and a more Gaussian interpretation. For the **probit link**:
\[
\eta_i = \Phi^{-1}(\mu_i), \qquad \mu_i = \Phi(\eta_i),
\]
where \(\Phi\) is the standard normal CDF.

### Weighted Log-Likelihood

\[
\ell_{\text{probit}}(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log\Phi(\eta_i) + (1-y_i)\log\big(1-\Phi(\eta_i)\big) \Big].
\]

### Normal Prior

\[
\log p(\beta) = -\tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}.
\]

### Log-Posterior

\[
\log p(\beta \mid y) = \sum_{i=1}^n w_i\Big[ y_i \log\Phi(\eta_i) + (1-y_i)\log\big(1-\Phi(\eta_i)\big) \Big]
- \tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}.
\]

### 5.1 Prior Setup

```{r Menarche_Prior_Probit}
ps2 <- Prior_Setup(
  cbind(Menarche, Total - Menarche) ~ Age2,
  family = binomial(link = "probit"),
  data = Menarche_Model_Data
)
mu2 <- ps2$mu
V2 <- ps2$Sigma
```

### 5.2 Fit the Model

```{r Menarche_Model_Probit}
glmb.probit <- glmb(
  cbind(Menarche, Total - Menarche) ~ Age2,
  family = binomial(link = "probit"),
  pfamily = dNormal(mu = mu2, Sigma = V2),
  data = Menarche_Model_Data,
  n = 1000
)
```

### 5.3 Summary

```{r Menarche_Summary_Probit}
summary(glmb.probit)
```

The probit model typically yields:

- slightly smaller slope estimates
- slightly smaller posterior standard deviations
- similar fitted probabilities

The DIC often favors probit slightly for smooth S-shaped curves (as seen in Chapter 04).

---

## 6. Complementary Log-Log (cloglog) Link: Likelihood, Prior, Posterior

The complementary log-log link is often used for asymmetric response curves and hazard-type interpretations [@McCullagh1989; @Agresti2015]. For the **cloglog link**:
\[
\eta_i = \log\!\big[-\log(1-\mu_i)\big], \qquad \mu_i = 1 - \exp\!\big(-e^{\eta_i}\big).
\]

### Weighted Log-Likelihood

\[
\ell_{\text{cloglog}}(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log\!\big(1 - e^{-e^{\eta_i}}\big) + (1-y_i)\big(-e^{\eta_i}\big) \Big].
\]

### Normal Prior

\[
\log p(\beta) = -\tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}.
\]

### Log-Posterior

\[
\log p(\beta \mid y) = \sum_{i=1}^n w_i\Big[ y_i \log\!\big(1 - e^{-e^{\eta_i}}\big) + (1-y_i)\big(-e^{\eta_i}\big) \Big]
- \tfrac12(\beta-\mu_0)^\top \Sigma_0^{-1}(\beta-\mu_0) + \text{const}.
\]

The **cloglog** link is asymmetric. It is useful when:

- the event probability increases rapidly near 0
- the hazard interpretation is meaningful
- the response curve is skewed

### 6.1 Prior Setup

```{r Menarche_Prior_cloglog}
ps3 <- Prior_Setup(
  cbind(Menarche, Total - Menarche) ~ Age2,
  family = binomial(link = "cloglog"),
  data = Menarche_Model_Data
)
mu3 <- ps3$mu
V3 <- ps3$Sigma
```

### 6.2 Fit the Model

```{r Menarche_Model_Cloglog}
glmb.cloglog <- glmb(
  cbind(Menarche, Total - Menarche) ~ Age2,
  family = binomial(link = "cloglog"),
  pfamily = dNormal(mu = mu3, Sigma = V3),
  data = Menarche_Model_Data,
  n = 1000
)
```

### 6.3 Summary

```{r Menarche_Summary_cloglog}
summary(glmb.cloglog)
```

The cloglog model often fits poorly for symmetric S-shaped curves (as shown in Chapter 04), but it is valuable for:

- rare events
- survival-type interpretations
- asymmetric growth curves

---

## 7. Comparing Link Functions

The Deviance Information Criterion (DIC) provides a Bayesian analogue to AIC [@Spiegelhalter2002]:

```{r Menarche_DIC_Compare}
DIC_comp <- rbind(
  extractAIC(glmb.logit),
  extractAIC(glmb.probit),
  extractAIC(glmb.cloglog)
)
rownames(DIC_comp) <- c("logit", "probit", "cloglog")
DIC_comp
```

Typical patterns:

- **logit** and **probit** often perform similarly
- **probit** may slightly outperform logit for smooth curves
- **cloglog** is best when the response is asymmetric or hazard-like

The effective number of parameters (pD) is part of the same framework [@Spiegelhalter2002] and helps diagnose model complexity.

---

## 8. Concluding Discussion

Binomial GLMs are a core component of the **glmbayes** package. Their log-concave likelihoods make them ideal for the envelope-based accept-reject sampler, and the familiar link functions allow analysts to choose models that match the scientific context [@McCullagh1989; @Gelman2013].

This chapter demonstrated:

- how binomial GLMs are structured
- how link functions differ
- how to specify priors using `Prior_Setup`
- how to fit logit, probit, and cloglog models
- how to compare models using DIC

In the next chapter, we extend these ideas to **Poisson models**, which share many structural similarities but introduce new considerations for count data.
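As a closing check, the equivalence of the binomial data representations described in Section 2 can be verified directly with base R's `glm()` on the same Menarche data. This is a minimal sketch, assuming only that the **MASS** package is installed; the variable names `fit_matrix` and `fit_prop` are illustrative, not part of the package.

```r
# The menarche data used throughout this chapter
data(menarche, package = "MASS")
Age2 <- menarche$Age - 13

# Representation 1: two-column matrix of (successes, failures)
fit_matrix <- glm(cbind(Menarche, Total - Menarche) ~ Age2,
                  family = binomial(link = "logit"), data = menarche)

# Representation 2: observed proportions with weights w_i = n_i
fit_prop <- glm(Menarche / Total ~ Age2,
                family = binomial(link = "logit"),
                weights = Total, data = menarche)

# Both forms maximize the same weighted log-likelihood of Section 2.2,
# so the fitted coefficients agree to numerical precision
all.equal(coef(fit_matrix), coef(fit_prop))
```

Because `glmb()` and `rglmb()` use the same weighted log-likelihood as `glm()` (Section 2.2), the same equivalence carries over to the Bayesian fits in this chapter.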