---
title: "Chapter 01: Getting started with glmbayes"
author: "Kjell Nygren"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette
bibliography: REFERENCES.bib
reference-section-title: References
vignette: >
  %\VignetteIndexEntry{Chapter 01: Getting started with glmbayes}
  %\VignetteEngine{knitr::rmarkdown_notangle}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
# Prevent knitr from generating tangled .R files
knitr::opts_knit$set(tangle = FALSE)

# Prevent rmarkdown/knitr from keeping .md or .Rmd intermediates
knitr::opts_knit$set(keep_md = FALSE, keep_rmd = FALSE)

# Your existing chunk options
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

```

```{r setup,echo = FALSE}
library(glmbayes)
```

# 1. Introductory Discussion

The **glmb()** family of functions provides fully Bayesian analogues to **glm()** and its standard methods [@McCullagh1989; @VenablesRipley2002]. This vignette demonstrates how to set up and call **glmb()**, compares its syntax and output to **glm()**, and highlights the additional arguments required for prior specification. A specialized **lmb()** function for Bayesian linear regression is bundled in the package and will be explored in Chapter 2. For Bayesian regression perspective, see [@Gelman2013]. Separate vignettes cover all complementary methods for both **lmb()** and **glmb()**.



# 2. Preparing a dataframe

We begin by reproducing the randomized controlled trial example of [@Dobson1990], which is the standard demo for **glm()**. Comparing **glmb()** output to this familiar dataset will make parallels clear. The data frame contains treatment groups, outcomes, and observed counts:


```{r dobson}
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
```

# 3. Calling the two functions

The example code for the glm function specifies a Poisson regression model for this data. The below code chunks shows how the glmb function can be used in much the same fashion but with some extra requirements. In particular, we need to provide a prior distribution (in this case a Multivariate Normal prior) and related constants (in this case, a prior Mean and and a prior Variance-Covariance matrix) for the coefficients of interest. Optionally, we can also tell the function how many draws to make from the posterior distribution (the default used here generates 1000 iid draws). 

## 3.1 Using the classical glm function

## 3.1.1 Calling the classical glm function

The base glm() call takes two main arguments:

1)  a formula describing the model structure (e.g., counts ~ outcome + treatment)
2)  a family object specifying both the error distribution and link function (e.g., **poisson()**)

We cover families, link functions, and their glmb() implementations in detail in another vignette.


```{r glm_call,results = "hide"}
## Call to glm
glm.D93 <- glm(counts ~ outcome + treatment, 
              family = poisson())
```

## 3.2 Setting the Prior and Calling the glmb function

Before invoking glmb(), we must define a prior. The package supplies a helper, Prior_Setup(), which returns appropriately sized default prior parameters and variable names.


### 3.2.1 Setting the prior

Use **Prior_Setup()** to generate default prior hyperparameters and extract the design matrix variable names:

```{r Prior_Setup,results = "hide"}
## Using glmb
## Step 1: Set up Default Prior 
ps=Prior_Setup(counts ~ outcome + treatment,family=poisson())
mu=ps$mu
V=ps$Sigma
```
### 3.2.2 Calling the glmb function

With the prior defined, the glmb() call mirrors glm(), augmented by the pfamily argument:


```{r Call_glmb,results = "hide"}
# Step 3: Call the glmb function
glmb.D93<-glmb(counts ~ outcome + treatment, family=poisson(), pfamily=dNormal(mu=mu,Sigma=V))
```

Because Prior_Setup() aligns the dimensions of mu and Sigma with the design matrix, you can safely modify them to impose custom shrinkage on selected coefficients


# 4. Printing the output

Both glm() and glmb() provide concise print() methods. Below we compare their default printed summaries.

## 4.1 Printed glm output

```{r Printed_glm_Views}
## Printed view of the output from the glm function 
print(glm.D93)
```
#### 4.1.1 Coefficients

The Coefficients table reports the maximum‐likelihood estimates for the intercept and each predictor on the log‐scale, summarizing how each covariate shifts the expected log‐count from its baseline. Later vignettes will show how these MLEs compare to the Bayesian posterior means produced by `glmb()`.

#### 4.1.2 Null and Residual Degrees of Freedom

The Degrees of Freedom line gives the total df for the null (intercept‐only) model and the residual df for the full model (observations minus estimated parameters). These dfs underpin chi‐square tests on deviance reductions and the correct calibration of standard errors.

#### 4.1.3 Residual Deviance

Residual deviance quantifies the lack of fit of the fitted model relative to a perfect (saturated) model, with smaller values indicating better fit. In later sections we’ll use both null and residual deviance to assess model adequacy and perform formal likelihood‐ratio tests.

#### 4.1.4 AIC

The Akaike Information Criterion (AIC) combines model fit with a penalty for complexity by adding twice the number of parameters to the deviance. Models with lower AIC are preferred, guiding model selection in both classical GLM workflows and their Bayesian analogues.



## 4.2 Printed glmb output

```{r Printed_glmb_Views}

## Printed view of the output from the glmb function 
print(glmb.D93)
```
#### 4.2.1 Posterior Mean Coefficients

The Posterior Mean Coefficients table reports the Bayesian point estimates for each parameter under the specified prior, summarizing how each covariate shifts the expected log‐count after combining data and prior information. Later vignettes will explore how these posterior means compare to classical MLEs and how prior choice induces shrinkage.

#### 4.2.2 Effective Number of Parameters

The Effective Number of Parameters quantifies model complexity by measuring how much the posterior distribution adapts to the data, often shrinking the apparent degrees of freedom under informative priors. We’ll contrast this with the nominal parameter count and show its role in penalizing overfitting.

#### 4.2.3 Expected Residual Deviance

Expected Residual Deviance is the posterior mean of the deviance, reflecting the model’s average lack of fit under its own predictive distribution. In subsequent vignettes we’ll use this quantity for Bayesian goodness‐of‐fit checks and compare it to classical residual deviance.

#### 4.2.4 DIC

The Deviance Information Criterion (DIC) combines expected residual deviance with a penalty for effective complexity, balancing fit and parsimony in a single metric. Models with lower DIC are preferred, and later sections will delve into its computation and limitations.


# 5. Textbook Content Mappings

## 5.1 Agresti - Foundations of Linear and Generalized Linear Models

The section references in the table below follow @Agresti2015.

| Vignette Section                           | Textbook Chapter & Section                                           | Notes (provisional)                                                                                                                                         |
|--------------------------------------------|----------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 3.1.1 Calling the glm function             | 7.1 Poisson GLMs for Counts and Rates                                | Illustrates the canonical log-link for count data, Example 7.1 on log-rate modeling and interpretation via rate ratios.                                      |
| 3.2.1 Setting the prior                    | 10.2 Bayesian Linear Models                                          | Discusses conjugate normal priors and the g-prior; shows how \(V \propto (X^\top X)^{-1}\) standardizes shrinkage across coefficients (Example 10.2).       |
| 3.2.2 Calling the glmb function            | 10.3 Bayesian Generalized Linear Models                              | Introduces Bayesian GLM framework, Laplace approximation for marginal likelihood, and basic MCMC/penalized-likelihood connections (Example 11.3).            |
| 4.1.1 Coefficients                         | 7.1 Poisson GLMs for Counts and Rates                                | Details how to interpret coefficients on the log scale as multiplicative effects on the mean; see Example 8.4’s discussion of log-rate differences.          |
| 4.1.2 Null and Residual Degrees of Freedom | 4.4 Deviance of a GLM, Model Comparison, and Model Checking          | Defines null vs. residual deviance, links to chi-square tests for nested models, and outlines df calculations for goodness-of-fit testing.                  |
| 4.1.3 Residual Deviance                    | 4.4 Deviance of a GLM, Model Comparison, and Model Checking          | Covers deviance residuals, Pearson residuals, and formal goodness-of-fit metrics based on the deviance.                                            |
| 4.1.4 AIC                                  | 4.6 Selecting Explanatory Variables for a GLM                        | Presents AIC derivation \(\!D + 2p\), compares to BIC, and discusses selection of parsimonious models via information criteria.                              |
| 4.2.1 Posterior Mean Coefficients          | 10.2 Bayesian Linear Models                                          | Shows how posterior means serve as point estimates; illustrates shrinkage toward zero under informative priors (Example 11.5).                             |
| 4.2.2 Effective Number of Parameters       | 10.3 Bayesian Generalized Linear Models                              | Defines effective parameter count \(p_D = \bar D - D(\bar\theta)\); discusses its role in penalizing model complexity.                                       |
| 4.2.3 Expected Residual Deviance           | 10.3 Bayesian Generalized Linear Models                              | Explains computation of \(\bar D\), the posterior-mean deviance, and its use in posterior predictive checks (see that chapter).                                |
| 4.2.4 DIC                                  | 10.3 Bayesian Generalized Linear Models                              | Derives DIC \(\!= \bar D + p_D\); compares DIC to AIC/BIC and highlights limitations when priors are highly informative.                                      |