Introduction

The agfh package implements the Agnostic Fay-Herriot small area model (AGFH). It also contains tools for fitting the traditional Fay-Herriot model with similar syntax for convenience. This document gives a detailed treatment of a typical use case, analyzing toy data with both the agnostic and traditional models.

The traditional Fay-Herriot approach models areal units \(m=1,\dots,M\) in a hierarchical fashion. A top-level sampling distribution corresponds to the observed response \(Y_m\), and a linking distribution governs the latent variable \(\theta_m\). The top-level precisions \(D_m\) are typically assumed to be known. \begin{equation}\label{eq:ebfay-herriot} \begin{split} &Y_m \mid \theta_m \sim N(\theta_m, D_m^{-1})\\ &\theta_m \mid \beta, \lambda \sim N(x_{m}^\top \beta, \; \lambda). \end{split} \end{equation} Its success is evident in its widespread use in small area estimation, where the goal is typically to estimate \(\theta_m\). Both frequentist and hierarchical Bayesian estimation are common.

The AGFH model replaces the Normality assumption in the sampling model with a more flexible distribution learned from the data.

Agnostic Fay-Herriot Model

In the AGFH model, the response is instead given by \begin{align} Y_{m} = \theta_{m} + D_{m}^{-1/2} U_{m}, \quad m = 1, \ldots, M, \label{eq:Ym_i} \end{align} where the \(U_m\) are independent random variables whose distribution is governed by the following equations \begin{align} & \int_{-\infty}^{\infty} \exp \Bigl\{ -\frac{u^{2}}{2} + g(u) \Bigr\} \, du = K, \label{eq:gIntegral} \\ & \int_{-\infty}^{\infty} u \exp \Bigl\{ -\frac{u^{2}}{2} + g(u) \Bigr\} \, du = 0, \label{eq:gmean} \\ & K^{-1} \int_{-\infty}^{\infty} u^{2} \exp \Bigl\{ -\frac{u^{2}}{2} + g(u) \Bigr\} \, du = 1. \label{eq:gvar} \end{align} The function \(g(\cdot)\) is modeled using a Gaussian process and learned from the data.
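As a sanity check, the constraints recover the traditional model when \(g\) vanishes:

\begin{equation*}
g \equiv 0 \;\Longrightarrow\; K = \int_{-\infty}^{\infty} e^{-u^{2}/2}\,du = \sqrt{2\pi},
\end{equation*}

in which case \(U_m \sim N(0,1)\) and \(Y_m \mid \theta_m \sim N(\theta_m, D_m^{-1})\), exactly as in the traditional Fay-Herriot sampling model.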

Example

The example below demonstrates common functionality in agfh.

Data Generation

The methods null_gen, beta_err_gen, and gamma_err_gen generate data with sampling errors distributed according to the Normal, Beta, and Gamma distributions, respectively. Note that in the Beta and Gamma cases the sampling errors are transformed so that their mean and variance match the first two moments of the traditional model.

set.seed(2023)
M <- 50
p <- 3
D <- rep(0.1, M)
lamb <- 1/2
dat <- beta_err_gen(M, p, D, lamb, 1/12, 1/6)
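The moment matching in the Beta case can be sketched as follows. This is an illustrative reimplementation, not the package's internal code, and it assumes the last two arguments of beta_err_gen are the Beta shape parameters:

```r
# Sketch of the moment matching (assumed, not agfh internals): center and
# scale raw Beta(a, b) draws so the resulting sampling errors have mean 0
# and variance D_m^{-1}, matching the traditional model's first two moments.
set.seed(2023)
M <- 50
D <- rep(0.1, M)
a <- 1/12
b <- 1/6
B <- rbeta(M, a, b)
beta.mean <- a / (a + b)                              # mean of Beta(a, b)
beta.sd <- sqrt(a * b / ((a + b)^2 * (a + b + 1)))    # sd of Beta(a, b)
err <- (B - beta.mean) / beta.sd / sqrt(D)            # mean 0, variance 1/D_m
```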


AGFH Sampling

The AGFH sampler can be configured as in a typical Bayesian analysis. The number of posterior samples requested (n.mcmc) and the thinning rate (n.thin) determine how long the sampler runs; below it will run for n.mcmc*n.thin iterations. The latent variance \(\lambda\) has an Inverse-Gamma prior, and its hyperparameters can be specified. As in the hierarchical Bayes model, \(\beta\) has a flat (improper) prior.

n.mcmc <- 1e2
n.thin <- 20
S.init <- seq(-10,10,length.out=M)
thet.var.prior.a <- 1
thet.var.prior.b <- 1
mh.scale.thetas <- 1

The GP governing \(g(\cdot)\) employs a radial basis kernel with given hyperparameters. Once complete, the sampler returns the initial values, hyperparameters, and posterior samples.
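The hyperparameters kern.a0 and kern.a1 control the scale and bandwidth of the kernel, and kern.fuzz adds a small nugget for numerical stability. One common parameterization of such a kernel (the package's exact form may differ) is

\begin{equation*}
k(s, s') = a_0 \exp\bigl( -a_1 (s - s')^2 \bigr) + \texttt{fuzz} \cdot \mathbb{1}\{s = s'\}.
\end{equation*}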

mhg_sampler <- make_agfh_sampler(X=dat$X, Y=dat$Y, D=dat$D,
                                 var_gamma_a=thet.var.prior.a, var_gamma_b=thet.var.prior.b,
                                 S=S.init,
                                 kern.a0=0.1,
                                 kern.a1=0.1,
                                 kern.fuzz=1e-2)
init <- list(beta=rep(0,p),
             theta=predict(lm(dat$Y ~ dat$X), data.frame(dat$X)),
             theta.var=1,
             gamma=rep(0, length(S.init)))

mhg.out <- mhg_sampler(init, n.mcmc, n.thin, mh.scale.thetas, trace=FALSE)


The final state of \(g(\cdot)\) informs our understanding of the sampling errors. In this example, the model captures the distribution of the sampling errors.

Hierarchical Bayes FH Sampling

agfh also contains tools to estimate the traditional Fay-Herriot model using hierarchical Bayes. In this case, the parameters \(\beta, \lambda\) are given the joint prior \(p(\beta, \lambda) \propto \operatorname{IG}(\lambda; a, b)\), i.e., flat on \(\beta\) and Inverse-Gamma on \(\lambda\).
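Under this prior, the Gibbs sampler's full conditionals take a standard conjugate form; a sketch for reference (recall that \(D_m\) is a precision):

\begin{align*}
\beta \mid \theta, \lambda &\sim N\bigl( (X^\top X)^{-1} X^\top \theta, \; \lambda (X^\top X)^{-1} \bigr), \\
\lambda \mid \theta, \beta &\sim \operatorname{IG}\Bigl( a + \tfrac{M}{2}, \; b + \tfrac{1}{2} \textstyle\sum_{m=1}^{M} (\theta_m - x_m^\top \beta)^2 \Bigr), \\
\theta_m \mid \beta, \lambda, Y_m &\sim N\Bigl( \frac{D_m Y_m + \lambda^{-1} x_m^\top \beta}{D_m + \lambda^{-1}}, \; \bigl(D_m + \lambda^{-1}\bigr)^{-1} \Bigr).
\end{align*}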

params.init <- list(beta=rep(0,p),
                    theta=predict(lm(dat$Y ~ dat$X), data.frame(dat$X)),
                    theta.var=1)
sampler <- make_gibbs_sampler(dat$X, dat$Y, dat$D, thet.var.prior.a, thet.var.prior.b)
gibbs.out <- sampler(params.init, n.mcmc, n.thin)


Frequentist/Empirical Bayes

The Fay-Herriot model also admits straightforward estimation via frequentist/empirical Bayes. The EBLUP estimator of \(\theta_m\) may be found using RM_theta_eblup(...). By default, this employs a method-of-moments estimator of \(\lambda\). agfh also provides several other estimators of \(\lambda\), including an adjusted REML estimate.

theta.ests.freqeb <- RM_theta_eblup(dat$X, dat$Y, dat$D)

adj_reml_thvar <- adj_resid_likelihood_theta_var_maker(dat$X, dat$Y, dat$D)
theta.var.est.adj.reml <- theta_var_est_grid(adj_reml_thvar)
theta.ests.adj.reml <- RM_theta_eblup(dat$X, dat$Y, dat$D, theta.var.est.adj.reml)

theta.ests.lm <- predict(lm(dat$Y ~ dat$X), data.frame(dat$X))
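The shrinkage underlying the EBLUP above can be sketched as follows. This is an illustrative reimplementation under the model's precision parameterization, not the package's code; it uses a plain OLS fit for \(\beta\), whereas RM_theta_eblup may use a weighted fit:

```r
# Illustrative EBLUP sketch (not agfh internals). Recall D_m is a precision,
# so the direct estimate Y_m has variance 1/D_m; the EBLUP shrinks Y_m
# toward the regression fit x_m' beta.hat with precision weights.
eblup_sketch <- function(X, Y, D, lambda.hat) {
  beta.hat <- solve(t(X) %*% X, t(X) %*% Y)  # OLS; the package may weight by D
  fit <- as.vector(X %*% beta.hat)
  (D * Y + (1 / lambda.hat) * fit) / (D + 1 / lambda.hat)
}

# Toy usage: with an intercept-only design and D_m = lambda = 1,
# each estimate is halfway between Y_m and mean(Y).
X.toy <- matrix(1, 4, 1)
Y.toy <- c(0, 2, 4, 6)
ests <- eblup_sketch(X.toy, Y.toy, D = rep(1, 4), lambda.hat = 1)
```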

Estimating the Latent Variable

Below we summarize the output from the various estimation methods.

As evidenced below, a more accurate treatment of the sampling errors improves estimates of \(\theta_m\).

Table: MSEs For Each Method

| AGFH | HB   | LM   | FreqEB |
|------|------|------|--------|
| 0.6  | 1.02 | 1.24 | 0.86   |