In spatial data analysis, spatial effects play an important role in understanding patterns and relationships between locations. Spatial effects arise when the value of a variable in one location is affected by the value of the same variable in another location. This can happen due to natural processes (e.g. the spread of air pollution) or socio-economic phenomena (e.g. house prices in one area affect prices in surrounding areas).
This vignette aims to show how this package can be used to add spatial effects in Small Area Estimation Hierarchical Bayesian (SAEHB) modeling.
The hbsaems
package can handle spatial dependence
effects (Spatial Autocorrelation) by using the :
Conditional Autoregressive (CAR) model is a spatial model that describes the relationship between adjacent areas through a local condition approach. In this model, the value of a variable in a location is directly influenced by the value of the same variable in the surrounding areas. CAR uses adjacency matrix to determine the relationship between locations.
library(hbsaems)
# Load data
data("data_fhnorm")
data <- data_fhnorm
head(data)
# Load adjacency matrix
data("adjacency_matrix_car")
adjacency_matrix_car
The data used is synthetic data in the hbsaems
package
which can be used to help understand the use of the package
hbm
Functionmodel_car <- hbm(
formula = bf(y ~ x1 + x2 + x3), # Formula model
hb_sampling = "gaussian", # Gaussian family for continuous outcomes
hb_link = "identity", # Identity link function (no transformation)
re = ~(1|group),
sre = "sre", # Spatial random effect variable
sre_type = "car",
car_type = "icar",
M = adjacency_matrix_car,
data = data) # Dataset
summary(model_car)
To apply spatial effects to the data, we need a variable that maps
observations to spatial locations (sre
). If not specified,
each observation is treated as a separate location. Therefore, in this
case the dimension of the adjacency matrix of locations should be equal
to the number of observations. It is recommended to always specify a
grouping factor to allow for handling of new data in postprocessing
methods.
Users must specify the type of spatial random effects used in the
model (sre_type
), if they choose car
type,
users must also specify the CAR structure type (car_type
).
Currently implemented are “escar” (exact sparse CAR), “esicar” (exact
sparse intrinsic CAR), “icar” (intrinsic CAR), and “bym2”. When the
car_type
value is not specifically defined, its defaults to
escar
.
We also need information about the adjacency matrix (M
).
In the spatial adjacency matrix, a value of 1 indicates that two spatial
groups are neighbors based on their proximity to the city center, while
a value of 0 indicates no adjacency relationship. The row names of the
adjacency matrix must be the same as the grouping names in the variable
sre
. In the data that will be used this time, the spatial
grouping names are given the numbers 1, 2, 3, 4, and 5.
hbm_distribution
FunctionIn addition to the hbm
function, spatial effects can be
applied by calling the hbm_distribution
function as in
hbm_betalogitnorm
, hbm_binlogitnorm
, and
hbm_lnln
.
The Simultaneous Autoregressive Model (SAR) handles spatial effects
by defining dependence simultaneously across the study area. SAR uses a
spatial weight matrix to describe the relationship between regions and
can be applied in various forms, such as Spatial Lag Model (SLM) or
Spatial Error Model (SEM), depending on how spatial effects affect the
dependent variable or error in the model. M
can be either
the spatial weight matrix itself or an object of class listw or nb, from
which the spatial weighting matrix can be computed.
The lagsar structure implements SAR of the response values:
\[y = \rho W y + \eta + e\]
The errorsar structure implements SAR of the residuals:
\[y = \eta + u, \quad u = \rho W u + e\]
In the above equations, \(\eta\) is the predictor term and \(e\) are independent normally or t-distributed residuals. Currently, only families gaussian and student support SAR structures.
model_sar <- hbm(
formula = bf(y ~ x1 + x2 + x3), # Formula model
hb_sampling = "gaussian", # Gaussian family for continuous outcomes
hb_link = "identity", # Identity link function (no transformation)
re = ~(1|group),
sre_type = "sar",
sar_type = "lag",
M = spatial_weight_sar,
data = data) # Dataset
summary(model_sar)
In SAR there is no need for sre
information, SAR only
requires information related to sre_type
,
sar_type
, and M
.sar_type
is type
of the SAR structure. Either lag
(for SAR of the response
values) or error
(for SAR of the residuals). When the
sar_type
value is not specifically defined, its defaults to
lag
.
The M
matrix in SAR is a spatial weighting matrix that
shows the spatial relationship between locations with certain
weights.
The hbsaems package supports spatially-informed small-area estimation through both Conditional Autoregressive (CAR) and Simultaneous Autoregressive (SAR) models, enabling improved precision by capturing spatial dependence among areas. CAR emphasizes local adjacency via binary neighborhood structure, while SAR allows varying spatial influence through a weighted matrix. By leveraging spatial correlation, CAR and SAR enhance the reliability of estimates, particularly in data-sparse regions, making them powerful tools for small-area estimation.