--- title: "Linear Mixed Model Methodology" format: html: toc: true number-sections: true css: styles.css execute: warning: false message: false --- ```{r} #| include: false library(methods) if (requireNamespace("lme4", quietly = TRUE)) library(lme4) if (requireNamespace("lmerTest", quietly = TRUE)) library(lmerTest) load(file.path("..", "data", "ex125.RData")) ``` ## Model Form The book presents linear mixed models in the form $$ Y = X\beta + Zb + e, $$ where $Y$ is the response vector, $X$ is the fixed effect design matrix, $\beta$ is the vector of fixed effect parameters, $Z$ is the random effect design matrix, $b$ is the random effect vector, and $e$ is the residual error. The usual assumptions are $$ b \sim N(0, G), \quad e \sim N(0, R), \quad \operatorname{Cov}(b, e) = 0. $$ The marginal variance is therefore $$ V = ZGZ^T + R. $$ ## Fixed Effects When variance components are known or estimated, generalized least squares estimates the fixed effects as $$ \hat{\beta} = (X^T V^{-1} X)^{-} X^T V^{-1}Y. $$ The generalized inverse is needed for overparameterized model descriptions, which are common in the book because the SAS examples use reference-level constraints. ## Variance Components Example 2.4.2.2 uses the `ex125` split-plot data. The book reports maximum likelihood and restricted maximum likelihood variance component estimates for region, herd within region and drug, and residual error. ```{r} if (requireNamespace("lme4", quietly = TRUE)) { fit_ml <- lme4::lmer( Pcv ~ dose * Drug + (1 | Region / Drug), data = ex125, REML = FALSE ) if (requireNamespace("report", quietly = TRUE)) { report::report(fit_ml) } fit_reml <- lme4::lmer( Pcv ~ dose * Drug + (1 | Region / Drug), data = ex125, REML = TRUE ) if (requireNamespace("report", quietly = TRUE)) { report::report(fit_reml) } data.frame( component = as.data.frame(lme4::VarCorr(fit_ml))$grp, ml = round(as.data.frame(lme4::VarCorr(fit_ml))$vcov, 3), reml = round(as.data.frame(lme4::VarCorr(fit_reml))$vcov, 3) ) } ``` These estimates match the readable printed table in the book to rounding: `4.292`, `0.322`, and `1.747` for ML, and `5.151`, `0.387`, and `2.096` for REML. ## Reporting Fitted Models The package keeps `report` optional, but fitted mixed models can be interpreted with `report_mixed_model()` when the easystats reporting package is available. This reporting step does not refit the model or change the estimates; it only turns the fitted model object into a narrative summary. The helper wraps the same guarded call to `report::report()`. ```{r} if (requireNamespace("report", quietly = TRUE) && exists("fit_reml")) { report::report(fit_reml) } else { data.frame( workflow = "Optional model report", requirement = "Install report to generate easystats-style summaries" ) } ``` ## Post Hoc Inference with emmeans Model reports and marginal means answer different questions. A report gives a narrative summary of the fitted model, while `emmeans` focuses on model-based means and comparisons among scientifically meaningful factor levels. The package helper `emmeans_mixed_model()` wraps this same guarded `emmeans` workflow for users who want a package-level entry point. ```{r} if (requireNamespace("emmeans", quietly = TRUE) && exists("fit_reml")) { dose_emm <- emmeans::emmeans( fit_reml, ~ dose | Drug, lmer.df = "asymptotic" ) dose_pairs <- emmeans::contrast(dose_emm, method = "pairwise") as.data.frame(dose_emm) } else { data.frame( workflow = "Optional marginal means", requirement = "Install emmeans to compute post hoc model summaries" ) } ``` ```{r} if (exists("dose_pairs")) { as.data.frame(dose_pairs) } ``` The marginal means summarize model-predicted packed cell volume by dose within drug. The pairwise contrasts then test the high-versus-low dose difference within each drug group, which is a post hoc interpretation layer rather than a replacement for the fitted mixed model. ## Inference Inference for fixed effects in mixed models depends on variance component estimation and denominator degrees of freedom. The package examples use `lmerTest` for Satterthwaite-style tests where the book uses SAS `PROC MIXED`. Small numerical differences are expected across software versions and methods, especially for degrees of freedom and p-values.