The objective of this package is to summarize characteristics of
linear mixed effects models (LMMs) without data or a fitted model. Its
functions convert code for fitting nlme::lme()
and
lme4::lmer()
models into tables, equations, and visuals.
These outputs can be used for learning how to fit LMMs in R and
communicating about LMMs in presentations, manuscripts, and analysis
plans.
To install this package, you can use CRAN (the central R package repository).
In many settings, multiple samples are collected from the same individual or group (e.g., achievement scores of students from different schools and classrooms within schools; impact of different diets on the weights of individuals sampled once a month for six months). This type of data is often referred to as multilevel and can be analyzed using LMMs. LMMs can be fit in R using the following packages and functions.
library(lme4)
lme4::lmer(formula = outcome ~ fixed_effects + random_effects)
library(nlme)
nlme::lme(fixed = outcome ~ fixed_effects, random = random_effects)
The outcome and fixed effects are similar to that of a linear model.
lmer(formula = Y ~ 1 + X1 + X2 + random_effects)
lme(fixed = Y ~ 1 + X1 + X2, random = random_effects)
The random effects are specified with respect to groups or grouping factors (GFs), which are factor variables under which observations are nested or grouped. Using different notations, users can specify random intercepts, random slopes, and more to be estimated by the model.
# Random Intercept
lmer(formula = outcome ~ fixed_effects + (1|GF))
lme(fixed = outcome ~ fixed_effects, random = ~1|GF)
lme(fixed = outcome ~ fixed_effects, random = list(GF=~1))
# Random Intercept & Slope
lmer(formula = outcome ~ fixed_effects + (1 + X1|GF))
lme(fixed = outcome ~ fixed_effects, random = ~1 + X1|GF)
lme(fixed = outcome ~ fixed_effects, random = list(GF=~1 + X1))
extract_equation()
returns a ‘LaTeX’ model equation.
This function has three output options.
$$
\begin{aligned}
\operatorname{score}_{ij} &= \beta_{0} + \beta_{1}(\operatorname{age}) \\
&+ u_{0i} \\
&+ \epsilon_{ij} \\
u_{0i} &\sim N \left(0, \tau^2_{u_{0}} \right), \text{ for subject i = 1,} \dots \text{,a} \\
\epsilon_{ij} &\sim N \left(0, \sigma^2 \right), \text{ for Observation j = 1,} \dots \text{,n}
\end{aligned}
$$
To see the equation, copy and paste the output into any file that supports ‘LaTeX’ equations. Below is an example of what will happen if you paste the output from above into ‘R Markdown’.
\[ \begin{aligned} \operatorname{score}_{ij} &= \beta_{0} + \beta_{1}(\operatorname{age}) \\ &+ u_{0i} \\ &+ \epsilon_{ij} \\ u_{0i} &\sim N \left(0, \tau^2_{u_{0}} \right), \text{ for subject i = 1,} \dots \text{,a} \\ \epsilon_{ij} &\sim N \left(0, \sigma^2 \right), \text{ for Observation j = 1,} \dots \text{,n} \end{aligned} \]
[1] "$$\n\\begin{aligned}\n \\operatorname{score}_{ij} &= \\beta_{0} + \\beta_{1}(\\operatorname{age}) \\\\\n &+ u_{0i} \\\\\n &+ \\epsilon_{ij} \\\\\n u_{0i} &\\sim N \\left(0, \\tau^2_{u_{0}} \\right), \\text{ for subject i = 1,} \\dots \\text{,a} \\\\\n \\epsilon_{ij} &\\sim N \\left(0, \\sigma^2 \\right), \\text{ for Observation j = 1,} \\dots \\text{,n} \n\\end{aligned}\n$$\n"
extract_variables()
returns a data frame with
information on the variables in the model. The columns of the data frame
include:
Effect
- whether the effect is random or fixedGroup
- group or grouping factor associated with random
effectsTerm
- notation used to include the variable in the
modelDescription
- description of the Term
Parameter
- parameter estimated when the model is
fitNote: The data frames for the examples below are
displayed using kbl()
from the kableExtra package. Below
are examples of how this function works for different models.
By default, all predictor variables are assumed to be numeric.
Effect | Group | Term | Description | Parameter |
---|---|---|---|---|
fixed | 1 | intercept | (Intercept) | |
fixed | sex | numeric | sex | |
fixed | time | numeric | time | |
fixed | I(time^2) | numeric | I(time^2) | |
random | ID | ~1+…+…|ID | ID-specific intercepts | SD (Intercept) |
random | ID | ~…+time+…|ID | ID-specific slopes for time | SD (time) |
random | ID | ~…+…+I(time^2)|ID | ID-specific slopes for I(time^2) | SD (I(time^2)) |
random | ID | Cor (Intercept,time) | ||
random | ID | Cor (Intercept,I(time^2)) | ||
random | ID | Cor (time,I(time^2)) | ||
random | SD (Residual) |
To include one or more categorical predictors, use
cat_vars
and cat_vars_nlevels
. In addition,
fixed and random intercepts are added to the data frame unless
explicitly excluded (e.g., “age-1” and “0+age”).
extract_variables(model = "lme(Score~type+age*sex,random=list(School=pdDiag(~type),Class=~1))",
cat_vars = c("type", "sex"),
cat_vars_nlevels = c(3, 2))
Effect | Group | Term | Description | Parameter |
---|---|---|---|---|
fixed | 1 | intercept | (Intercept) | |
fixed | type | categorical | typeB | |
fixed | type | categorical | typeC | |
fixed | age | numeric | age | |
fixed | sex | categorical | sexB | |
fixed | age*sex | interaction | age:sexB | |
random | School | School=pdDiag(~1+…) | School-specific intercepts | SD (Intercept) |
random | School | School=pdDiag(~…+type) | School-specific slopes for typeB | SD (typeB) |
random | School | School=pdDiag(~…+type) | School-specific slopes for typeC | SD (typeC) |
random | Class (within School) | Class=~1 | Class-specific intercepts | SD (Intercept) |
random | SD (Residual) |
There are two ways to run extract_structure()
to get an
image of the multilevel data structure.
lme()
or
lmer()
code.Below are examples of both approaches for one, two, and three groups.
To specify the number of Subjects, use gf_nlevels
.
To remove the labels for the levels on the left side of the image,
use label_levels
.
When there are two groups, they can either be crossed (e.g., every level of Worker is observed for every level of Machine)
or nested (e.g., class is nested within school).
These figures can also be created using the following code.
When there are three groups, they can also be crossed or nested.
extract_structure(n_gf = 3,
gf_description = "crossed",
gf_names = c("District", "School", "Class"),
gf_nlevels = c(8, 15, 5))
The index used for the highest-level group or the group that entered
the model
or gf_names
first can be left as
"i"
(see above) or redefined using
gf3_index
.
extract_structure(n_gf = 3,
gf_description = "nested",
gf_names = c("District", "School", "Class"),
gf_nlevels = c(8, 15, 5),
gf3_index = "q")
There may also be a combination of nested and crossed groups. For example, one group may be crossed with two nested groups (i.e. crossed with nested).
extract_structure(n_gf = 3,
gf_description = "crossed with nested",
gf_names = c("GF1", "GF2", "GF3"))
Another example is when two crossed groups are nested within another group (i.e. crossed within nested).
If you wish to edit the output beyond what is offered by
extract_structure()
, set export_type
=
"text"
. After editing the text, the figure can be created
using grViz()
in the DiagrammeR package.
The current version of this package does not work for all possible
lme()
or lmer()
models. However, its functions
do work for:
# One
(1|GF)
random=~1|GF
random=list(GF=~1)
# Two
(1|GF1)+(1|GF2)
(1|GF1/GF2)
random=~1|GF1/GF2
# Three
(1|GF1/GF2/GF3)
(1|GF1)+(1|GF2/GF3)
random=list(GF1=~1,GF2=~1,GF3=~1)
# One
(1|GF)
# Two
(1+X1|GF)
(X1|GF)
# Three
(1+X1+X2|GF)
(X1+X2|GF)
# Four
(1+X1+X2+X3|GF)
(X1+X2+X3|GF)
"*"
notation# Two-way
X1 + X2 + X1:X2
X1*X2
# Three-way
X1 + X2 + X3 + X1:X2 + X1:X3 + X2:X3 + X1:X2:X3
X1*X2*X3
# Four-way
X1 + X2 + X3 + X3 + X1:X2 + X1:X3 + X1:X4 + X2:X3 + X2:X4 +
X1:X2:X3 + X1:X2:X4 + X1:X4:X3 + X4:X2:X3 + X1:X2:X3:X4
X1*X2*X3*X4 # does not work
Here are some additional tips for using this package:
model
input is code for fitting a
lmer()
model, only the first input is used and it is
assumed to be the formula
input. While other information
can be included, it will be ignored.# works
model = "lmer(formula = outcome ~ fixed_effects + random_effects)"
model = "lmer(outcome ~ fixed_effects + random_effects)"
model = "lmer(outcome ~ fixed_effects + random_effects, data, ...)"
# does not work
model = "lmer(data, formula = outcome ~ fixed_effects + random_effects)"
model = "lmer(data, outcome ~ fixed_effects + random_effects)"
model
input is code for fitting a
lme()
model, only the fixed
and
random
inputs are used. The fixed
input is
assumed to be the first input and the random
input must
include random =
. Other information can be included, but it
will be ignored.# works
model = "lme(fixed = outcome ~ fixed_effects, random = random_effects, data, ...)"
model = "lme(outcome ~ fixed_effects, random = random_effects)"
# does not work
model = "lme(outcome ~ fixed_effects, random_effects, data, ...)"
model = "lme(random = random_effects, fixed = outcome ~ fixed_effects)"