---
title: "TriLLIEM"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{TriLLIEM}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
## Initial setup
To get started with using \textbf{TriLLIEM}, we install from GitHub via the following commands:
```{r setup, eval=FALSE}
devtools::install_github("KevinHZhao/TriLLIEM")
```
Then we load the package with
```{r load_package}
library(TriLLIEM)
```
## Simulating triad data
Simulating data in \textbf{TriLLIEM} is all performed through the \texttt{simulateData()} function.
For example, we can simulate $1,000$ case-triads from a population under HWE with a minor allele frequency of $30\%$, $S = 1.2$ and no other genetic factors by using the commands:
```{r simplesim}
set.seed(123) # setting random seed for reproducibility
matdat <-
simulateData(
nCases = 1000,
maf = 0.3,
mtCoef = c(1,1,1),
S = c(1, 1.2, 1.2^2)
)
```
More complex simulations can also be run.
For example, consider the following conditions:
* $1,000$ case-triads
* $1,000$ control-triads
* $30\%$ minor allele frequency
* $(C_1, C_2, C_4) = c(0.5,0.5,0.5)$
* $S=1.2$
* $I_{\textrm{M}} = 1.4$
* $30\%$ environmental exposure frequency in the population
* $V = 1.5$ with maternal imprinting by environment interactions
We can obtain the appropriate simulated data by running:
```{r complexsim}
impdat <-
simulateData(
nCases = 2000,
nControl = 1000,
maf = 0.3,
S = c(1, 1.2, 1.2^2),
V = c(1, 1.5, 1.5^2),
mtCoef = c(0.5, 0.5, 0.5),
Im = 1.4,
propE = 0.3,
Einteraction = "Im"
)
```
## Analyzing triad data
Analyzing triad data sets is performed through the \texttt{TriLLIEM()} function.
This function takes a vector of characters for effects to include in the model, \texttt{effects}.
It also allows users to specify the mating type model to use, and whether to use the stratified or non-stratified approach for environmental interactions (non-stratified by default).
With the first simulated data set above, an appropriate model would use an HWE mating type model with maternal effects only, hence we would run the command:
```{r simpleres}
matres <-
TriLLIEM(
mtmodel = "HWE",
effects = c("M"),
dat = matdat
)
```
The \texttt{TriLLIEM()} function returns an object of class \texttt{TriLLIEM}, which inherits from the \texttt{glm} object in R.
This means users can apply many of the usual functions for interpreting results from the \texttt{glm()} function.
To interpret \texttt{matres}, we would run:
```{r simpleinterp}
matres |> summary() |> coef()
```
The "Estimate" column for the $\textrm{M}$ row represents our point estimate for $s$, which we can exponentiate to obtain the equivalent risk parameter, $S$.
Based on these results, the model estimates $\hat{S} = 1.24$ with a p-value of $0.00163$.
Since the data were simulated for a population with $S = 1.2$, this level of significance is expected.
Likewise, for the \texttt{impdat} data, we would use an MaS mating type model, with $\textrm{S}, I_{\textrm{M}}, \textrm{E}:_{\textrm{M}}$ as genetic effects in the hybrid model.
```{r complexres}
impres <-
TriLLIEM(
mtmodel = "MaS",
effects = c("M", "Im", "E:Im"),
dat = impdat,
includeE = TRUE,
includeD = TRUE
)
impres |> summary() |> coef()
```
From the results for genetic effects, the model estimates $\hat{\textrm{S}} = 1.19, \hat{I}_{\textrm{M}} = 1.33, \hat{V} = 1.50$ with respective p-values $0.0594, 0.0199, 0.0269$.
Note that these results use the non-stratified approach.
For results equivalent to the stratified approach that are equivalent to \textbf{EMIM} and \textbf{Haplin}, we use:
```{r complexres_strat}
impres_strat <-
TriLLIEM(
mtmodel = "MaS",
effects = c("M", "Im", "E:Im"),
dat = impdat,
includeE = TRUE,
Estrat = TRUE,
includeD = TRUE
)
impres_strat |> summary() |> coef()
```
This model (which adds a $\textrm{E}:\text{M}$ effect due to stratification), has $\hat{S} = 1.18$, $\hat{I}_\textrm{M} = 1.34$, and $\hat{V} = 1.46$, with respective p-values $0.0135, 0.0237, 0.0121$.
Note that for comparison with \textbf{EMIM} and \textbf{Haplin}, we can only compare the $\hat{V}$ estimates, since the stratified approach does not produce equivalent estimates for the main effects.
## Likelihood ratio tests
\textbf{TriLLIEM} allows users to compute log-likelihoods and perform likelihood ratio tests with the `anova()` function, similar to typical glm's.
For example, we can construct two MS models for `example_dat4R`, one with effects $\textrm{C}, \textrm{M}$, and another that is nested with effects $\textrm{C}, \textrm{M}, I_{\textrm{M}}$.
```{r model_compare_fit}
model_1 <- TriLLIEM(dat = example_dat4R, effects = c("C", "M"))
model_2 <- TriLLIEM(dat = example_dat4R, effects = c("C", "M", "Im"))
```
Then to compare the two models using the likelihood ratio test, we simply run:
```{r model_anova}
anova(model_1, model_2)
```