---
title: "MRQAP"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{MRQAP}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
MRQAP tests associations across the dyads of a network. Questions generally ask whether one relation is associated with another relation in multi-relational networks (e.g. *do marriage ties correlate with business ties?*), or whether similarity on dyadic features predict tie formation (e.g. *are same-race pairs more likely to be friends?* or generally *what shared characteristics predict friendship nominations among adolescents in an American high school?*). MRQAP is an extension of the Mantel test which uses node permutation to accommodate issues of non-independence that bias traditional regression analysis estimates using (sociocentric) network data. To illustrate how to use MRQAP with `ideanet`, we'll use the Faux Mesa High dataset native to the package.
Preparing to use `ideanet`'s MRQAP module is easy: all we need is the `igraph` object produced using `netwrite`, as this object contains the various node-level attributes contained in a user's original nodelist as well as the node-level measurements produced by `netwrite`.
```{r nw_fauxmesa, warning = FALSE}
library(ideanet)
nw_fauxmesa <- netwrite(nodelist = fauxmesa_nodes,
node_id = "id",
i_elements = fauxmesa_edges$from,
j_elements = fauxmesa_edges$to,
directed = FALSE,
net_name = "faux_mesa",
shiny = TRUE)
```
When we look at the original nodelist passed to `netwrite` (`fauxmesa_nodes`), we see that we have information about each student's grade level, their race/ethnicity, and their sex. This information exists at the *individual level* in that it pertains to individual students. However, when using MRQAP analysis, we are generally interested in how *dyadic* measures predict outcomes. In other words, we're interested in whether similarities or differences between nodes lead to outcomes, both of which are understood at the level of ego-alter relationships. While some users may have dyadic measures stored in their edgelist ahead of time, typically these measures are something one has to generate. The `qap_setup` function in `ideanet` allows users to quickly take node-level attributes and generate dyad-level comparisons with only a few arguments.
These argument are:
* `net`: An `igraph` object, preferably one generated using `netwrite`. `qap_setup` also supports pre-generated `network` object should they be available to users, but users must ensure that such objects are properly sorted to match node-level features.
* `variables`: A character vector of variable names that are available in the `igraph`/`network` object. These should match the names of node/vertex attributes in the object passed to `net`.
* `methods`: A character vector specifying the methods that should be applied to each item listed in `variables`. Values in `methods` must be specified in correct order such that the first item in `methods` applies to the first item in `variables`, the second item in `methods` the second item in `variables`, and so on. Each item in `methods` must be one of the following:
* `"multi_category"`: Applies to categorical variables only. Creates as many variables as there are unique values; each variable signals if both ego and alter have the given value.
* `"reduced_category"`: Applies to categorical variables only. Creates a single variable that signals if alter and ego have the same value.
* `"both"`: Applies to categorical variables only. Computes both the `"multi_category"` and `"reduced_category"` methods.
* `"difference"`: Computes the difference in input value between ego and alter. This method will produce two measures for each variable to which it is applied: the difference between ego and alter's respective values and the absolute value of this difference. This is generally used to estimate dyadic difference effects, such as whether differences in popularity correlate with being friends.
* `directed`: Specify if edges in the network should be interpreted as directed or undirected. Expects `TRUE` or `FALSE` logical.
For the Faux Mesa High network, let's imagine one wants to know whether same sex, race, or grade levels affect the likelihood that adolescents nominate each other as friends.
For similarity in sex, we'll want to apply the `reduced_category` method to the `sex` variable. For combinations of the `race`, we'll apply the `multi_category` method. And for `grade`, which we'll treat as a continuous variable, we'll use the `difference` method.
Given the importance of ensuring that each element in the `variables` argument corresponds to the correct element in the `methods` argument, users may find it helpful to store both vectors in a data frame prior to running `qap_setup`. This allows us to double-check that all elements appear in the correct order.
```{r var_meth, eval = FALSE}
var_methods <- data.frame(variable = c("sex", "race", "grade"),
method = c("reduced_category", "multi_category", "difference"))
var_methods
```
```{r var_meth_kable, echo = FALSE}
var_methods <- data.frame(variable = c("sex", "race", "grade"),
method = c("reduced_category", "multi_category", "difference"))
knitr::kable(var_methods)
```
Now that we've ensured everything's in the right order, let's use `qap_setup`:
```{r qap_setup}
faux_qap_setup <- qap_setup(net = nw_fauxmesa$faux_mesa,
variables = var_methods$variable,
methods = var_methods$method,
directed = FALSE)
```
`qap_setup` produces a list object containing a nodelist, an edgelist containing newly-calculated dyadic measures, and an `igraph` object with these dyadic measures. Let's quickly inspect our new edgelist to see the kinds of variables we've just created:
```{r qap_edgelist, eval = FALSE}
head(faux_qap_setup$edges)
```
```{r qap_edgelist_kable, echo = FALSE}
knitr::kable(head(faux_qap_setup$edges))
```
We now have several new variables. Variables appended with `_ego` and `_alter` represent the original values for each edge's ego and alter, respectively, as determined in our original nodelist. Our `same_sex` and `both_race` variables are binary indicators of whether ego and alter have the same value for a particular attribute. By contrast, `diff_grade` is a continuous measure showing how many grade levels ego and alter are apart from each other. Note that values in `diff_grade` may be positive or negative depending on whether or not the node designated as ego in the edgelist is in a higher grade level than the node designated as alter. Signed differences of this sort can be useful when working with directed networks — you can imagine younger students being more likely to nominate older students as friends than vice versa. Given that ties in our network are undirected, however, the order in which egos and alters appear in our edgelist have no real meaning, and whether values in `diff_grade` are positive or negative is largely a matter of chance. Consequently, we're better off using the absolute value of ego and alter's grade level in our analysis, as this measure is agnostic to the order in which nodes are presented in our edgelist. `qap_setup` provides us with this absolute value automatically — here it is stored in `abs_diff_grade`.
With our variables of interest in hand, we turn to the MRQAP analysis itself. `ideanet`'s `qap_run` function seamlessly integrates output from `netwrite` and `qap_setup`. However, in its current iteration **users must select variables produced by `qap_setup`**. Arguments for `qap_run` include:
* `net`: An `igraph` object containing the variables of interest. This `igraph` object should be one produced by `qap_setup`.
* `dependent`: A single categorical or continuous variable name to use as a dependent variable. This variable *must* be produced by `qap_setup`. If the dependent is set to `NULL` (default), the function will predict the existence of a non-weighted tie.
* `variables`: A vector of variable names to use as independent variables. These variables *must* be produced by `qap_setup`.
* `directed`: Specify if the edges should be interpreted as directed or undirected. Expects `TRUE` or `FALSE` logical.
* `reps`: Select the number of permutations. Relevant to null-hypothesis testing only. Default is set to `500`.
* `family`: The functional form the model should follow. Currently available are `"linear"` and `"binomial"`.
* **NOTE:** Binomial MRQAP can take a while to simulate; for exploratory purposes, it is recommended to stick to a linear functional form.
**NOTE:** If the input network is multi-relational, `qap_run` will automatically merge duplicated rows. This is necessary given that, at this time, the MRQAP wrapper does not elegantly handle repeated observations. When merging rows, it will take the *sum* of numeric edge attributes, and a *random* value of character edge attributes. If the user is interested in the association between two types of ties (e.g., marriage ties predicting business ties), we recommend that they create a set of binary edge attributes using `qap_setup`.
Let's use `qap_run` using the default 500 permutations. While we can significantly decrease the number of permutations to allow for lower computation times, this may make confidence intervals in our results less interpretable. As far as variables go in this example, we'll include `same_sex`, `both_race_White`, and `abs_diff_grade`.
```{r qap_run}
faux_qap <- qap_run(net = faux_qap_setup$graph,
dependent = NULL,
variables = c("same_sex", "both_race_White", "abs_diff_grade"),
directed = FALSE,
family = "linear")
```
`qap_run` returns a list of two objects. The first (`covs_df`) is a data frame summarizing model results in a way resembling a traditional regression output. The second (`mods_df`) is a data frame providing the number of dyadic observations on which the model is computed.
```{r qap_results, eval = FALSE}
faux_qap$covs_df
```
```{r qap_results_kable, echo = FALSE}
knitr::kable(faux_qap$covs_df)
```
```{r mods_df, eval = FALSE}
faux_qap$mods_df
```
```{r mods_df_kable, echo = FALSE}
knitr::kable(faux_qap$mods_df)
```
Assuming a p-value of .1 or less indicates some statistical significance, our results here tell us that students are more likely to nominate students of the same sex as friends, holding all else constant. It is recommended that researchers apply the same model with different amounts of draws to confirm the confidence intervals associated with each variable.
The above example shows that setting up and using MRQAP in `ideanet` is fast and easy, allowing users to quickly explore a variety of model specifications with their own data.