The recommended workflow of DImulti() analyses

library(DImodelsMulti)
library(dplyr)
library(nlme)
library(ggplot2)

This vignette aims to give a rough outline for the process to be followed when examining multivariate and/or repeated measures BEF relationship study data using the DImulti() function from the DImodelsMulti R package.

We will use the dataset simMVRM, which is included in the package, to illustrate the workflow of the package.

Examining the data

We always start by looking at our data using View() or head(), to make sure it is as expected and includes all required information.

head(simMVRM)

#>   plot p1 p2 p3 p4         Y1          Y2        Y3 time
#> 1    1  1  0  0  0 -3.2201614 -0.28424570 4.0353997    1
#> 2    2  1  0  0  0  0.2166701  0.90917719 0.1719544    1
#> 3    3  1  0  0  0 -2.1709989  0.04832118 0.6787839    1
#> 4    4  0  1  0  0  5.3908779  4.08309086 6.5332521    1
#> 5    5  0  1  0  0  5.2733174  4.29488262 6.2761877    1
#> 6    6  0  1  0  0  4.1985826  3.57457447 7.0207313    1

We can then look at some summarising metrics, such as the mean of each ecosystem function, separated by time as below, which can help us know what to expect in the analysis, in this case we see that ecosystem functioning improves at time point 2 for the average multifunctionality index, MFindex, but the function <span class”R”>Y2 performs better at time 1.

simMVRM_group <-  dplyr::summarise(dplyr::group_by(simMVRM, time),
                                   Y1 = mean(Y1),
                                   Y2 = mean(Y2),
                                   Y3 = mean(Y3),
                                   MFindex = mean(Y1 + Y2 + Y3))
simMVRM_group

#> # A tibble: 2 × 5
#>   time     Y1    Y2    Y3 MFindex
#>         
#> 1 1      2.21  3.07  5.53    10.8
#> 2 2     10.3   2.11  7.90    20.3

We can also produce plots to the same effect.

hist(simMVRM[which(simMVRM$time == 1), ]$Y1, main = "Y1 at time 1", xlab = "Y1")

Fitting the first model

We use the main function of the DImodelsMulti R package, DImulti(), to fit a repeated measures Diversity-Interactions model to the data.

It is recommended to begin with the simplest model options and increase complexity as model selection/comparisons permit. The simplest model available to be fit by the package is the structural model, “STR”, which only includes an intercept for each time and ecosystem function. \[ y_{kmt} = \beta_{0kt} + \epsilon_{kmt} \]

modelSTR <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "STR", method = "ML")

As our dataset is multivariate and in a wide format, we specify our response columns through the parameter y, “NA” through the first index of the vector eco_func, and select the unstructured (“UN”) autocorrelation structure for our ecosystem functions through the second index of eco_func. The data also contains multiple time points, so we pass the column holding the time point indicator through the parameter time, along with the chosen autocorrelation structure, in this case we use compound symmetry, “CS”. To facilitate grouping the recorded responses, a column index for the unique identifier of the experimental units is passed through unit_IDs. We choose to use the maximum likelihood (“ML”) estimation method as we intend to compare models with differing fixed effects.

We can use print() or summary() to view information on the model in our console.

print(modelSTR)

#> Note: 
#> Method Used = ML 
#> Correlation Structure Used =

#> UN (`?nlme::corSymm()`) @ CS (`?nlme::corCompSymm()`)

#> 
#>  Structure Model
#> Theta value(s) = 1
#> 
#> Generalized least squares fit by maximum likelihood
#>   Model: value ~ 0 + func:time 
#>       AIC       BIC    logLik 
#> 10020.215 10059.477 -5003.108 
#> 
#>  Multivariate Correlation Structure: General
#>  Formula: ~0 | plot 
#>  Parameter estimate(s):
#>  Correlation: 
#>   1     2    
#> 2 0.603      
#> 3 0.031 0.064
#> 
#>  Repeated Measure Correlation Structure: Compound symmetry
#>  Formula: ~0 | plot 
#>  Parameter estimate(s):
#>       Rho 
#> 0.3287314 
#> 
#> 
#> Table: Fixed Effect Coefficients
#> 
#>                Beta      Std. Error   t-value   p-value      Signif 
#> -------------  --------  -----------  --------  -----------  -------
#> funcY1:time1   +2.210    0.176        12.585    5.172e-35    ***    
#> funcY2:time1   +3.068    0.176        17.474    9.193e-64    ***    
#> funcY3:time1   +5.530    0.176        31.492    2.749e-177   ***    
#> funcY1:time2   +10.287   0.176        58.583    0            ***    
#> funcY2:time2   +2.106    0.176        11.992    4.702e-32    ***    
#> funcY3:time2   +7.900    0.176        44.989    2.22e-306    ***    
#> 
#> Signif codes: 0-0.001 '***', 0.001-0.01 '**', 0.01-0.05 '*', 0.05-0.1 '+', 0.1-1.0 ' '
#> 
#> Degrees of freedom: 2016 total; 2010 residual
#> Residual standard error: 3.213922 
#> 
#> $Multivariate
#> Marginal variance covariance matrix
#>        [,1]    [,2]    [,3]
#> [1,] 7.1668 3.59190 0.20620
#> [2,] 3.5919 4.94460 0.34982
#> [3,] 0.2062 0.34982 6.09090
#>   Standard Deviations: 2.6771 2.2236 2.468 
#> 
#> $`Repeated Measure`
#> Marginal variance covariance matrix
#>         [,1]    [,2]
#> [1,] 13.5390  4.4509
#> [2,]  4.4509 13.5390
#>   Standard Deviations: 3.6796 3.6796 
#> 
#> $Combined
#> Marginal variance covariance matrix
#>          Y1:1     Y1:2     Y2:1     Y2:2     Y3:1     Y3:2
#> Y1:1 10.32900  3.39560  6.23260  2.04890  0.32238  0.10598
#> Y1:2  3.39560 10.32900  2.04890  6.23260  0.10598  0.32238
#> Y2:1  6.23260  2.04890 10.32900  3.39560  0.65843  0.21645
#> Y2:2  2.04890  6.23260  3.39560 10.32900  0.21645  0.65843
#> Y3:1  0.32238  0.10598  0.65843  0.21645 10.32900  3.39560
#> Y3:2  0.10598  0.32238  0.21645  0.65843  3.39560 10.32900
#>   Standard Deviations: 3.2139 3.2139 3.2139 3.2139 3.2139 3.2139

From this evaluation, we can see that each of our coefficients are significant at an alpha significance level of 0.05 ($\alpha = 0.05$). We can also see the variance covariance matrices for our repeated measures and multiple ecosystem functions, along with the combined matrix, from which we can see the estimated strength and direction of the covarying relationships between response types.

Fitting a Diversity-Interactions model

The next model to be fit is the simplest Diversity-Interactions (DI) model available in the package, the identity (“ID”) structure. \[ y_{kmt} = \sum^{S}_{i=1}{\beta_{ikt} p_{im}} + \epsilon_{kmt} \] The DI modelling framework assumes that the main driver behind changes in ecosystem functioning is the initial relative abundance/proportion of the species.To reflect this, the intercept from the structural model we fit previously is replaced by the initial proportion of the species in the study, each with their own $\beta$ coefficient. These fixed effects form a simplex space. The only code that we need to change for this model is the DImodel tag, from “STR” to “ID”.

modelID <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "ID", method = "ML")

We can now view this model, as before. In this case, we simply extract the t-table from the model summary.

summary(modelID)$tTable

#>                         Value Std.Error    t-value       p-value
#> funcY1:time1:p1_ID -0.7089886 0.4850517 -1.4616764  1.439876e-01
#> funcY2:time1:p1_ID  2.2706049 0.4850517  4.6811605  3.044755e-06
#> funcY3:time1:p1_ID  2.4398359 0.4850517  5.0300533  5.343174e-07
#> funcY1:time2:p1_ID  8.3721996 0.4850517 17.2604275  2.537220e-62
#> funcY2:time2:p1_ID  3.7953423 0.4850517  7.8246139  8.196811e-15
#> funcY3:time2:p1_ID  4.0229523 0.4850517  8.2938629  1.992354e-16
#> funcY1:time1:p2_ID  5.3748998 0.4642573 11.5774151  4.797017e-30
#> funcY2:time1:p2_ID  5.9337382 0.4642573 12.7811407  5.208852e-36
#> funcY3:time1:p2_ID  7.9975235 0.4642573 17.2264884  4.229086e-62
#> funcY1:time2:p2_ID 12.1055179 0.4642573 26.0750174 3.365045e-129
#> funcY2:time2:p2_ID  3.7172258 0.4642573  8.0068220  1.981376e-15
#> funcY3:time2:p2_ID 10.7688254 0.4642573 23.1958114 1.458556e-105
#> funcY1:time1:p3_ID  3.3833228 0.4816376  7.0246237  2.934612e-12
#> funcY2:time1:p3_ID  1.2053951 0.4816376  2.5027016  1.240448e-02
#> funcY3:time1:p3_ID  4.8562397 0.4816376 10.0827675  2.366669e-23
#> funcY1:time2:p3_ID 12.8552387 0.4816376 26.6906886 1.958774e-134
#> funcY2:time2:p3_ID -2.1517318 0.4816376 -4.4675331  8.356377e-06
#> funcY3:time2:p3_ID  7.7120367 0.4816376 16.0121158  2.241213e-54
#> funcY1:time1:p4_ID -0.1039377 0.5451144 -0.1906714  8.488025e-01
#> funcY2:time1:p4_ID  2.5098803 0.5451144  4.6043189  4.399390e-06
#> funcY3:time1:p4_ID  6.6262284 0.5451144 12.1556668  7.555809e-33
#> funcY1:time2:p4_ID  6.8749751 0.5451144 12.6119869  3.845165e-35
#> funcY2:time2:p4_ID  3.3475340 0.5451144  6.1409757  9.873634e-10
#> funcY3:time2:p4_ID  8.7581311 0.5451144 16.0665942  1.030638e-54

The models within the DI modelling framework can be easily compared as they are hierarchical in nature, i.e. they are nested within one another. We can compare nested models using a likelihood ratio test, through anova(), or information criteria, such as AIC() or BIC() (the correct versions of which, AICc() or BICc() should be used in the cased of models with a large number of terms).

Image Credit: Kirwan et al. 2009

We choose to compare these models using a likelihood ratio test, where the null hypothesis states that the likelihoods of the two models do not significantly differ from one another, i.e. the extra terms in our larger model are not worth the added complexity.

anova(modelSTR, modelID)

#>          Model df      AIC       BIC    logLik   Test  L.Ratio p-value
#> modelSTR     1  7 10020.22 10059.477 -5003.108                        
#> modelID      2 25  9670.78  9811.002 -4810.390 1 vs 2 385.4351  <.0001

As the p-value of the test is less than our chosen $\alpha$ value of 0.05, we reject the null hypothesis and proceed with the more complex model. We can confirm this selection by choosing the model with the lower AIC and BIC values, which are also printed by the anova() call.

Fitting and comparing interactions structures

We continue to increase the complexity of the models by fitting the next interaction structure in the nesting series, the average interaction model, “AV”, a reparameterisation of the evenness model, “E”. Again, the only code change required is the change of the DImodel tag. We then compare the two models.

modelAV <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "AV", method = "ML")

coef(modelAV)

#> funcY1:time1:p1_ID funcY2:time1:p1_ID funcY3:time1:p1_ID funcY1:time2:p1_ID 
#>         -1.4073525          0.3535514          0.9463605          0.1419124 
#> funcY2:time2:p1_ID funcY3:time2:p1_ID funcY1:time1:p2_ID funcY2:time1:p2_ID 
#>          2.6296879         -0.5521430          4.7699282          4.2730527 
#> funcY3:time1:p2_ID funcY1:time2:p2_ID funcY2:time2:p2_ID funcY3:time2:p2_ID 
#>          6.7037708          4.9758685          2.7074546          6.8055585 
#> funcY1:time1:p3_ID funcY2:time1:p3_ID funcY3:time1:p3_ID funcY1:time2:p3_ID 
#>          2.6653214         -0.7655644          3.3207690          4.3935222 
#> funcY2:time2:p3_ID funcY3:time2:p3_ID funcY1:time1:p4_ID funcY2:time1:p4_ID 
#>         -3.3501635          3.0082933         -0.8997879          0.3252207 
#> funcY3:time1:p4_ID funcY1:time2:p4_ID funcY2:time2:p4_ID funcY3:time2:p4_ID 
#>          4.9242752         -2.5041979          2.0191631          3.5443872 
#>    funcY1:time1:AV    funcY2:time1:AV    funcY3:time1:AV    funcY1:time2:AV 
#>          2.8917739          7.9381038          6.1841587         34.0798392 
#>    funcY2:time2:AV    funcY3:time2:AV 
#>          4.8267227         18.9444801

anova(modelID, modelAV)

#>         Model df      AIC      BIC    logLik   Test  L.Ratio p-value
#> modelID     1 25 9670.780 9811.002 -4810.390                        
#> modelAV     2 31 7921.064 8094.939 -3929.532 1 vs 2 1761.716  <.0001

Once again, as our p-value < 0.05, we select the more complex model and continue increasing complexity.

Estimating theta

Now that we are fitting interactions, we may elect to estimate the non-linear parameter $\theta$, see vignette onTheta for a more in depth look at this process. We include the parameter estimate_theta with argument TRUE to have DImulti() automatically fit and compare different values of $\theta$, or if a user has a priori information on the nature of the term, they may pass values through the parameter theta. We can then test the two models against one another using information criteria, to see if the theta values significantly differ from 1.

modelAV_theta <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "AV", method = "ML",
                    estimate_theta = TRUE)

thetaVals <- modelAV_theta$theta
thetaVals

#>        Y1        Y2        Y3 
#> 1.0207718 0.8649974 0.9748904


AICc(modelAV)

#> [1] 7924.065

AICc(modelAV_theta)

#> [1] 7920.168

In this case, the inclusion of theta does improve the model.

The next models in the series are not nested within one another, so either can be used next but they cannot be compared to one another using a likelihood ratio test. As this dataset was simulated without any functional groupings, i.e. any groupings would be arbitrary, we choose to fit the additive interaction structure, “ADD”.

modelADD <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "ADD", method = "ML",
                    theta = thetaVals)

modelADD$coefficients

#>  funcY1:time1:p1_ID  funcY2:time1:p1_ID  funcY3:time1:p1_ID  funcY1:time2:p1_ID 
#>         -1.76823717          0.50808195          0.85988047          0.34006301 
#>  funcY2:time2:p1_ID  funcY3:time2:p1_ID  funcY1:time1:p2_ID  funcY2:time1:p2_ID 
#>          3.15759460         -0.45728573          5.05341103          4.51869398 
#>  funcY3:time1:p2_ID  funcY1:time2:p2_ID  funcY2:time2:p2_ID  funcY3:time2:p2_ID 
#>          6.55729976          4.94274021          2.71009286          6.71132736 
#>  funcY1:time1:p3_ID  funcY2:time1:p3_ID  funcY3:time1:p3_ID  funcY1:time2:p3_ID 
#>          2.75607701         -0.52669269          3.20352384          4.02915274 
#>  funcY2:time2:p3_ID  funcY3:time2:p3_ID  funcY1:time1:p4_ID  funcY2:time1:p4_ID 
#>         -3.63692794          2.84982288         -0.94384681          0.09319263 
#>  funcY3:time1:p4_ID  funcY1:time2:p4_ID  funcY2:time2:p4_ID  funcY3:time2:p4_ID 
#>          5.39678568         -2.68037228          1.91080012          3.98197795 
#> funcY1:time1:p1_add funcY2:time1:p1_add funcY3:time1:p1_add funcY1:time2:p1_add 
#>          2.87439115          2.76386215          3.23757687         16.85055512 
#> funcY2:time2:p1_add funcY3:time2:p1_add funcY1:time1:p2_add funcY2:time1:p2_add 
#>          0.36675191          8.74665768          0.35963222          2.46298179 
#> funcY3:time1:p2_add funcY1:time2:p2_add funcY2:time2:p2_add funcY3:time2:p2_add 
#>          3.56352962         17.63222788          1.88168350          9.50590985 
#> funcY1:time1:p3_add funcY2:time1:p3_add funcY3:time1:p3_add funcY1:time2:p3_add 
#>          1.16196161          2.59855157          3.29199535         18.96823267 
#> funcY2:time2:p3_add funcY3:time2:p3_add funcY1:time1:p4_add funcY2:time1:p4_add 
#>          2.80801331          9.61394103          1.80523459          3.75439826 
#> funcY3:time1:p4_add funcY1:time2:p4_add funcY2:time2:p4_add funcY3:time2:p4_add 
#>          1.42785123         18.23557620          2.06430493          7.63035189

anova(modelAV_theta, modelADD)

#>               Model df      AIC      BIC    logLik   Test  L.Ratio p-value
#> modelAV_theta     1 31 7917.167 8091.042 -3927.583                        
#> modelADD          2 49 7944.076 8218.911 -3923.038 1 vs 2 9.090901  0.9576

In this case, the p-value is greater than our chosen alpha level of 0.05, therefore we fail to reject the null hypothesis that the added complexity of the model does not significantly increase the model likelihood, and proceed in our analysis with the simpler model, modelAV.

As we do not need to increase the complexity of the model through the interaction structures any further, we turn to other options. It is at this stage that one could introduce treatment effects or environmental variables. As the simMVRM dataset does not contain any such information, example code is included below, where model modelAV_treat1 includes an intercept for treat for each level of ecosystem function and time point, while modelAV_treat2 crosses treat with each other fixed effect term.

modelAV_treat1 <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "ADD", method = "ML",
                    theta = thetaVals, extra_fixed = ~treat)

modelAV_treat2 <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "ADD", method = "ML",
                    theta = thetaVals, extra_fixed = ~1:treat)

The model could also alternatively be simplified here, by adding ID grouping, i.e. introducing functional redundancy within species ID effects. As simMVRM is simulated with no ID grouping in mind, any groups selected would be arbitrary and thus not recommended, however, example code for doing so is included below. These groupings do not affect the interaction term(s).

modelAV_ID <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "AV", method = "ML",
                    theta = thetaVals, ID = c("Group1", "Group1", "Group2", "Group2"))

summary(modelAV_ID)$tTable

#>                            Value Std.Error     t-value       p-value
#> funcY1:time1:Group1  2.046125801 0.4028518  5.07910310  4.144693e-07
#> funcY2:time1:Group1  2.569675414 0.3865981  6.64689141  3.849583e-11
#> funcY3:time1:Group1  4.082185237 0.4000006 10.20544719  7.138875e-24
#> funcY1:time2:Group1  2.811864764 0.4028518  6.97989880  4.004113e-12
#> funcY2:time2:Group1  2.551416081 0.3865981  6.59966062  5.263265e-11
#> funcY3:time2:Group1  3.514637655 0.4000006  8.78658045  3.250903e-18
#> funcY1:time1:Group2  1.471220766 0.4496161  3.27217067  1.085437e-03
#> funcY2:time1:Group2 -0.005768112 0.4333579 -0.01331027  9.893816e-01
#> funcY3:time1:Group2  4.328268798 0.4472334  9.67787384  1.110630e-21
#> funcY1:time2:Group2  1.585203853 0.4496161  3.52568266  4.319095e-04
#> funcY2:time2:Group2 -1.082110264 0.4333579 -2.49703586  1.260356e-02
#> funcY3:time2:Group2  3.676329091 0.4472334  8.22015702  3.612712e-16
#> funcY1:time1:AV      1.888198405 1.3666115  1.38166435  1.672293e-01
#> funcY2:time1:AV      5.509553872 0.9401894  5.86004666  5.399424e-09
#> funcY3:time1:AV      5.250615197 1.2372345  4.24383172  2.297904e-05
#> funcY1:time2:AV     34.748700051 1.3666115 25.42690423 8.432489e-124
#> funcY2:time2:AV      4.108335901 0.9401894  4.36968957  1.308024e-05
#> funcY3:time2:AV     17.003897471 1.2372345 13.74347133  3.893153e-41

anova(modelAV_ID, modelAV)

#>            Model df      AIC      BIC    logLik   Test  L.Ratio p-value
#> modelAV_ID     1 19 8928.581 9035.150 -4445.291                        
#> modelAV        2 31 7921.064 8094.939 -3929.532 1 vs 2 1031.517  <.0001

As the p-value is lower than 0.05, we reject the null hypothesis and continue with the more complex model, without any ID groupings.

If any errors are encountered in the process of fitting these models, please see vignette commonErrors for a guide.

Final model

As we have concluded our model selection process, the final step before interpretation is to refit the chosen model design using the method “REML”, for unbiased estimates.

modelFinal <- DImulti(y = c("Y1", "Y2", "Y3"), eco_func = c("NA", "UN"), time = c("time", "CS"),
                    unit_IDs = 1, prop = 2:5, data = simMVRM, DImodel = "AV", method = "REML",
                    theta = thetaVals)

summary(modelFinal)

#> Generalized least squares fit by REML
#>   Model: value ~ 0 + func:time:((p1_ID + p2_ID + p3_ID + p4_ID + AV)) 
#>   Data: data 
#>        AIC      BIC    logLik
#>   7927.124 8100.534 -3932.562
#> 
#> Correlation Structure: General
#>  Formula: ~0 | plot 
#>  Parameter estimate(s):
#>  Correlation: 
#>   1      2      3      4      5     
#> 2  0.318                            
#> 3  0.611  0.194                     
#> 4  0.194  0.611  0.318              
#> 5 -0.311 -0.099 -0.364 -0.116       
#> 6 -0.099 -0.311 -0.116 -0.364  0.318
#> 
#> Coefficients:
#>                       Value Std.Error  t-value p-value
#> funcY1:time1:p1_ID -1.39779 0.4013020 -3.48313  0.0005
#> funcY2:time1:p1_ID  0.46946 0.3919317  1.19781  0.2311
#> funcY3:time1:p1_ID  0.95611 0.3994207  2.39375  0.0168
#> funcY1:time2:p1_ID  0.06264 0.4013020  0.15610  0.8760
#> funcY2:time2:p1_ID  2.67366 0.3919317  6.82175  0.0000
#> funcY3:time2:p1_ID -0.50033 0.3994207 -1.25264  0.2105
#> funcY1:time1:p2_ID  4.77678 0.3719836 12.84137  0.0000
#> funcY2:time1:p2_ID  4.39765 0.3625691 12.12913  0.0000
#> funcY3:time1:p2_ID  6.71595 0.3698618 18.15799  0.0000
#> funcY1:time2:p2_ID  4.88988 0.3719836 13.14541  0.0000
#> funcY2:time2:p2_ID  2.76061 0.3625691  7.61403  0.0000
#> funcY3:time2:p2_ID  6.86181 0.3698618 18.55237  0.0000
#> funcY1:time1:p3_ID  2.67448 0.4036265  6.62613  0.0000
#> funcY2:time1:p3_ID -0.63478 0.3928039 -1.61603  0.1062
#> funcY3:time1:p3_ID  3.33256 0.4013398  8.30359  0.0000
#> funcY1:time2:p3_ID  4.30388 0.4036265 10.66302  0.0000
#> funcY2:time2:p3_ID -3.29772 0.3928039 -8.39534  0.0000
#> funcY3:time2:p3_ID  3.06694 0.4013398  7.64176  0.0000
#> funcY1:time1:p4_ID -0.88693 0.4528202 -1.95867  0.0503
#> funcY2:time1:p4_ID  0.42381 0.4448121  0.95278  0.3408
#> funcY3:time1:p4_ID  4.93027 0.4515411 10.91876  0.0000
#> funcY1:time2:p4_ID -2.57087 0.4528202 -5.67746  0.0000
#> funcY2:time2:p4_ID  2.04841 0.4448121  4.60512  0.0000
#> funcY3:time2:p4_ID  3.58783 0.4515411  7.94575  0.0000
#> funcY1:time1:AV     2.96892 1.0247246  2.89728  0.0038
#> funcY2:time1:AV     5.73299 0.7052083  8.12950  0.0000
#> funcY3:time1:AV     5.85220 0.9251268  6.32583  0.0000
#> funcY1:time2:AV    35.81655 1.0247246 34.95236  0.0000
#> funcY2:time2:AV     3.57028 0.7052083  5.06273  0.0000
#> funcY3:time2:AV    17.84102 0.9251268 19.28494  0.0000
#> 
#> Theta values: Y1:1.0208, Y2:0.865, Y3:0.9749
#> 
#> 
#>  Correlation: 
#>                    fY1:1:1 fY2:1:1 fY3:1:1 fY1:2:1 fY2:2:1 fY3:2:1 fY1:1:2
#> funcY2:time1:p1_ID  0.608                                                 
#> funcY3:time1:p1_ID -0.309  -0.362                                         
#> funcY1:time2:p1_ID  0.318   0.194  -0.098                                 
#> funcY2:time2:p1_ID  0.194   0.318  -0.115   0.608                         
#> funcY3:time2:p1_ID -0.098  -0.115   0.318  -0.309  -0.362                 
#> funcY1:time1:p2_ID  0.220   0.121  -0.066   0.070   0.039  -0.021         
#> funcY2:time1:p2_ID  0.118   0.180  -0.069   0.038   0.057  -0.022   0.608 
#> funcY3:time1:p2_ID -0.065  -0.070   0.212  -0.021  -0.022   0.067  -0.309 
#> funcY1:time2:p2_ID  0.070   0.039  -0.021   0.220   0.121  -0.066   0.318 
#> funcY2:time2:p2_ID  0.038   0.057  -0.022   0.118   0.180  -0.069   0.194 
#> funcY3:time2:p2_ID -0.021  -0.022   0.067  -0.065  -0.070   0.212  -0.098 
#> funcY1:time1:p3_ID  0.217   0.118  -0.065   0.069   0.038  -0.021   0.190 
#> funcY2:time1:p3_ID  0.117   0.177  -0.068   0.037   0.056  -0.022   0.101 
#> funcY3:time1:p3_ID -0.064  -0.069   0.209  -0.021  -0.022   0.067  -0.056 
#> funcY1:time2:p3_ID  0.069   0.038  -0.021   0.217   0.118  -0.065   0.061 
#> funcY2:time2:p3_ID  0.037   0.056  -0.022   0.117   0.177  -0.068   0.032 
#> funcY3:time2:p3_ID -0.021  -0.022   0.067  -0.064  -0.069   0.209  -0.018 
#> funcY1:time1:p4_ID  0.267   0.149  -0.080   0.085   0.048  -0.026   0.168 
#> funcY2:time1:p4_ID  0.152   0.236  -0.089   0.048   0.075  -0.028   0.092 
#> funcY3:time1:p4_ID -0.081  -0.088   0.261  -0.026  -0.028   0.083  -0.050 
#> funcY1:time2:p4_ID  0.085   0.048  -0.026   0.267   0.149  -0.080   0.054 
#> funcY2:time2:p4_ID  0.048   0.075  -0.028   0.152   0.236  -0.089   0.029 
#> funcY3:time2:p4_ID -0.026  -0.028   0.083  -0.081  -0.088   0.261  -0.016 
#> funcY1:time1:AV    -0.592  -0.341   0.180  -0.189  -0.109   0.057  -0.555 
#> funcY2:time1:AV    -0.357  -0.565   0.211  -0.114  -0.180   0.067  -0.335 
#> funcY3:time1:AV     0.181   0.203  -0.587   0.058   0.065  -0.187   0.170 
#> funcY1:time2:AV    -0.189  -0.109   0.057  -0.592  -0.341   0.180  -0.177 
#> funcY2:time2:AV    -0.114  -0.180   0.067  -0.357  -0.565   0.211  -0.107 
#> funcY3:time2:AV     0.058   0.065  -0.187   0.181   0.203  -0.587   0.054 
#>                    fY2:1:2 fY3:1:2 fY1:2:2 fY2:2:2 fY3:2:2 fY1:1:3 fY2:1:3
#> funcY2:time1:p1_ID                                                        
#> funcY3:time1:p1_ID                                                        
#> funcY1:time2:p1_ID                                                        
#> funcY2:time2:p1_ID                                                        
#> funcY3:time2:p1_ID                                                        
#> funcY1:time1:p2_ID                                                        
#> funcY2:time1:p2_ID                                                        
#> funcY3:time1:p2_ID -0.363                                                 
#> funcY1:time2:p2_ID  0.194  -0.098                                         
#> funcY2:time2:p2_ID  0.318  -0.115   0.608                                 
#> funcY3:time2:p2_ID -0.115   0.318  -0.309  -0.363                         
#> funcY1:time1:p3_ID  0.099  -0.056   0.061   0.032  -0.018                 
#> funcY2:time1:p3_ID  0.147  -0.059   0.032   0.047  -0.019   0.608         
#> funcY3:time1:p3_ID -0.058   0.181  -0.018  -0.018   0.058  -0.309  -0.362 
#> funcY1:time2:p3_ID  0.032  -0.018   0.190   0.099  -0.056   0.318   0.194 
#> funcY2:time2:p3_ID  0.047  -0.019   0.101   0.147  -0.059   0.194   0.318 
#> funcY3:time2:p3_ID -0.018   0.058  -0.056  -0.058   0.181  -0.098  -0.115 
#> funcY1:time1:p4_ID  0.086  -0.049   0.054   0.027  -0.016   0.253   0.139 
#> funcY2:time1:p4_ID  0.130  -0.053   0.029   0.041  -0.017   0.143   0.219 
#> funcY3:time1:p4_ID -0.050   0.161  -0.016  -0.016   0.051  -0.076  -0.082 
#> funcY1:time2:p4_ID  0.027  -0.016   0.168   0.086  -0.049   0.081   0.044 
#> funcY2:time2:p4_ID  0.041  -0.017   0.092   0.130  -0.053   0.046   0.070 
#> funcY3:time2:p4_ID -0.016   0.051  -0.050  -0.050   0.161  -0.024  -0.026 
#> funcY1:time1:AV    -0.314   0.168  -0.177  -0.100   0.053  -0.606  -0.348 
#> funcY2:time1:AV    -0.521   0.197  -0.107  -0.166   0.063  -0.366  -0.576 
#> funcY3:time1:AV     0.187  -0.548   0.054   0.060  -0.174   0.185   0.207 
#> funcY1:time2:AV    -0.100   0.053  -0.555  -0.314   0.168  -0.193  -0.111 
#> funcY2:time2:AV    -0.166   0.063  -0.335  -0.521   0.197  -0.116  -0.183 
#> funcY3:time2:AV     0.060  -0.174   0.170   0.187  -0.548   0.059   0.066 
#>                    fY3:1:3 fY1:2:3 fY2:2:3 fY3:2:3 fY1:1:4 fY2:1:4 fY3:1:4
#> funcY2:time1:p1_ID                                                        
#> funcY3:time1:p1_ID                                                        
#> funcY1:time2:p1_ID                                                        
#> funcY2:time2:p1_ID                                                        
#> funcY3:time2:p1_ID                                                        
#> funcY1:time1:p2_ID                                                        
#> funcY2:time1:p2_ID                                                        
#> funcY3:time1:p2_ID                                                        
#> funcY1:time2:p2_ID                                                        
#> funcY2:time2:p2_ID                                                        
#> funcY3:time2:p2_ID                                                        
#> funcY1:time1:p3_ID                                                        
#> funcY2:time1:p3_ID                                                        
#> funcY3:time1:p3_ID                                                        
#> funcY1:time2:p3_ID -0.098                                                 
#> funcY2:time2:p3_ID -0.115   0.608                                         
#> funcY3:time2:p3_ID  0.318  -0.309  -0.362                                 
#> funcY1:time1:p4_ID -0.076   0.081   0.044  -0.024                         
#> funcY2:time1:p4_ID -0.084   0.046   0.070  -0.027   0.608                 
#> funcY3:time1:p4_ID  0.247  -0.024  -0.026   0.079  -0.309  -0.362         
#> funcY1:time2:p4_ID -0.024   0.253   0.139  -0.076   0.318   0.194  -0.098 
#> funcY2:time2:p4_ID -0.027   0.143   0.219  -0.084   0.194   0.318  -0.115 
#> funcY3:time2:p4_ID  0.079  -0.076  -0.082   0.247  -0.098  -0.115   0.318 
#> funcY1:time1:AV     0.184  -0.193  -0.111   0.058  -0.597  -0.348   0.182 
#> funcY2:time1:AV     0.216  -0.116  -0.183   0.069  -0.360  -0.577   0.213 
#> funcY3:time1:AV    -0.600   0.059   0.066  -0.191   0.183   0.207  -0.594 
#> funcY1:time2:AV     0.058  -0.606  -0.348   0.184  -0.190  -0.111   0.058 
#> funcY2:time2:AV     0.069  -0.366  -0.576   0.216  -0.115  -0.184   0.068 
#> funcY3:time2:AV    -0.191   0.185   0.207  -0.600   0.058   0.066  -0.189 
#>                    fY1:2:4 fY2:2:4 fY3:2:4 fY1:1:A fY2:1:A fY3:1:A fY1:2:A
#> funcY2:time1:p1_ID                                                        
#> funcY3:time1:p1_ID                                                        
#> funcY1:time2:p1_ID                                                        
#> funcY2:time2:p1_ID                                                        
#> funcY3:time2:p1_ID                                                        
#> funcY1:time1:p2_ID                                                        
#> funcY2:time1:p2_ID                                                        
#> funcY3:time1:p2_ID                                                        
#> funcY1:time2:p2_ID                                                        
#> funcY2:time2:p2_ID                                                        
#> funcY3:time2:p2_ID                                                        
#> funcY1:time1:p3_ID                                                        
#> funcY2:time1:p3_ID                                                        
#> funcY3:time1:p3_ID                                                        
#> funcY1:time2:p3_ID                                                        
#> funcY2:time2:p3_ID                                                        
#> funcY3:time2:p3_ID                                                        
#> funcY1:time1:p4_ID                                                        
#> funcY2:time1:p4_ID                                                        
#> funcY3:time1:p4_ID                                                        
#> funcY1:time2:p4_ID                                                        
#> funcY2:time2:p4_ID  0.608                                                 
#> funcY3:time2:p4_ID -0.309  -0.362                                         
#> funcY1:time1:AV    -0.190  -0.111   0.058                                 
#> funcY2:time1:AV    -0.115  -0.184   0.068   0.603                         
#> funcY3:time1:AV     0.058   0.066  -0.189  -0.306  -0.360                 
#> funcY1:time2:AV    -0.597  -0.348   0.182   0.318   0.192  -0.097         
#> funcY2:time2:AV    -0.360  -0.577   0.213   0.192   0.318  -0.114   0.603 
#> funcY3:time2:AV     0.183   0.207  -0.594  -0.097  -0.114   0.318  -0.306 
#>                    fY2:2:A
#> funcY2:time1:p1_ID        
#> funcY3:time1:p1_ID        
#> funcY1:time2:p1_ID        
#> funcY2:time2:p1_ID        
#> funcY3:time2:p1_ID        
#> funcY1:time1:p2_ID        
#> funcY2:time1:p2_ID        
#> funcY3:time1:p2_ID        
#> funcY1:time2:p2_ID        
#> funcY2:time2:p2_ID        
#> funcY3:time2:p2_ID        
#> funcY1:time1:p3_ID        
#> funcY2:time1:p3_ID        
#> funcY3:time1:p3_ID        
#> funcY1:time2:p3_ID        
#> funcY2:time2:p3_ID        
#> funcY3:time2:p3_ID        
#> funcY1:time1:p4_ID        
#> funcY2:time1:p4_ID        
#> funcY3:time1:p4_ID        
#> funcY1:time2:p4_ID        
#> funcY2:time2:p4_ID        
#> funcY3:time2:p4_ID        
#> funcY1:time1:AV           
#> funcY2:time1:AV           
#> funcY3:time1:AV           
#> funcY1:time2:AV           
#> funcY2:time2:AV           
#> funcY3:time2:AV    -0.360 
#> 
#> Standardized residuals:
#>         Min          Q1         Med          Q3         Max 
#> -4.15977769 -0.38670617 -0.03622075  0.34007790  6.72648550 
#> 
#> Residual standard error: 1.949179 
#> Degrees of freedom: 2016 total; 1986 residual

Interpretation

The final and most important step in the workflow of any analysis is the interpretation of the final model.

We first look at the fixed effect coefficients, their significance and direction. For example, we can see that the identity effect of species p1 has a significant, negative effect on ecosystem function Y1 at time point 1, but a significant positive effect on ecosystem function Y3 at the same time point. We can also see that the interaction effect AV has a significant positive effect on all ecosystem functions at both time points. As this interaction term is maxmimised in an ecosystem design where all four species are represented evenly, it may be beneficial to include species p1 despite its negative effect on Y1, the presence of other species, which have a positive effect on this ecosystem function could also ‘make up’ for this fault of species p1 (assuming that our goal is the maximisation of all ecosystem functions).

summary(modelFinal)$tTable

#>                         Value Std.Error    t-value       p-value
#> funcY1:time1:p1_ID -1.3977875 0.4013020 -3.4831313  5.063174e-04
#> funcY2:time1:p1_ID  0.4694609 0.3919317  1.1978129  2.311328e-01
#> funcY3:time1:p1_ID  0.9561123 0.3994207  2.3937472  1.676953e-02
#> funcY1:time2:p1_ID  0.0626416 0.4013020  0.1560959  8.759733e-01
#> funcY2:time2:p1_ID  2.6736613 0.3919317  6.8217522  1.189860e-11
#> funcY3:time2:p1_ID -0.5003309 0.3994207 -1.2526413  2.104838e-01
#> funcY1:time1:p2_ID  4.7767781 0.3719836 12.8413683  2.566641e-36
#> funcY2:time1:p2_ID  4.3976500 0.3625691 12.1291347  1.029695e-32
#> funcY3:time1:p2_ID  6.7159482 0.3698618 18.1579939  2.731249e-68
#> funcY1:time2:p2_ID  4.8898755 0.3719836 13.1454069  6.589848e-38
#> funcY2:time2:p2_ID  2.7606110 0.3625691  7.6140262  4.081412e-14
#> funcY3:time2:p2_ID  6.8618119 0.3698618 18.5523674  5.503933e-71
#> funcY1:time1:p3_ID  2.6744802 0.4036265  6.6261271  4.424727e-11
#> funcY2:time1:p3_ID -0.6347815 0.3928039 -1.6160268  1.062473e-01
#> funcY3:time1:p3_ID  3.3325594 0.4013398  8.3035856  1.844114e-16
#> funcY1:time2:p3_ID  4.3038775 0.4036265 10.6630211  7.474747e-26
#> funcY2:time2:p3_ID -3.2977208 0.3928039 -8.3953373  8.707329e-17
#> funcY3:time2:p3_ID  3.0669414 0.4013398  7.6417573  3.311885e-14
#> funcY1:time1:p4_ID -0.8869260 0.4528202 -1.9586717  5.029102e-02
#> funcY2:time1:p4_ID  0.4238078 0.4448121  0.9527795  3.408177e-01
#> funcY3:time1:p4_ID  4.9302708 0.4515411 10.9187639  5.394009e-27
#> funcY1:time2:p4_ID -2.5708690 0.4528202 -5.6774613  1.568347e-08
#> funcY2:time2:p4_ID  2.0484108 0.4448121  4.6051152  4.383572e-06
#> funcY3:time2:p4_ID  3.5878308 0.4515411  7.9457454  3.204545e-15
#> funcY1:time1:AV     2.9689184 1.0247246  2.8972841  3.805312e-03
#> funcY2:time1:AV     5.7329924 0.7052083  8.1295019  7.502542e-16
#> funcY3:time1:AV     5.8521952 0.9251268  6.3258300  3.103520e-10
#> funcY1:time2:AV    35.8165488 1.0247246 34.9523649 5.122576e-209
#> funcY2:time2:AV     3.5702802 0.7052083  5.0627312  4.514504e-07
#> funcY3:time2:AV    17.8410157 0.9251268 19.2849401  4.234834e-76

Next we could examine the variance covariance matrices of the model. These can be extracted from the model by either using the function getVarCov() from the R package nlme or by using the $ operator (the components of a model in R can be viewed using View(model)).

nlme::getVarCov(modelFinal)

modelFinal$vcov

#> $Multivariate
#> Marginal variance covariance matrix
#>         [,1]    [,2]    [,3]
#> [1,]  4.9138  2.3576 -1.3793
#> [2,]  2.3576  3.0326 -1.2701
#> [3,] -1.3793 -1.2701  4.0112
#>   Standard Deviations: 2.2167 1.7414 2.0028 
#> 
#> $`Repeated Measure`
#> Marginal variance covariance matrix
#>        [,1]   [,2]
#> [1,] 4.6292 1.4740
#> [2,] 1.4740 4.6292
#>   Standard Deviations: 2.1516 2.1516 
#> 
#> $Combined
#> Marginal variance covariance matrix
#>          Y1:1     Y1:2     Y2:1     Y2:2     Y3:1     Y3:2
#> Y1:1  3.79930  1.20980  2.32040  0.73885 -1.18040 -0.37586
#> Y1:2  1.20980  3.79930  0.73885  2.32040 -0.37586 -1.18040
#> Y2:1  2.32040  0.73885  3.79930  1.20980 -1.38350 -0.44053
#> Y2:2  0.73885  2.32040  1.20980  3.79930 -0.44053 -1.38350
#> Y3:1 -1.18040 -0.37586 -1.38350 -0.44053  3.79930  1.20980
#> Y3:2 -0.37586 -1.18040 -0.44053 -1.38350  1.20980  3.79930
#>   Standard Deviations: 1.9492 1.9492 1.9492 1.9492 1.9492 1.9492

An example of an interpretation that we can make from these variance covariance matrices, is that the ecosystem functions Y1 and Y3 are negatively correlated, therefore would likely result in trade-offs when trying to estimated one over the other.

As multivariate repeated measures DI models can contain a large number of fixed effects, they can be difficult to interpret from. Our suggestions to circumvent this issue is to first predict from the model for community designs of interest, using the predict() function, as seen below, see vignette prediction for further details on this.

comms <- simMVRM[c(1, 4, 7, 10, 21), ]
print(comms)

#>    plot        p1        p2        p3 p4         Y1          Y2       Y3 time
#> 1     1 1.0000000 0.0000000 0.0000000  0 -3.2201614 -0.28424570 4.035400    1
#> 4     4 0.0000000 1.0000000 0.0000000  0  5.3908779  4.08309086 6.533252    1
#> 7     7 0.0000000 0.0000000 1.0000000  0  4.3064636  0.05409463 3.529927    1
#> 10   10 0.0000000 0.0000000 0.0000000  1 -0.9378472  0.30317782 6.102071    1
#> 21   21 0.3333333 0.3333333 0.3333333  0  1.7765221  2.99030627 5.650826    1



commPred <- predict(modelFinal, newdata = comms)
commPred

#>    plot     Yvalue Ytype
#> 1     1 -1.3977875  Y1:1
#> 2     1  0.4694609  Y2:1
#> 3     1  0.9561123  Y3:1
#> 4     4  4.7767781  Y1:1
#> 5     4  4.3976500  Y2:1
#> 6     4  6.7159482  Y3:1
#> 7     7  2.6744802  Y1:1
#> 8     7 -0.6347815  Y2:1
#> 9     7  3.3325594  Y3:1
#> 10   10 -0.8869260  Y1:1
#> 11   10  0.4238078  Y2:1
#> 12   10  4.9302708  Y3:1
#> 13   21  2.9633108  Y1:1
#> 14   21  3.2365157  Y2:1
#> 15   21  5.5319074  Y3:1

Then, with these predictions, we can employ summarising metrics, such as a multifunctionality index, to summarise our findings while maintaining the benefits of having modelled each ecosystem function separately but simultaneously, alla Suter et al 2021. We could also use these predictions to generate a plot/graph to display our findings, such as a histogram, grouped bar chart, or ternary diagram.

ggplot2::ggplot(commPred, ggplot2::aes(fill=Ytype, y=Yvalue, x=plot)) +
  ggplot2::geom_bar(position="dodge", stat="identity")

A new R package, DImodelsVis, is in development, which will make plotting DI models fit using the R packages DImodels or DImodelsMulti a simple process.

References

Suter, M., Huguenin-Elie, O. and Lüscher, A., 2021. Multispecies for multifunctions: combining four complementary species enhances multifunctionality of sown grassland. Scientific Reports, 11(1), p.3835.

Kirwan, L., Connolly, J., Finn, J.A., Brophy, C., Lüscher, A., Nyfeler, D. and Sebastià, M.T., 2009. Diversity-interaction modeling: estimating contributions of species identities and interactions to ecosystem function. Ecology, 90(8), pp.2032-2038.