Quantification of size and overlap of n-dimensional hypervolumes in R: dynRB tutorial

Robert R. Junker1,*, Jonas Kuppler1,2, Arne C.Bathke3, Manuela L. Schreyer3, Wolfgang Trutschnig3
1Department of Bioscience, University of Salzburg, Salzburg, Austria; 2Institute of Evolutionary Ecology and Conservation Genomics, University of Ulm, Ulm, Germany; 3Department for Mathematics, University of Salzburg, Salzburg, Austria

2022-12-07

Abstract

This tutorial demonstrates the use of the R package dynamic range boxes dynRB to quantify the size and overlap of n-dimensional hypervolumes. It provides information on formatting data for the use in dynRB, introduces the functions implemented in the package, explains the outputs, gives examples on how to use the output for follow-up analyses, and finally shows how to visualize the results. In what follows we set the function parameter steps = 51 in order to reduce build time of the vignette, however for real world application the default setting is recommended.

Required packages

The quantification of size and overlap of n-dimensional hypervolumes requires the package dynRB (a detailed description of the method can be found in Junker et al. 2016). Additionally, the packages ggplot2, reshape2, vegan, and RColorBrewer are required for follow-up analyses demonstrated in this tutorial.

library(dynRB)
library(ggplot2)
library(reshape2)
library(vegan)
## Warning: Paket 'vegan' wurde unter R Version 4.1.3 erstellt
## Warning: Paket 'permute' wurde unter R Version 4.1.3 erstellt
library(RColorBrewer)

Formatting data for use in dynRB

The grouping variable (e.g. species) is required to be in the first column of the data table, followed by two or more columns containing continuous variables representing the different dimensions of the n-dimensional hypervolume. Thus, each row contains the measurements of one unit of interest (e.g. an individual of a species) in column 2 and the following columns as well as the grouping variable for this unit (first column). dynRB does not require complete measurements of all continuous variables in each row; missing values (i.e. NA) are omitted during the analysis. The package dynRB provides an example data set “finch”“, which serves as a reference. The”finch” data set includes morphological measurements of Darwin finches Geospiza sp., which originates from Snodgrass and Heller (1904) and was extracted from the R package hypervolume (Blonder et al. 2014). It comprises quantitative measurements of nine traits characterizing five (sub-) species of finches, each trait was measured at least in 10 individuals per species (see also Junker et al. 2016).

data(finch)
head(finch)[,1:5]
##               Species BodyL WingL TailL BeakW
## 1 Geospiza_heliobates   123  72.0  48.5   6.5
## 2 Geospiza_heliobates   126  70.0  48.5   7.0
## 3 Geospiza_heliobates   133  71.5  45.0   6.5
## 4 Geospiza_heliobates   127  69.0  39.0   6.5
## 5 Geospiza_heliobates   122  71.0  42.0   6.7
## 6 Geospiza_heliobates   128  72.0  43.0   6.5

The finch data set contains measurements of nine traits of 146 bird individuals belonging to five species. For a weighted analysis putting more weight on some individuals than on others, these individuals need to appear in two or more rows of the data table (compare to Kuppler et al. 2017; Junker & Larue-Kontic 2018). For instance, in Junker & Larue-Kontic (2018) we quantified the size and overlap of the trait spaces occupied by plant communities with plant traits as dimensions. In order to consider the abundances of the plant species within each of the communities, trait-data of each species was inserted in a number of rows proportional to the abundance of the species in the community. Let us assume species A has an abundance of 10 individuals and species B 20 individuals in the same community; in this case trait-data of species A would appear in 10 rows and those of species B in 20 rows.

Quantification of size and overlap of n-dimensional hypervolumes

The main function of the package dynRB “dynRB_VPa” quantifies the size and the overlap of the n-dimensional hypervolumes of two or more objects (e.g. the five finch species in the example data).

r <- dynRB_VPa(finch, steps = 51)

Dynamic range boxes first calculates the size (vol) and overlap (port) individually for each dimension and then aggregates the dimensions in order to quantify the size and overlap of the n-dimensional hypervolumes. Both size and overlap are bounded between 0 and 1, with values near 0 indicating a small size and overlap and near 1 a large size and overlap. Three different aggregation methods are provided:

res <- r$result
res[1:6, 1:5]
##                                   V1                           V2  port_prod
## 1             Geospiza_fortis_fortis       Geospiza_fortis_fortis 1.00000000
## 2       Geospiza_fortis_platyrhyncha       Geospiza_fortis_fortis 0.04410616
## 3        Geospiza_fuliginosa_parvula       Geospiza_fortis_fortis 0.00000000
## 4                Geospiza_heliobates       Geospiza_fortis_fortis 0.00000000
## 5 Geospiza_prosthemelas_prosthemelas       Geospiza_fortis_fortis 0.00000000
## 6             Geospiza_fortis_fortis Geospiza_fortis_platyrhyncha 0.08855999
##   port_mean port_gmean
## 1 1.0000000  1.0000000
## 2 0.6944053  0.6802714
## 3 0.1963080  0.0000000
## 4 0.4433622  0.0000000
## 5 0.1493107  0.0000000
## 6 0.7359027  0.7403640

Columns V1 and V2 denote the species pair considered. Columns “port_prod”, “port_mean”, and “port_gmean” contain the overlap of the hypervolumes of the species given in the first two columns (considering the afore-mentioned aggregation methods). Thus, each of the three columns show the overlap between V1 and V2 expressed as the proportion of the hypervolume of V2 that is covered by the hypervolume of V1 (port(V1, V2)). Since the overlap is expressed as proportion it is dependent on the size of the hypervolumes and thus, port(V1, V2) is not the same as port(V2, V1) since overlaps are, by construction, asymmetric.

res[1:6, c(1:2, 6:11)]
##                                   V1                           V2  vol_V1_prod
## 1             Geospiza_fortis_fortis       Geospiza_fortis_fortis 1.061290e-03
## 2       Geospiza_fortis_platyrhyncha       Geospiza_fortis_fortis 5.164404e-04
## 3        Geospiza_fuliginosa_parvula       Geospiza_fortis_fortis 5.203791e-05
## 4                Geospiza_heliobates       Geospiza_fortis_fortis 9.560884e-06
## 5 Geospiza_prosthemelas_prosthemelas       Geospiza_fortis_fortis 7.760346e-07
## 6             Geospiza_fortis_fortis Geospiza_fortis_platyrhyncha 1.061290e-03
##   vol_V1_mean vol_V1_gmean  vol_V2_prod vol_V2_mean vol_V2_gmean
## 1   0.8048758    0.7948846 0.0010612901   0.8048758    0.7948846
## 2   0.7778813    0.7654443 0.0010612901   0.8048758    0.7948846
## 3   0.6547723    0.6182060 0.0010612901   0.8048758    0.7948846
## 4   0.6267104    0.5652103 0.0010612901   0.8048758    0.7948846
## 5   0.5135105    0.4626482 0.0010612901   0.8048758    0.7948846
## 6   0.8048758    0.7948846 0.0005164404   0.7778813    0.7654443

Colums “vol_V1_prod”, “vol_V1_mean”, and “vol_V1_gmean” give the size vol(V1) of the hypervolume of species V1 expressed by the different aggregation methods. Colums “vol_V2_prod”, “vol_V2_mean”, and “vol_V2_gmean” contain the size vol(V2) of the hypervolume of species V2.

Quantification of overlaps per dimension

The function of the package dynRB “dynRB_Pn” quantifies the overlap per dimension of the n-dimensional hypervolumes of two or more objects (e.g. the five finches in the example data).

r <- dynRB_Pn(finch, steps = 51)
head(r$result)
##                                   V1                           V2      BodyL
## 1             Geospiza_fortis_fortis       Geospiza_fortis_fortis 1.00000000
## 2       Geospiza_fortis_platyrhyncha       Geospiza_fortis_fortis 0.15455434
## 3        Geospiza_fuliginosa_parvula       Geospiza_fortis_fortis 0.09187942
## 4                Geospiza_heliobates       Geospiza_fortis_fortis 0.70355015
## 5 Geospiza_prosthemelas_prosthemelas       Geospiza_fortis_fortis 0.01645361
## 6             Geospiza_fortis_fortis Geospiza_fortis_platyrhyncha 0.17340410
##        WingL     TailL     BeakW     BeakH     LBeakL      UBeakL     N.UBkL
## 1 1.00000000 1.0000000 1.0000000 1.0000000 1.00000000 1.000000000 1.00000000
## 2 0.18269389 0.3426384 0.3333398 0.3366382 0.42303077 0.465403836 0.36385641
## 3 0.01300229 0.3515615 0.0000000 0.0000000 0.00000000 0.000000000 0.00000000
## 4 0.45710488 0.4338076 0.0000000 0.0000000 0.07695579 0.004675909 0.07333676
## 5 0.00000000 0.1921101 0.0000000 0.0000000 0.00000000 0.000000000 0.00000000
## 6 0.23381515 0.5714352 0.1905192 0.2458909 0.35882253 0.482817034 0.30222130
##      TarsusL
## 1 1.00000000
## 2 0.17045307
## 3 0.01851471
## 4 0.25649571
## 5 0.12335054
## 6 0.52400000

Columns V1 and V2 denote the species pair considered. The following columns give the overlap port(V1, V2) for each of the dimensions contained in the data set.

Quantification of sizes per dimension

The function of the package dynRB “dynRB_Vn” quantifies the size per dimension of the n-dimensional hypervolume for each of the objects (e.g. the five finches in the example data).

r <- dynRB_Vn(finch, steps = 51)
head(r$result)
##                                   V1     BodyL     WingL     TailL     BeakW
## 1             Geospiza_fortis_fortis 0.4578298 0.4856886 0.6311010 0.2540241
## 2       Geospiza_fortis_platyrhyncha 0.4944163 0.2912925 0.4614273 0.5948394
## 3        Geospiza_fuliginosa_parvula 0.2981312 0.2738864 0.4156767 0.1243964
## 4                Geospiza_heliobates 0.3937992 0.2246252 0.4721869 0.1089350
## 5 Geospiza_prosthemelas_prosthemelas 0.2296473 0.1025116 0.3895951 0.1204894
##        BeakH     LBeakL     UBeakL     N.UBkL   TarsusL
## 1 0.35198512 0.38801750 0.28363582 0.37446006 0.3493918
## 2 0.45852890 0.34963071 0.40996770 0.34505192 0.2178684
## 3 0.09713766 0.12829627 0.24842846 0.16645777 0.2500555
## 4 0.19324772 0.10498779 0.03589303 0.21861312 0.3174346
## 5 0.08206582 0.07151052 0.10153781 0.03592394 0.1711066

Column V1 denotes the species considered. The following columns give the size vol(V1) for each of the dimensions contained in the data set.

Transformation of the output (to matrix format)

Many follow-up analyses may require another data format as the one provided by dynRB. Often matrices are required with the groups as row and column names and the pairwise overlap as entries in the cells.

r <- dynRB_VPa(finch, steps = 51)
# aggregation method product  
om1 <- reshape(r$result[,1:3], direction="wide", idvar="V1", timevar="V2") 
#aggregation method mean  
om2 <- reshape(r$result[,c(1,2,4)], direction="wide", idvar="V1", timevar="V2") 
#aggregation method gmean  
om3 <- reshape(r$result[,c(1,2,5)], direction="wide", idvar="V1", timevar="V2") 
om1
##                                   V1 port_prod.Geospiza_fortis_fortis
## 1             Geospiza_fortis_fortis                       1.00000000
## 2       Geospiza_fortis_platyrhyncha                       0.04410616
## 3        Geospiza_fuliginosa_parvula                       0.00000000
## 4                Geospiza_heliobates                       0.00000000
## 5 Geospiza_prosthemelas_prosthemelas                       0.00000000
##   port_prod.Geospiza_fortis_platyrhyncha port_prod.Geospiza_fuliginosa_parvula
## 1                             0.08855999                                     0
## 2                             1.00000000                                     0
## 3                             0.00000000                                     1
## 4                             0.00000000                                     0
## 5                             0.00000000                                     0
##   port_prod.Geospiza_heliobates port_prod.Geospiza_prosthemelas_prosthemelas
## 1                             0                                            0
## 2                             0                                            0
## 3                             0                                            0
## 4                             1                                            0
## 5                             0                                            1

Each element in this matrix is identified by its row number, which corresponds to one particular group, and its column number, which corresponds to another group. For example, om[1,3] refers to the entry in the first row and third column of the table. Its interpretation is as follows. It is the proportion of the hypervolume of the third column group (Geospiza fortis platyrhyncha) that is overlapped by the hypervolume of the first row group (Geospiza fortis fortis), i.e. port(Geospiza fortis platyrhyncha, Geospiza fortis fortis).

Evaluation of the asymmetry in overlaps

One example to analyze the output of dynamic range boxes is to evaluate the asymmetry of overlaps using a Mantel test. Therefore, the lower half of the matrix generated from the output of dynRB needs to be correlated with the upper half of the same matrix.

r <- dynRB_VPa(finch, steps = 51)
#aggregation method mean  
om <- reshape(r$result[,c(1,2,4)], direction="wide", idvar="V1", timevar="V2") 
mantel(as.dist(om[2:ncol(om)]), as.dist(t(om[2:ncol(om)])), permutations = 1000)
plot(as.dist(om[2:ncol(om)]), as.dist(t(om[2:ncol(om)]))) 

The result of the Mantel test shows that overlaps port(V1, V2) and port(V2, V1) are well correlated, i.e. overlaps are largely symmetric.

Visualization of overlaps using heatmaps

Pairwise overlaps can be visualized as a heatmap (see examples in Junker et al. 2016; Kuppler et al. 2017; Junker & Larue-Kontic 2018).

r <- dynRB_VPa(finch, steps = 51)  #main function of dynRB to calculate size and overlap of hypervolumes
result <- r$result
Overlap <- as.numeric(ifelse(result$V1 == result$V2, "NA", result$port_prod))  
## Warning: NAs durch Umwandlung erzeugt
# 'result$port_prod' may be changed to 'result$port_mean' or 'result$port_gmean'
is.numeric(Overlap)
Result2 <- cbind(result, Overlap)
ggplot(Result2, aes(x = V1, y = V2)) +
  geom_tile(data = subset(Result2, !is.na(Overlap)), aes(fill = Overlap), color="black") +
  geom_tile(data = subset(Result2,  is.na(Overlap)), fill = "lightgrey", color="black")

In this example the hypervolumes of only one species pair do overlap, the remaining species do not share space in the hypervolume (see also Fig. 4 in Junker et al. 2016).
The following code will result in a similar heatmap with customized settings:

r <- dynRB_VPa(finch, steps = 51)  
theme_change <- theme(
  plot.background = element_blank(),
  panel.grid.minor = element_blank(),
  panel.grid.major = element_blank(),
  panel.background = element_blank(),
  panel.border = element_blank(),
  axis.line = element_blank(),
  axis.ticks = element_blank(),
  axis.text.x = element_text(colour="black", size = rel(1.5), angle=35, hjust = 1),
  axis.text.y = element_text(colour="black", size = rel(1.5)),
  axis.title.x = element_blank(),
  axis.title.y = element_blank()
)
result <- r$result
Overlap <- as.numeric(ifelse(result$V1 == result$V2, "NA", result$port_prod))  
## Warning: NAs durch Umwandlung erzeugt
# 'result$port_prod' may be changed to 'result$port_mean' or 'result$port_gmean'
is.numeric(Overlap)
Result2<-cbind(result, Overlap)
breaks <- seq(min(Overlap, na.rm=TRUE),max(Overlap, na.rm=TRUE),  
              by=round(max(Overlap, na.rm=TRUE)/10, digits=3))
col1 <- colorRampPalette(c("white", "navyblue")) #define color gradient
ggplot(Result2, aes(x = V1, y = V2)) +
  geom_tile(data = subset(Result2, !is.na(Overlap)), aes(fill = Overlap), color="black") +
  geom_tile(data = subset(Result2,  is.na(Overlap)), fill = "lightgrey", color="black") +
  scale_fill_gradientn(colours=col1(8), breaks=breaks, guide="colorbar",  
                       limits=c(min(Overlap, na.rm=TRUE),max(Overlap, na.rm=TRUE))) +
  theme_change

References

Blonder, B., Lamanna, C., Violle, C. & Enquist, B.J. (2014) The n-dimensional hypervolume. Global Ecology and Biogeography, 23, 595-609.

Junker, R.R., Kuppler, J., Bathke, A., Schreyer, M.L. & Trutschnig, W. (2016) Dynamic range boxes - A robust non-parametric approach to quantify the size and overlap of niches and trait-spaces in n-dimensional hypervolumes Methods in Ecology and Evolution, 7, 1503-1513.

Junker, R.R. & Larue-Kontic, A. (2018) Elevation predicts the functional composition of alpine plant communities based on vegetative traits, but not based on floral traits. Alpine Botany.

Kuppler, J., MK, H., Trutschnig, W., Bathke, A., Eiben, J., Daehler, C. & Junker, R. (2017) Exotic flower visitors exploit large floral trait spaces resulting in asymmetric resource partitioning with native visitors. Functional Ecology, 31, 2244-2254.

Snodgrass, R. & Heller, E. (1904) Papers from the Hopkins-Stanford Galapagos Expedition, 1898-99. XVI. Birds. Proceedings of the Washington Academy of Sciences, 5, 231-372.