Type: | Package |
Title: | Dynamic Stochastic Block Models |
Version: | 0.8 |
Date: | 2025-05-15 |
Maintainer: | Vincent Miele <vincent.miele@cnrs.fr> |
Description: | Dynamic stochastic block model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time, developed in Matias and Miele (2016) <doi:10.1111/rssb.12200>. |
Imports: | Rcpp, parallel, RColorBrewer, grDevices, graphics, stats |
LinkingTo: | Rcpp |
License: | GPL-3 |
NeedsCompilation: | yes |
Packaged: | 2025-05-21 13:57:01 UTC; vmiele |
Author: | Catherine Matias [aut], Vincent Miele [aut, cre] |
Repository: | CRAN |
Date/Publication: | 2025-05-26 08:30:02 UTC |
Dynamic stochastic block model estimation
Description
Estimation of a model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time
Details
dynsbm is a R implementation of a model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time. It deals with binary or weighted dynamic/temporal/evolving networks (with discrete or continuous edges).
Author(s)
Authors: Catherine Matias, Vincent Miele
Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>
References
Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464
Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355
Heatmap plot of the reorganized adjacency matrices associated to a dynamic stochastic block model.
Description
Heatmap plot of the adjacency matrices with rows/columns reorganized according to the group membership associated to a dynamic stochastic block model.
Usage
adjacency.plot(dynsbm, Y, present=NULL, col=heat.colors(9))
Arguments
dynsbm |
An object of class |
Y |
An object of class |
present |
|
col |
A list of colors such as that generated by |
Details
The T adjacency matrices are represented. The row/lines are reordered following the group membership (nodes of group 1 followed by nodes of group 2 and so on). Red lines correspond to group delineation.
The reordering is independent for each time step. The adjacency matrices do not contain the row/columns corresponding to absent nodes.
If dynsbm
was estimated with edge.type=="binary"
, the
matrices cells are colored in white for value O or in the first color of the
col
argument vector for value 1. If dynsbm
was estimated
with edge.type=="discrete"
or edge.type=="continuous"
, the
matrices cells are colored with a colored gradient for value >0.
Value
No return value, called for plotting.
Author(s)
Authors: Catherine Matias, Vincent Miele
Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>
References
Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464
Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355
Examples
####################
## 1 - binary case
data(simdataT5Q4N40binary)
## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40binary,
Qmin=1, Qmax=5, edge.type="binary", nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
adjacency.plot(dynsbm, simdataT5Q4N40binary)
####################
## 2 - continuous case
data(simdataT5Q4N40continuous)
## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40continuous,
Qmin=1, Qmax=5, edge.type="continuous", nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
adjacency.plot(dynsbm, simdataT5Q4N40continuous)
####################
## 3 - discrete case
data(simdataT5Q4N40discrete)
## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40discrete,
Qmin=1, Qmax=5, edge.type="discrete", K=4, nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
adjacency.plot(dynsbm, simdataT5Q4N40discrete)
Dynamic contact network of ants
Description
Colonies of the ant Camponotus fellah were followed with a tracking system that monitored the individual positions over days of observations and dynamic social interactions were deduced from physical proximity. The data corresponds to a colony of N=152 ants observed during T=10 days. Edge weights were binned into K=3 categories corresponding to low, medium and high interaction intensity.
Usage
data(antsMersch)
Format
An array of size 10x152x152.
Source
http://datadryad.org/resource/doi:10.5061/dryad.8d8h7
References
Mersch, D. P., Crespi, A., Keller, L., Tracking individuals shows spatial fidelity is a key regulator of ant social organization, Science, 340(6136), 1090-1093 (2013) http://dx.doi.org/10.1126/science.1234316
Miele, V and Matias, C, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017)
Examples
data(antsMersch)
## better to use nstart>>1 starting points
## but estimation can be long;
## better to use nb.cores>1 cores
list.dynsbm <- select.dynsbm(antsMersch,
Qmin=1, Qmax=6, edge.type="discrete", K=3,
nstart=1, nb.cores=1)
## selection of Q=3 with the 'elbow' method
dynsbm <- list.dynsbm[[3]]
## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, antsMersch)
Plot the connectivity characteristics between groups associated to a dynamic stochastic block model.
Description
Plot the connectivity characteristics between groups associated to a dynamic stochastic block model.
Usage
connectivity.plot(dynsbm, Y)
Arguments
dynsbm |
An object of class |
Y |
An object of class |
Details
Interaction presence and intensity between nodes in any of the groups to the others are represented in a QxQ matrix.
The cell in line q/column l deals with the connectivity between groups q/l.
Each cell contains a curve with T time points on the x-axis corresponding to the T proportions of present edges over all the possible edges, where Q is the number of groups and T is the number of time points, and
If dynsbm
was estimated with edge.type=="binary"
, the area below the curve is filled in light blue.
If dynsbm
was estimated with edge.type=="discrete"
, the area below the curve is divided into K areas corresponding to the proportion of edges with value 1 to K (the darker blue, the greater edge intensity).
If dynsbm
was estimated with edge.type=="continuous"
, the area below the curve is filled with a colored gradient representing the mean edge intensity (the darker blue, the greater).
Value
No return value, called for plotting.
Author(s)
Authors: Catherine Matias, Vincent Miele
Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>
References
Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464
Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355
Examples
####################
## 1 - binary case
data(simdataT5Q4N40binary)
## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40binary,
Qmin=1, Qmax=5, edge.type="binary", nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40binary)
####################
## 2 - continuous case
data(simdataT5Q4N40continuous)
## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40continuous,
Qmin=1, Qmax=5, edge.type="continuous", nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40continuous)
####################
## 3 - discrete case
data(simdataT5Q4N40discrete)
## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40discrete,
Qmin=1, Qmax=5, edge.type="discrete", K=4, nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40discrete)
Broadstone Stream seasonal food webs
Description
This dataset concerns the aquatic macro-invertebrate community of Broadstone Stream in south-east England. Six seasonal connectance food webs were recorded, one every two months from May 1996 to April 1997. We restricted here to simple presence/absence information on species (nodes) and binary feeding links (edges). This dataset forms a dynamic trophic network with T=6 snapshots (May, August, October, December 1996, February, April 1997). Five species were not sampled each month. Each netowrk is directed.
Usage
data(foodwebWoodward)
Format
An array of size 6x26x26.
Source
Table 2 of Woordward et al
References
Woodward, G., Speirs, D. C., Hildrew, A. G., Quantification and Resolution of a Complex, Size-Structured Food Web. In Food Webs: From Connectivity to Energetics, Vol. 36 of Advances in Ecological Research, pp. 85-135 Academic Press (2005) http://dx.doi.org/10.1016/S0065-2504(05)36002-8
Miele, V and Matias, C, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017)
Examples
data(foodwebWoodward)
## mandatory to use many nstart>>1 starting points
## but estimation can be long;
## better to use nb.cores>1 cores
list.dynsbm <- select.dynsbm(Y=foodwebWoodward$Y,
present=foodwebWoodward$present,
Qmin=1, Qmax=6, edge.type="binary",
directed=TRUE, self.loop=TRUE,
nstart=200, nb.cores=1)
## selection of Q=4 with the ICL method
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, foodwebWoodward$Y)
Dynamic stochastic block model estimation for different number of groups.
Description
Estimation of dynamic stockastic block models for different number of groups. Each model combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time.
Usage
select.dynsbm(Y, present=NULL, Qmin, Qmax,
edge.type=c("binary","discrete","continuous"), K=-1,
directed=FALSE, self.loop=FALSE,
nb.cores=1,
iter.max=20, nstart=25, perturbation.rate=0.2,
fixed.param=FALSE, bipartition=NULL,
plot=TRUE)
Arguments
Y |
An object of class |
present |
|
Qmin |
Minimum number of groups >1. |
Qmax |
Maximum number of groups. |
edge.type |
Type of adjacency matrices. This should be (an unambiguous abbreviation of) one of |
K |
Only if |
directed |
If |
self.loop |
If |
nb.cores |
Number of cores to use, i.e. how many child processes and how many threads will be run simultaneously during the initialization and the estimation steps respectively. |
iter.max |
Maximal number of algorithm iterations. |
nstart |
Number of starting points for the iterative estimation algorithm. See the "Details" section. |
perturbation.rate |
Rate of perturbation (in [0,1], see |
fixed.param |
If |
bipartition |
|
plot |
Display a plot with the loglikelihood and the ICL criteria if |
Details
This function deals with binary or weighted dynamic/temporal/evolving networks (with discrete or continuous edges).
The adjacency matrices must be coded with 0/1 in the binary
case,
with 0/y where y belongs to the set {1,..,K} in the discrete
case
or with 0/y where y is numeric, must be positive and is supposed to fit a gaussian mixture in
the continuous
case.
Presence/absence information allows to model node's arrival or
departure, birth or death, or simply enables to specify missing data
(as absent nodes). If this information is missing (NULL), the presence/absence is deduced automatically from Y
by searching for nodes that do not participate in any edges (lines/columns of O in Y
) and declaring them as absent.
This function does not support the existence of nodes that are never present (error message in this case).
The estimation algorithm is iterative and rely on a starting point. Therefore, it is possible to start the algorithm many times with 'nstart' starting points.
The first starting point is obtained with an ad-hoc use of the kmeans
function.
The follwing starting point are obtained by perturbating the first one (see perturbation.rate
).
The greater nstart
, the more accurate the results.
To select the best number of groups, the "elbow" method consists in finding the point where the slope of the loglikelihood is decreasing (i.e. the loglikelihood is reaching a plateau).
If edge.type=="binary"
, the ICL criteria (plotted in red) has
to be used : the best number of nodes is supposed to maximize the ICL
criteria.
This function has been extended to the case of bipartite networks. In
this case, despite Y
has to be of dimension (T x N x N), it is
possible to give a bipartition of the nodes into two disjoint
sets. For statistical reasons, fixed.param
is automatically set
to TRUE
. Given the total number of groups Q between Qmin and
Qmax, there is Q/2 groups for each set of nodes (when Q is odd,
there is floor(Q/2)+1 groups for the largest set of nodes);
however, there is no guaranty that the final groups are
coherent with the bipartition, i.e. that any group is composed by
nodes of one of the two sets (if not, a warning message is generated).
Value
Returns a list of dynsbm
objects. Each object of class dynsbm
is a list
with the following components:
transition |
The Markov chain transition |
membership |
An object of class |
beta |
An object of class |
gamma |
Only if |
mu/sigma |
Only |
loglikelihood |
Completed data log-likelihood. |
iter |
Number of used algorithm iterations. |
directed |
Specifies whether the model is build for directed networks. |
self.loop |
Specifies whether the model allows self-loops. |
Author(s)
Authors: Catherine Matias, Vincent Miele
Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>
References
Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464
Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355
Examples
data(simdataT5Q4N40binary)
## estimation for Q=1..6 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40binary,
Qmin=1, Qmax=6, edge.type="binary", nstart=1)
## selection of Q=4
dynsbm <- list.dynsbm[[4]]
## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40binary)
Dataset generated by a dynamic stochastic block model - binary case
Description
5 snapshots of a network with 40 nodes with binary weights, generated by a dynamic stochastic block model with 4 groups.
Usage
data(simdataT5Q4N40binary)
Format
An array of size 5x40x40.
Dataset generated by a dynamic stochastic block model - continuous case
Description
5 snapshots of a network with 40 nodes with continuous weights, generated by a dynamic stochastic block model with 4 groups.
Usage
data(simdataT5Q4N40continuous)
Format
An array of size 5x40x40.
Dataset generated by a dynamic stochastic block model - discrete case
Description
5 snapshots of a network with 40 nodes with discrete weights (K=3 categories), generated by a dynamic stochastic block model with 4 groups.
Usage
data(simdataT5Q4N40discrete)
Format
An array of size 5x40x40.
High school dynamic contact networks in the PC class in 2011 - Sociopatterns collaboration
Description
The dataset consists in face-to-face encounters of high school students of a class from a French high school. In this class called 'PC', interactions were recorded during 4 days (Tuesday to Friday) in Dec. 2011. We kept only the 27 (out of 31) students that appear every day. Interaction times were aggregated by days to form a sequence of 4 different networks. These are undirected and weighted networks, the weight of an interaction between two individuals being the number of interactions between these 2 individuals divided by the number of time points for which at least two individuals interacted; we discretize these data into 3 bins.
We sincerely acknowledge the SocioPatterns collaboration for providing the dataset (http://www.sociopatterns.org)
Usage
data(sociopatternsPC2011)
Format
An array of size 4x27x27.
Source
http://www.sociopatterns.org/datasets/high-school-dynamic-contact-networks/
References
Julie Fournet and Alain Barrat, Contact patterns among high school students, PLoS ONE 9(9):e107878 (2014).
Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2016) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464
http://www.sociopatterns.org/