Type: Package
Title: Dynamic Stochastic Block Models
Version: 0.8
Date: 2025-05-15
Maintainer: Vincent Miele <vincent.miele@cnrs.fr>
Description: Dynamic stochastic block model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time, developed in Matias and Miele (2016) <doi:10.1111/rssb.12200>.
Imports: Rcpp, parallel, RColorBrewer, grDevices, graphics, stats
LinkingTo: Rcpp
License: GPL-3
NeedsCompilation: yes
Packaged: 2025-05-21 13:57:01 UTC; vmiele
Author: Catherine Matias [aut], Vincent Miele [aut, cre]
Repository: CRAN
Date/Publication: 2025-05-26 08:30:02 UTC

Dynamic stochastic block model estimation

Description

Estimation of a model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time

Details

dynsbm is a R implementation of a model that combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time. It deals with binary or weighted dynamic/temporal/evolving networks (with discrete or continuous edges).

Author(s)

Authors: Catherine Matias, Vincent Miele

Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>

References

Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464

Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355


Heatmap plot of the reorganized adjacency matrices associated to a dynamic stochastic block model.

Description

Heatmap plot of the adjacency matrices with rows/columns reorganized according to the group membership associated to a dynamic stochastic block model.

Usage

adjacency.plot(dynsbm, Y, present=NULL, col=heat.colors(9))

Arguments

dynsbm

An object of class dynsbm retrieved with the function select.dynsbm.

Y

An object of class array of dimension (T x N x N) containing T adjacency matrices of size (N x N), where N is the number of nodes in the network and T is the number of time points.

present

NULL or an object of class matrix of size (N x T) containing the presence/absence (coded with 1/0 respectively) of each N nodes at each of the T time points. When set to NULL, this object is deduced from Y.

col

A list of colors such as that generated by rainbow, heat.colors, topo.colors, terrain.colors or similar functions.

Details

The T adjacency matrices are represented. The row/lines are reordered following the group membership (nodes of group 1 followed by nodes of group 2 and so on). Red lines correspond to group delineation.

The reordering is independent for each time step. The adjacency matrices do not contain the row/columns corresponding to absent nodes.

If dynsbm was estimated with edge.type=="binary", the matrices cells are colored in white for value O or in the first color of the col argument vector for value 1. If dynsbm was estimated with edge.type=="discrete" or edge.type=="continuous", the matrices cells are colored with a colored gradient for value >0.

Value

No return value, called for plotting.

Author(s)

Authors: Catherine Matias, Vincent Miele

Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>

References

Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464

Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355

Examples

####################
## 1 - binary case
data(simdataT5Q4N40binary)

## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40binary, 
				Qmin=1, Qmax=5, edge.type="binary", nstart=1)

## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
adjacency.plot(dynsbm, simdataT5Q4N40binary)

####################
## 2 - continuous case
data(simdataT5Q4N40continuous)

## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40continuous, 
				Qmin=1, Qmax=5, edge.type="continuous", nstart=1)
						
## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
adjacency.plot(dynsbm, simdataT5Q4N40continuous)

####################
## 3 - discrete case
data(simdataT5Q4N40discrete)

## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40discrete, 
				Qmin=1, Qmax=5, edge.type="discrete", K=4, nstart=1)
									
## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
adjacency.plot(dynsbm, simdataT5Q4N40discrete)

Dynamic contact network of ants

Description

Colonies of the ant Camponotus fellah were followed with a tracking system that monitored the individual positions over days of observations and dynamic social interactions were deduced from physical proximity. The data corresponds to a colony of N=152 ants observed during T=10 days. Edge weights were binned into K=3 categories corresponding to low, medium and high interaction intensity.

Usage

data(antsMersch)

Format

An array of size 10x152x152.

Source

http://datadryad.org/resource/doi:10.5061/dryad.8d8h7

References

Mersch, D. P., Crespi, A., Keller, L., Tracking individuals shows spatial fidelity is a key regulator of ant social organization, Science, 340(6136), 1090-1093 (2013) http://dx.doi.org/10.1126/science.1234316

Miele, V and Matias, C, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017)

Examples


data(antsMersch)

## better to use nstart>>1 starting points
## but estimation can be long;
## better to use nb.cores>1 cores
list.dynsbm <- select.dynsbm(antsMersch, 
				Qmin=1, Qmax=6, edge.type="discrete", K=3, 
				nstart=1, nb.cores=1) 
				

## selection of Q=3 with the 'elbow' method
dynsbm <- list.dynsbm[[3]]

## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, antsMersch)


Plot the connectivity characteristics between groups associated to a dynamic stochastic block model.

Description

Plot the connectivity characteristics between groups associated to a dynamic stochastic block model.

Usage

connectivity.plot(dynsbm, Y)

Arguments

dynsbm

An object of class dynsbm retrieved with the function select.dynsbm.

Y

An object of class array of dimension (T x N x N) containing T adjacency matrices of size (N x N), where N is the number of nodes in the network and T is the number of time points.

Details

Interaction presence and intensity between nodes in any of the groups to the others are represented in a QxQ matrix. The cell in line q/column l deals with the connectivity between groups q/l. Each cell contains a curve with T time points on the x-axis corresponding to the T proportions of present edges over all the possible edges, where Q is the number of groups and T is the number of time points, and If dynsbm was estimated with edge.type=="binary", the area below the curve is filled in light blue. If dynsbm was estimated with edge.type=="discrete", the area below the curve is divided into K areas corresponding to the proportion of edges with value 1 to K (the darker blue, the greater edge intensity). If dynsbm was estimated with edge.type=="continuous", the area below the curve is filled with a colored gradient representing the mean edge intensity (the darker blue, the greater).

Value

No return value, called for plotting.

Author(s)

Authors: Catherine Matias, Vincent Miele

Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>

References

Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464

Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355

Examples

####################
## 1 - binary case
data(simdataT5Q4N40binary)

## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40binary, 
				Qmin=1, Qmax=5, edge.type="binary", nstart=1)
				
## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40binary)

####################
## 2 - continuous case
data(simdataT5Q4N40continuous)

## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40continuous, 
				Qmin=1, Qmax=5, edge.type="continuous", nstart=1)
						
## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40continuous)

####################
## 3 - discrete case
data(simdataT5Q4N40discrete)

## estimation for Q=1..5 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40discrete, 
				Qmin=1, Qmax=5, edge.type="discrete", K=4, nstart=1)
									
## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40discrete)

Broadstone Stream seasonal food webs

Description

This dataset concerns the aquatic macro-invertebrate community of Broadstone Stream in south-east England. Six seasonal connectance food webs were recorded, one every two months from May 1996 to April 1997. We restricted here to simple presence/absence information on species (nodes) and binary feeding links (edges). This dataset forms a dynamic trophic network with T=6 snapshots (May, August, October, December 1996, February, April 1997). Five species were not sampled each month. Each netowrk is directed.

Usage

data(foodwebWoodward)

Format

An array of size 6x26x26.

Source

Table 2 of Woordward et al

References

Woodward, G., Speirs, D. C., Hildrew, A. G., Quantification and Resolution of a Complex, Size-Structured Food Web. In Food Webs: From Connectivity to Energetics, Vol. 36 of Advances in Ecological Research, pp. 85-135 Academic Press (2005) http://dx.doi.org/10.1016/S0065-2504(05)36002-8

Miele, V and Matias, C, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017)

Examples


data(foodwebWoodward)

## mandatory to use many nstart>>1 starting points
## but estimation can be long;
## better to use nb.cores>1 cores
list.dynsbm <- select.dynsbm(Y=foodwebWoodward$Y, 
				present=foodwebWoodward$present, 
				Qmin=1, Qmax=6, edge.type="binary",
				directed=TRUE, self.loop=TRUE, 
				nstart=200, nb.cores=1) 
				

## selection of Q=4 with the ICL method
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, foodwebWoodward$Y)


Dynamic stochastic block model estimation for different number of groups.

Description

Estimation of dynamic stockastic block models for different number of groups. Each model combines a stochastic block model (SBM) for its static part with independent Markov chains for the evolution of the nodes groups through time.

Usage

select.dynsbm(Y, present=NULL, Qmin, Qmax,
			edge.type=c("binary","discrete","continuous"), K=-1,
			directed=FALSE, self.loop=FALSE,
			nb.cores=1,
			iter.max=20, nstart=25, perturbation.rate=0.2,
                        fixed.param=FALSE, bipartition=NULL,
			plot=TRUE)

Arguments

Y

An object of class array of dimension (T x N x N) containing T adjacency matrices of size (N x N), where N is the number of nodes in the network and T is the number of time points.

present

NULL or an object of class matrix of size (N x T) containing the presence/absence (coded with 1/0 respectively) of each N nodes at each of the T time points. When set to NULL, this object is deduced from Y (see the "Details" section). Any node must be present at least once among the time points.

Qmin

Minimum number of groups >1.

Qmax

Maximum number of groups.

edge.type

Type of adjacency matrices. This should be (an unambiguous abbreviation of) one of binary, discrete or continuous. See the "Details" section.

K

Only if edge.type=="discrete". Number of non-zero discrete values (i.e. in {1,..,K}).

directed

If TRUE, the network is supposed to be directed (and therefore Y is supposed to be asymmetric).

self.loop

If TRUE, self-loops (edges from one node to the same node) are allowed and taken into acount in the estimation procedure.

nb.cores

Number of cores to use, i.e. how many child processes and how many threads will be run simultaneously during the initialization and the estimation steps respectively.

iter.max

Maximal number of algorithm iterations.

nstart

Number of starting points for the iterative estimation algorithm. See the "Details" section.

perturbation.rate

Rate of perturbation (in [0,1], see nstart) for the iterative estimation algorithm. This rate is the fraction of nodes for which its group is randomly shuffled.

fixed.param

If TRUE, the model parameters remain fixed and constant in time. By default, fixed.param is automatically set to TRUE in the bipartite case (i.e. bipartition is not NULL; see the "Details" section).

bipartition

NULL or a vector of size N specifying a node bipartition in the case of bipartite networks (see the "Details" section). Each element of this vector is set to 1 or 2 to specify the node belongs to the first or second set of nodes.

plot

Display a plot with the loglikelihood and the ICL criteria if edge.type=="binary"/"continuous". See the "Details" section.

Details

This function deals with binary or weighted dynamic/temporal/evolving networks (with discrete or continuous edges). The adjacency matrices must be coded with 0/1 in the binary case, with 0/y where y belongs to the set {1,..,K} in the discrete case or with 0/y where y is numeric, must be positive and is supposed to fit a gaussian mixture in the continuous case.

Presence/absence information allows to model node's arrival or departure, birth or death, or simply enables to specify missing data (as absent nodes). If this information is missing (NULL), the presence/absence is deduced automatically from Y by searching for nodes that do not participate in any edges (lines/columns of O in Y) and declaring them as absent. This function does not support the existence of nodes that are never present (error message in this case).

The estimation algorithm is iterative and rely on a starting point. Therefore, it is possible to start the algorithm many times with 'nstart' starting points. The first starting point is obtained with an ad-hoc use of the kmeans function. The follwing starting point are obtained by perturbating the first one (see perturbation.rate). The greater nstart, the more accurate the results.

To select the best number of groups, the "elbow" method consists in finding the point where the slope of the loglikelihood is decreasing (i.e. the loglikelihood is reaching a plateau). If edge.type=="binary", the ICL criteria (plotted in red) has to be used : the best number of nodes is supposed to maximize the ICL criteria.

This function has been extended to the case of bipartite networks. In this case, despite Y has to be of dimension (T x N x N), it is possible to give a bipartition of the nodes into two disjoint sets. For statistical reasons, fixed.param is automatically set to TRUE. Given the total number of groups Q between Qmin and Qmax, there is Q/2 groups for each set of nodes (when Q is odd, there is floor(Q/2)+1 groups for the largest set of nodes); however, there is no guaranty that the final groups are coherent with the bipartition, i.e. that any group is composed by nodes of one of the two sets (if not, a warning message is generated).

Value

Returns a list of dynsbm objects. Each object of class dynsbm is a list with the following components:

transition

The Markov chain transition matrix of size (Q x Q).

membership

An object of class matrix of size (N x T) containing the group membership estimated by MAP (>0, =0 for absent nodes).

beta

An object of class matrix of size (T x Q x Q) containing the sparsity parameters of the model.

gamma

Only if edge.type=="discrete". An object of class matrix of size (T x Q x Q x K) containing the model parameters.

mu/sigma

Only edge.type=="continuous"An object of class matrix of size (T x Q x Q) and a vector containing the model parameters.

loglikelihood

Completed data log-likelihood.

iter

Number of used algorithm iterations.

directed

Specifies whether the model is build for directed networks.

self.loop

Specifies whether the model allows self-loops.

Author(s)

Authors: Catherine Matias, Vincent Miele

Maintainer: Vincent Miele <vincent.miele@univ-lyon1.fr>

References

Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2017) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464

Vincent Miele and Catherine Matias, Revealing the hidden structure of dynamic ecological networks, Royal Society Open Science (2017) http://dx.doi.org/10.1098/rsos.170251 https://arxiv.org/abs/1701.01355

Examples

data(simdataT5Q4N40binary)

## estimation for Q=1..6 groups
## better to use nstart>1 starting points
## but estimation can take 1-2 minutes
list.dynsbm <- select.dynsbm(simdataT5Q4N40binary, 
				Qmin=1, Qmax=6, edge.type="binary", nstart=1)
				
## selection of Q=4
dynsbm <- list.dynsbm[[4]]

## plotting intra/inter connectivity patterns
connectivity.plot(dynsbm, simdataT5Q4N40binary)

Dataset generated by a dynamic stochastic block model - binary case

Description

5 snapshots of a network with 40 nodes with binary weights, generated by a dynamic stochastic block model with 4 groups.

Usage

data(simdataT5Q4N40binary)

Format

An array of size 5x40x40.


Dataset generated by a dynamic stochastic block model - continuous case

Description

5 snapshots of a network with 40 nodes with continuous weights, generated by a dynamic stochastic block model with 4 groups.

Usage

data(simdataT5Q4N40continuous)

Format

An array of size 5x40x40.


Dataset generated by a dynamic stochastic block model - discrete case

Description

5 snapshots of a network with 40 nodes with discrete weights (K=3 categories), generated by a dynamic stochastic block model with 4 groups.

Usage

data(simdataT5Q4N40discrete)

Format

An array of size 5x40x40.


High school dynamic contact networks in the PC class in 2011 - Sociopatterns collaboration

Description

The dataset consists in face-to-face encounters of high school students of a class from a French high school. In this class called 'PC', interactions were recorded during 4 days (Tuesday to Friday) in Dec. 2011. We kept only the 27 (out of 31) students that appear every day. Interaction times were aggregated by days to form a sequence of 4 different networks. These are undirected and weighted networks, the weight of an interaction between two individuals being the number of interactions between these 2 individuals divided by the number of time points for which at least two individuals interacted; we discretize these data into 3 bins.

We sincerely acknowledge the SocioPatterns collaboration for providing the dataset (http://www.sociopatterns.org)

Usage

data(sociopatternsPC2011)

Format

An array of size 4x27x27.

Source

http://www.sociopatterns.org/datasets/high-school-dynamic-contact-networks/

References

Julie Fournet and Alain Barrat, Contact patterns among high school students, PLoS ONE 9(9):e107878 (2014).

Catherine Matias and Vincent Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (2016) http://dx.doi.org/10.1111/rssb.12200 http://arxiv.org/abs/1506.07464

http://www.sociopatterns.org/