--- title: "An Introduction to Estimating Joint Probability Models with `iglm`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{undirected iglm} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} options(rmarkdown.html_vignette.check_title = FALSE) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", out.width = "100%" ) library(iglm) ``` ## Overview This vignette provides an introduction to the `iglm` package, which is designed for estimating joint probability models that incorporate network structures. The package allows users to analyze how individual attributes and network connections jointly influence outcomes of interest. ## Basic Usage To use the `iglm` package, you first need to load it into your R session ```{r} library(iglm) ``` Next, you can create a `iglm` object by specifying the network structure and the attributes of interest. Here is a simple example: ```{r} n_actors =100 attribute_info = rnorm(n_actors) attribute_cov = diag(attribute_info) edge_cov = outer(attribute_info, attribute_info, FUN = function(x,y){abs(x-y)}) set.seed(123) alpha = 0.3 block <- matrix(nrow = 50, ncol = 50, data = 1) neighborhood <- as.matrix(Matrix::bdiag(replicate(n_actors/50, block, simplify=FALSE))) overlapping_degree = 0.5 neighborhood = matrix(nrow = n_actors, ncol = n_actors, data = 0) block <- matrix(nrow = 5, ncol = 5, data = 0) size_neighborhood <- 5 size_overlap <- ceiling(size_neighborhood*overlapping_degree) end <- floor((n_actors-size_neighborhood)/size_overlap) for(i in 0:end){ neighborhood[(1+size_overlap*i):(size_neighborhood+size_overlap*i), (1+size_overlap*i):(size_neighborhood+size_overlap*i)] = 1 } neighborhood[(n_actors-size_neighborhood+1):(n_actors), (n_actors-size_neighborhood+1):(n_actors)] = 1 type_x <- "binomial" type_y <- "binomial" formula_beg = as.formula("xyz_obj ~ 1 ") formula_model = as.formula("xyz_object ~ 1 ") object = iglm.data(neighborhood = neighborhood, directed = F, type_x = type_x, type_y = type_y) ``` ## Model Specification You can specify a model formula that includes various network statistics and attribute effects. For example: ```{r} formula <- object ~ edges + attribute_y + attribute_x + popularity ``` To fully define the model, you need to set up a sampler for the MCMC estimation and set all necessary parameters: ```{r} # Parameters of edges(mode = "local"), attribute_y, and attribute_x gt_coef = c(3,-1,-1) # Parameters for popularity effect gt_coef_pop = c(rnorm(n = n_actors, -2, 1)) # Define the sampler sampler_tmp = sampler.iglm(n_burn_in = 100, n_simulation = 10, sampler.x = sampler.net_attr(n_proposals = n_actors*10,seed = 13), sampler.y = sampler.net_attr(n_proposals = n_actors*10, seed = 32), sampler.z = sampler.net_attr(n_proposals = sum(neighborhood>0)*10, seed = 134), init_empty = F) model_tmp_new <- iglm(formula = formula, coef = gt_coef, coef_popularity = gt_coef_pop, sampler = sampler_tmp, control = control.iglm(accelerated = F,max_it = 200, display_progress = F, var = T)) ``` ## Model Simulation Once you have specified a model, you can simulate new data based on the fitted parameters: ```{r} # Simulate new networks model_tmp_new$simulate() # Get the samples tmp <- model_tmp_new$get_samples() ``` ## Model Estimation You can estimate the model parameters using the `estimate` method: ```{r} # First set the first simulated network as the target for estimation model_tmp_new$set_target(tmp[[1]]) model_tmp_new$estimate() model_tmp_new$iglm.data$degree_distribution(plot = TRUE) ``` ## Model Assessment After estimation, you can assess the model fit using various diagnostics: ```{r} model_tmp_new$model_assessment(formula = ~ degree_distribution + geodesic_distances_distribution + edgewise_shared_partner_distribution + mcmc_diagnostics) model_tmp_new$results$plot(model_assessment = T) ```