Mixture Models and Networks -- Overview of Stochastic Blockmodelling

Mixture models are probabilistic models aimed at uncovering and representing latent subgroups within a population. In network data analysis, latent subgroups of nodes are typically identified by their connectivity behaviour: nodes that behave similarly are assigned to the same community. In this context, mixture modelling is pursued through stochastic blockmodelling. We consider stochastic blockmodels and some of their variants and extensions from a mixture modelling perspective. We also survey the main classes of estimation methods available and propose an alternative approach. In addition to discussing inferential properties and estimation procedures, we focus on the application of the models to several real-world network datasets, showcasing the advantages and pitfalls of the different approaches.
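
To make the mixture view concrete, the following is a minimal sketch of the generative process behind a basic Bernoulli stochastic blockmodel, which is assumed here purely for illustration and is not necessarily the exact specification used in the paper. Each node first draws a latent block label from the mixing proportions; given the labels, each edge is an independent Bernoulli draw whose probability depends only on the two blocks involved. The function name `sample_sbm` and its parameters are hypothetical.

```python
import random


def sample_sbm(n, block_probs, edge_probs, seed=0):
    """Sample an undirected graph from a basic Bernoulli stochastic blockmodel.

    n           : number of nodes
    block_probs : mixing proportions, one per block (assumed to sum to 1)
    edge_probs  : symmetric K x K matrix; edge_probs[a][b] is the probability
                  of an edge between a node in block a and a node in block b
    Returns the latent labels and the adjacency matrix (list of lists).
    """
    rng = random.Random(seed)
    k = len(block_probs)
    # Latent step: draw each node's block label from the mixing proportions.
    z = rng.choices(range(k), weights=block_probs, k=n)
    # Observed step: given the labels, edges are independent Bernoulli draws.
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_probs[z[i]][z[j]]:
                adj[i][j] = adj[j][i] = 1
    return z, adj


# Two equally likely communities with dense within-block and sparse
# between-block connectivity (illustrative values only).
z, adj = sample_sbm(
    n=30,
    block_probs=[0.5, 0.5],
    edge_probs=[[0.8, 0.05], [0.05, 0.8]],
)
```

Estimation reverses this process: only `adj` is observed, and the labels `z` together with the block-level probabilities have to be recovered from it.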
