Stochastic blockmodels with growing number of classes

We present asymptotic and finite-sample results on the use of stochastic blockmodels for the analysis of network data. We show that the fraction of misclassified network nodes converges in probability to zero under maximum likelihood fitting when the number of classes is allowed to grow as the square root of the network size and the average network degree grows at least poly-logarithmically in this size. We also establish finite-sample confidence bounds on maximum-likelihood blockmodel parameter estimates from data comprising independent Bernoulli random variates; these results hold uniformly over class assignments. We provide simulations verifying the conditions sufficient for our results, and conclude by fitting a logit parameterization of a stochastic blockmodel with covariates to a network data example comprising self-reported school friendships, resulting in block estimates that reveal residual structure.
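
As an illustration of the setting described above, the following minimal sketch simulates an undirected blockmodel with independent Bernoulli edges and computes the block-wise maximum-likelihood estimates for a fixed class assignment, which are simply the observed edge fractions within each block. It is not the fitting procedure of the paper: the function names (`simulate_sbm`, `block_mle`), the network size, the edge probabilities, and the use of the true class labels in place of a maximization over assignments are all assumptions made for illustration.

```python
import math
import random


def simulate_sbm(n, k, theta, rng):
    """Draw class labels and a symmetric adjacency matrix from a blockmodel.

    theta[a][b] is the (assumed) edge probability between classes a and b.
    """
    z = [rng.randrange(k) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < theta[z[i]][z[j]]:
                adj[i][j] = adj[j][i] = 1
    return z, adj


def block_mle(adj, z, k):
    """For a fixed class assignment z, return the Bernoulli MLE per block:
    the fraction of node pairs in the block that are joined by an edge.
    Blocks containing no node pairs are left as None."""
    edges = [[0] * k for _ in range(k)]
    pairs = [[0] * k for _ in range(k)]
    n = len(z)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = sorted((z[i], z[j]))
            edges[a][b] += adj[i][j]
            pairs[a][b] += 1
    est = [[None] * k for _ in range(k)]
    for a in range(k):
        for b in range(a, k):
            if pairs[a][b] > 0:
                est[a][b] = est[b][a] = edges[a][b] / pairs[a][b]
    return est


rng = random.Random(0)           # fixed seed, for reproducibility only
n = 100                          # hypothetical network size
k = int(math.sqrt(n))            # number of classes on the order of sqrt(n)
# Hypothetical block probabilities: dense within classes, sparse between them.
theta = [[0.5 if a == b else 0.05 for b in range(k)] for a in range(k)]

z, adj = simulate_sbm(n, k, theta, rng)
theta_hat = block_mle(adj, z, k)  # estimates conditional on the true labels
```

Because the estimates here are conditioned on the true labels, the sketch shows only the inner step of maximum likelihood fitting; a full fit would also search over class assignments.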
