Co-clustering separately exchangeable network data

This article establishes the performance of stochastic blockmodels in addressing the co-clustering problem of partitioning a binary array into subsets, assuming only that the data are generated by a nonparametric process satisfying the condition of separate exchangeability. We provide oracle inequalities with rate of convergence $\mathcal{O}_P(n^{-1/4})$ corresponding to profile likelihood maximization and mean-square error minimization, and show that the blockmodel can be interpreted in this setting as an optimal piecewise-constant approximation to the generative nonparametric model. We also show for large sample sizes that the detection of co-clusters in such data indicates with high probability the existence of co-clusters of equal size and asymptotically equivalent connectivity in the underlying generative process.
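The piecewise-constant interpretation above can be illustrated with a minimal sketch. It draws a binary array from a separately exchangeable model, $A_{ij} \sim \mathrm{Bernoulli}(w(\xi_i, \eta_j))$ with independent uniform $\xi_i$ and $\eta_j$, and fits a co-blockmodel by alternating least squares. The function `w`, the array sizes, the class counts `K` and `L`, and all helper names are assumptions made for illustration. The alternating updates are a local heuristic and not the exact mean-square error minimization analyzed in the article, so the sketch does not demonstrate the stated rate.

```python
import random

def w(x, y):
    # Hypothetical smooth generative function (an assumption, not from the article).
    return 0.1 + 0.8 * x * y

def block_means(A, row_lab, col_lab, K, L):
    """Average of A within each (row class, column class) block."""
    sums = [[0.0] * L for _ in range(K)]
    counts = [[0] * L for _ in range(K)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            sums[row_lab[i]][col_lab[j]] += a
            counts[row_lab[i]][col_lab[j]] += 1
    # Empty blocks get 0.0; they do not contribute to the squared error.
    return [[sums[k][l] / counts[k][l] if counts[k][l] else 0.0
             for l in range(L)] for k in range(K)]

def mse(A, row_lab, col_lab, theta):
    """Mean-square error of the piecewise-constant fit."""
    m, n = len(A), len(A[0])
    total = sum((A[i][j] - theta[row_lab[i]][col_lab[j]]) ** 2
                for i in range(m) for j in range(n))
    return total / (m * n)

def fit_coblockmodel(A, K, L, sweeps=10, seed=0):
    """Alternating minimization of squared error (local heuristic)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_lab = [rng.randrange(K) for _ in range(m)]
    col_lab = [rng.randrange(L) for _ in range(n)]
    history = []
    for _ in range(sweeps):
        theta = block_means(A, row_lab, col_lab, K, L)
        # Reassign each row to the class with the smallest squared error.
        for i in range(m):
            row_lab[i] = min(range(K), key=lambda k: sum(
                (A[i][j] - theta[k][col_lab[j]]) ** 2 for j in range(n)))
        theta = block_means(A, row_lab, col_lab, K, L)
        # Reassign each column in the same way.
        for j in range(n):
            col_lab[j] = min(range(L), key=lambda l: sum(
                (A[i][j] - theta[row_lab[i]][l]) ** 2 for i in range(m)))
        theta = block_means(A, row_lab, col_lab, K, L)
        history.append(mse(A, row_lab, col_lab, theta))
    return row_lab, col_lab, theta, history

# Simulate a separately exchangeable binary array and fit a 2 x 2 co-blockmodel.
rng = random.Random(1)
m, n, K, L = 40, 50, 2, 2
xi = [rng.random() for _ in range(m)]
eta = [rng.random() for _ in range(n)]
A = [[1 if rng.random() < w(xi[i], eta[j]) else 0 for j in range(n)]
     for i in range(m)]
row_lab, col_lab, theta, history = fit_coblockmodel(A, K, L)
```

Here `theta` holds the fitted block averages, which play the role of a piecewise-constant approximation to `w`, and `history` records the mean-square error after each sweep. Each update minimizes the squared error over one set of variables with the others held fixed, so the recorded error cannot increase from one sweep to the next, although the procedure may stop at a local optimum.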
