Co-clustering of Nonsmooth Graphons

Performance bounds are given for exploratory co-clustering/blockmodeling of bipartite graph data, where we assume the rows and columns of the data matrix are samples from an arbitrary population. This is equivalent to assuming that the data are generated from a nonsmooth graphon. It is shown that co-clusters found by any method can be extended to the row and column populations, or equivalently that the estimated blockmodel approximates a blocked version of the generative graphon, with estimation error bounded by $O_P(n^{-1/2})$. Analogous performance bounds are also given for degree-corrected blockmodels and random dot product graphs, with error rates depending on the dimensionality of the latent variable space.
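The setting can be illustrated with a minimal sketch, which is not the paper's procedure. It assumes uniform latent positions on $[0,1]$, a hypothetical discontinuous graphon `graphon`, and oracle labels standing in for the output of an arbitrary co-clustering method; all function names and parameter values are illustrative assumptions. The sketch samples a bipartite data matrix from the graphon and computes the block averages that define a fitted blockmodel.

```python
import random

def graphon(x, y):
    # Hypothetical nonsmooth graphon (assumption): a discontinuous
    # checkerboard with 4 row bands and 3 column bands.
    return 0.8 if (int(4 * x) + int(3 * y)) % 2 == 0 else 0.1

def sample_bipartite(n_rows, n_cols, w, rng):
    # Rows and columns are i.i.d. draws from a latent population;
    # each entry is Bernoulli with success probability w(x_i, y_j).
    xs = [rng.random() for _ in range(n_rows)]
    ys = [rng.random() for _ in range(n_cols)]
    A = [[1 if rng.random() < w(x, y) else 0 for y in ys] for x in xs]
    return xs, ys, A

def block_means(A, row_labels, col_labels, k_rows, k_cols):
    # Average the data matrix within each (row cluster, column cluster) block.
    sums = [[0.0] * k_cols for _ in range(k_rows)]
    counts = [[0] * k_cols for _ in range(k_rows)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            s, t = row_labels[i], col_labels[j]
            sums[s][t] += a
            counts[s][t] += 1
    return [
        [sums[s][t] / counts[s][t] if counts[s][t] else 0.0 for t in range(k_cols)]
        for s in range(k_rows)
    ]

rng = random.Random(0)
xs, ys, A = sample_bipartite(200, 150, graphon, rng)

# Stand-in for "any method" (assumption): oracle labels read off the latent
# positions. In practice the labels would come from a co-clustering algorithm
# that sees only A.
row_labels = [int(4 * x) for x in xs]
col_labels = [int(3 * y) for y in ys]

theta = block_means(A, row_labels, col_labels, 4, 3)
for row in theta:
    print(["%.2f" % v for v in row])
```

Each entry of `theta` estimates the average of the graphon over the corresponding block of the latent space, which is the "blocked version of the generative graphon" that the abstract refers to.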
