Community detection in sparse networks via Grothendieck’s inequality

We present a simple and flexible method to prove consistency of semidefinite optimization problems on random graphs. The method is based on Grothendieck’s inequality. Unlike previous uses of this inequality, which yield only constant relative accuracy, we achieve any given relative accuracy by leveraging randomness. We illustrate the method with the problem of community detection in sparse networks, that is, networks with bounded average degree. We demonstrate that even in this regime, various simple and natural semidefinite programs can be used to recover the community structure up to an arbitrarily small fraction of misclassified vertices. The method is general; it can be applied to a variety of stochastic models of networks and semidefinite programs.
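
For context, Grothendieck’s inequality in its standard matrix form can be stated as follows. The notation here (the real matrix $B = (b_{ij})$, the signs $x_i, y_j$, the vectors $X_i, Y_j$ in a Hilbert space, and the constant $K_G$) is chosen for illustration and is not taken from the abstract.

```latex
\[
\max_{\|X_i\|_2 \le 1,\ \|Y_j\|_2 \le 1}
\Bigl| \sum_{i,j} b_{ij} \langle X_i, Y_j \rangle \Bigr|
\;\le\; K_G \,
\max_{x_i,\, y_j \in \{-1,1\}}
\Bigl| \sum_{i,j} b_{ij}\, x_i y_j \Bigr|,
\qquad K_G < 1.783 .
\]
```

The left-hand side is a maximum of a linear form over Gram-type matrices $(\langle X_i, Y_j \rangle)$, which is the kind of feasible set that appears in semidefinite programs, while the right-hand side is a combinatorial quantity over sign vectors. The absolute constant $K_G$ is presumably the source of the "constant relative accuracy" that the abstract attributes to earlier uses of the inequality.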
