Exponential Error Rates of SDP for Block Models: Beyond Grothendieck’s Inequality

In this paper, we consider the cluster estimation problem under the stochastic block model. We show that the semidefinite programming (SDP) formulation for this problem achieves an error rate that decays exponentially in the signal-to-noise ratio. The error bound implies weak recovery in the sparse graph regime with bounded expected degrees, as well as exact recovery in the dense regime. As an immediate corollary, our results yield error bounds under the censored block model. Moreover, these error bounds are robust: they continue to hold under heterogeneous edge probabilities and a form of the so-called monotone attack. Significantly, this error rate is achieved by the SDP solution itself, without any further pre- or post-processing, and it improves upon existing polynomially decaying error bounds proved using Grothendieck’s inequality. Our analysis builds on two key ingredients: 1) showing that the graph has a well-behaved spectrum, even in the sparse regime, after discounting an exponentially small number of edges, and 2) an order-statistics argument that governs the final error rate. Both arguments highlight the implicit regularization effect of the SDP formulation.
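
For orientation, one common form of an SDP relaxation for cluster estimation under the stochastic block model is sketched below. This is an illustrative assumption, not necessarily the exact program analyzed in the paper. The symbols are hypothetical placeholders: $A$ is the observed adjacency matrix on $n$ nodes, $J$ is the all-ones matrix, $\lambda$ is a tuning parameter lying between the between-cluster and within-cluster edge probabilities, and $Y^*$ is the true cluster matrix with $Y^*_{ij} = 1$ exactly when nodes $i$ and $j$ belong to the same cluster.

```latex
\begin{aligned}
\widehat{Y} \in \arg\max_{Y \in \mathbb{R}^{n \times n}} \quad
  & \langle Y,\, A - \lambda J \rangle \\
\text{subject to} \quad
  & Y \succeq 0, \\
  & 0 \le Y_{ij} \le 1 \quad \text{for all } i, j, \\
  & Y_{ii} = 1 \quad \text{for all } i.
\end{aligned}
```

Under these assumptions the true cluster matrix $Y^*$ is feasible, since it is positive semidefinite with unit diagonal and entries in $\{0, 1\}$. An error bound "achieved by the SDP solution itself" would then be a statement about how far $\widehat{Y}$ is from $Y^*$, with no rounding or clustering step applied to $\widehat{Y}$.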
