Achieving exact cluster recovery threshold via semidefinite programming

The binary symmetric stochastic block model is a random graph on n vertices partitioned into two equal-sized clusters, in which each pair of vertices is connected independently with probability p if the two vertices lie in the same cluster and with probability q otherwise. In the asymptotic regime p = (a log n)/n and q = (b log n)/n, with a and b fixed and n → ∞, we show that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one, resolving a conjecture of Abbe et al. Furthermore, we show that the semidefinite programming relaxation also achieves the optimal recovery threshold in the planted dense subgraph model, which contains a single cluster of size proportional to n.
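
To make the model concrete, the following is a minimal sketch of how one might sample a graph from the binary symmetric stochastic block model in the regime p = (a log n)/n, q = (b log n)/n. The function names `sample_binary_sbm` and `above_exact_recovery_threshold` are hypothetical and chosen only for illustration. The abstract does not state the threshold explicitly; the check below assumes the form (√a − √b)² > 2 commonly given in the exact recovery literature for a > b, and it should be read as an assumption rather than as a quotation of this paper. The semidefinite program itself is not implemented here.

```python
import math
import random


def sample_binary_sbm(n, a, b, seed=0):
    """Sample cluster labels and an edge list from the binary symmetric SBM.

    Vertices are split into two clusters of size n/2. Each pair is connected
    independently with probability p = a*log(n)/n inside a cluster and
    q = b*log(n)/n across clusters (both capped at 1 for small n).
    """
    if n % 2 != 0:
        raise ValueError("n must be even for two equal-sized clusters")
    rng = random.Random(seed)
    p = min(1.0, a * math.log(n) / n)
    q = min(1.0, b * math.log(n) / n)

    # Labels are +1 / -1, shuffled so the partition is hidden in the ordering.
    labels = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(labels)

    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                edges.append((i, j))
    return labels, edges


def above_exact_recovery_threshold(a, b):
    """Assumed threshold condition: (sqrt(a) - sqrt(b))^2 > 2."""
    return (math.sqrt(a) - math.sqrt(b)) ** 2 > 2


if __name__ == "__main__":
    labels, edges = sample_binary_sbm(n=200, a=9.0, b=1.0, seed=1)
    print(len(edges), above_exact_recovery_threshold(9.0, 1.0))
```

The sampler uses only the standard library and runs in O(n²) time, which is adequate for small illustrative graphs but not for large n.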
