Achieving Exact Cluster Recovery Threshold via Semidefinite Programming

The binary symmetric stochastic block model describes a random graph on n vertices partitioned into two equal-sized clusters, in which each pair of vertices is connected independently, with probability p if the two vertices lie in the same cluster and with probability q otherwise. In the asymptotic regime p = a log n/n and q = b log n/n, with a and b fixed and n → ∞, we show that the semidefinite programming relaxation of the maximum likelihood estimator achieves the optimal threshold for exactly recovering the partition from the graph with probability tending to one, resolving a conjecture of Abbe et al. Furthermore, we show that the semidefinite programming relaxation also achieves the optimal recovery threshold in the planted dense subgraph model containing a single cluster of size proportional to n.
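To make the model concrete, the following is a minimal sketch in Python (standard library only) that samples a graph from the binary symmetric stochastic block model in the regime described above. The function names `sample_sbm` and `above_exact_recovery_threshold` are hypothetical and chosen for illustration. The threshold condition √a − √b > √2 (for a > b) is not stated in the abstract; it is the known exact recovery threshold for this model from the related literature, and it is included here as an assumption about which threshold is meant. The sketch does not solve the semidefinite program itself.

```python
import math
import random


def sample_sbm(n, a, b, seed=0):
    """Sample a binary symmetric SBM with p = a*log(n)/n and q = b*log(n)/n.

    Returns (labels, adj): labels[i] is +1 or -1 (two equal-sized clusters),
    and adj is a symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    if n % 2 != 0:
        raise ValueError("n must be even for two equal-sized clusters")
    rng = random.Random(seed)
    p = a * math.log(n) / n  # within-cluster edge probability
    q = b * math.log(n) / n  # cross-cluster edge probability
    if not (0.0 <= q <= 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("a, b, n must give probabilities in [0, 1]")

    # Assign exactly n/2 vertices to each cluster, in random order.
    labels = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(labels)

    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1
    return labels, adj


def above_exact_recovery_threshold(a, b):
    """Assumed threshold from the literature: sqrt(a) - sqrt(b) > sqrt(2), a > b."""
    return math.sqrt(a) - math.sqrt(b) > math.sqrt(2)


if __name__ == "__main__":
    labels, adj = sample_sbm(n=200, a=9.0, b=1.0, seed=1)
    num_edges = sum(sum(row) for row in adj) // 2
    print("edges:", num_edges)
    print("above threshold:", above_exact_recovery_threshold(9.0, 1.0))
```

A sampled adjacency matrix of this form is the input that a semidefinite programming relaxation would take; solving the relaxation requires an SDP solver, which is outside the scope of this sketch.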

[1]  Emmanuel Abbe, et al. Community Detection in General Stochastic Block Models: Fundamental Limits and Efficient Algorithms for Recovery, 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[2]  Alexander S. Wein, et al. A semidefinite program for unbalanced multisection in the stochastic block model, 2015, 2017 International Conference on Sampling Theory and Applications (SampTA).

[3]  Alexandra Kolla, et al. Multisection in the Stochastic Block Model using Semidefinite Programming, 2015, ArXiv.

[4]  Bruce E. Hajek, et al. Achieving exact cluster recovery threshold via semidefinite programming, 2015, 2015 IEEE International Symposium on Information Theory (ISIT).

[5]  Afonso S. Bandeira, et al. Random Laplacian Matrices and Convex Relaxations, 2015, Found. Comput. Math.

[6]  Emmanuel Abbe, et al. Community detection in general stochastic block models: fundamental limits and efficient recovery algorithms, 2015, ArXiv.

[7]  Bruce E. Hajek, et al. Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions, 2015, IEEE Transactions on Information Theory.

[8]  Elchanan Mossel, et al. Consistency Thresholds for the Planted Bisection Model, 2014, STOC.

[9]  Elchanan Mossel, et al. Consistency Thresholds for Binary Symmetric Block Models, 2014, ArXiv.

[10]  Bruce E. Hajek, et al. Computational Lower Bounds for Community Detection on Random Graphs, 2014, COLT.

[11]  E. Arias-Castro, et al. Community detection in dense random networks, 2014.

[12]  Emmanuel Abbe, et al. Exact Recovery in the Stochastic Block Model, 2014, IEEE Transactions on Information Theory.

[13]  Yudong Chen, et al. Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices, 2014, J. Mach. Learn. Res.

[14]  Elchanan Mossel, et al. A Proof of the Block Model Threshold Conjecture, 2013, Combinatorica.

[15]  Laurent Massoulié, et al. Community detection thresholds and the weak Ramanujan property, 2013, STOC.

[16]  E. Arias-Castro, et al. Community Detection in Random Networks, 2013, 1302.7099.

[17]  S. Sanghavi, et al. Improved Graph Clustering, 2012, IEEE Transactions on Information Theory.

[18]  T. Tao. Topics in Random Matrix Theory, 2012.

[19]  Elchanan Mossel, et al. Reconstruction and estimation in the planted partition model, 2012, Probability Theory and Related Fields.

[20]  Elchanan Mossel, et al. Stochastic Block Models and Reconstruction, 2012.

[21]  Cristopher Moore, et al. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications, 2011, Physical review. E, Statistical, nonlinear, and soft matter physics.

[22]  Laurent Massoulié, et al. Distributed user profiling via spectral methods, 2010, SIGMETRICS '10.

[23]  Santo Fortunato, et al. Community detection in graphs, 2009, ArXiv.

[24]  Andrzej Rucinski, et al. Random Graphs, 2018, Foundations of Data Science.

[25]  Uriel Feige, et al. Spectral techniques applied to sparse random graphs, 2005, Random Struct. Algorithms.

[26]  M. Newman, et al. Finding and evaluating community structure in networks, 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[27]  Uriel Feige, et al. Heuristics for Semirandom Graph Problems, 2001, J. Comput. Syst. Sci.

[28]  Frank McSherry, et al. Spectral partitioning of random graphs, 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[29]  Yoav Seginer, et al. The Expected Norm of Random Matrices, 2000, Combinatorics, Probability and Computing.

[30]  David P. Williamson, et al. Improved approximation algorithms for MAX SAT, 2000, SODA '00.

[31]  Richard M. Karp, et al. Algorithms for graph partitioning on the planted partition model, 1999, Random Struct. Algorithms.

[32]  N. Alon, et al. Finding a large hidden clique in a random graph, 1998, SODA '98.

[33]  David P. Williamson, et al. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, 1995, JACM.

[34]  Alan M. Frieze, et al. Improved approximation algorithms for MAX k-CUT and MAX BISECTION, 1995, Algorithmica.

[35]  Kathryn B. Laskey, et al. Stochastic blockmodels: First steps, 1983.

[36]  David S. Johnson, et al. Some Simplified NP-Complete Graph Problems, 1976, Theor. Comput. Sci.

[37]  Kevin P. Murphy. Information theory, 1998.

[38]  Eli Upfal, et al. Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2005.

[39]  D. Welsh, et al. A Spectral Technique for Coloring Random 3-Colorable Graphs, 1994.