Achieving Exact Cluster Recovery Threshold via Semidefinite Programming: Extensions

Resolving a conjecture of Abbe, Bandeira, and Hall, the authors have recently shown that the semidefinite programming (SDP) relaxation of the maximum likelihood estimator achieves the sharp threshold for exactly recovering the community structure under the binary stochastic block model (SBM) of two equal-sized clusters. The same was shown for the case of a single cluster and outliers. Extending the proof techniques, this paper shows that SDP relaxations also achieve the sharp recovery threshold in the following cases: 1) binary SBM with two clusters of sizes proportional to network size but not necessarily equal; 2) SBM with a fixed number of equal-sized clusters; and 3) binary censored block model with the background graph being Erdős-Rényi. Furthermore, a sufficient condition is given for an SDP procedure to achieve exact recovery for the general case of a fixed number of clusters plus outliers. These results demonstrate the versatility of SDP relaxation as a simple, general-purpose, computationally feasible methodology for community detection.
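
For orientation, the SDP relaxation referred to above for the baseline case of two equal-sized clusters can be sketched as follows. This is a standard formulation written under assumed notation that does not appear in the abstract: $A$ is the adjacency matrix of the observed graph on $n$ nodes, $\sigma \in \{\pm 1\}^n$ is the vector of cluster labels, $\mathbf{1}$ is the all-ones vector, and $J = \mathbf{1}\mathbf{1}^\top$. The extensions treated in the paper modify the constraints and are not reproduced here.

```latex
% Maximum likelihood estimator for two equal-sized clusters
% (a combinatorial problem over label vectors):
\begin{align*}
  \max_{\sigma} \quad & \sigma^\top A \sigma \\
  \text{s.t.} \quad & \sigma \in \{\pm 1\}^n, \quad \mathbf{1}^\top \sigma = 0 .
\end{align*}

% SDP relaxation: substitute Y = \sigma \sigma^\top and drop the rank-one constraint.
\begin{align*}
  \max_{Y} \quad & \langle A, Y \rangle \\
  \text{s.t.} \quad & Y \succeq 0, \\
                    & Y_{ii} = 1, \quad i = 1, \dots, n, \\
                    & \langle J, Y \rangle = 0 .
\end{align*}
```

Here $Y_{ii} = 1$ relaxes $\sigma_i^2 = 1$, and $\langle J, Y \rangle = 0$ relaxes the balance condition $(\mathbf{1}^\top \sigma)^2 = 0$. Exact recovery by the SDP means that its unique optimal solution is $Y^* = \sigma^* (\sigma^*)^\top$ for the true labels $\sigma^*$, with probability tending to one. In the regime usually considered for this problem, with within-cluster edge probability $p = a \log n / n$ and cross-cluster edge probability $q = b \log n / n$ for constants $a > b > 0$, the sharp threshold for the equal-sized case is $\sqrt{a} - \sqrt{b} > \sqrt{2}$.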
