Semidefinite programs on sparse random graphs and their application to community detection

Denote by A the adjacency matrix of an Erdős-Rényi graph with bounded average degree. We consider the problem of maximizing <A - E{A}, X> over the set of positive semidefinite matrices X with diagonal entries X_ii = 1. We prove that for large (bounded) average degree d, the value of this semidefinite program (SDP) is, with high probability, 2n*sqrt(d) + n*o(sqrt(d)) + o(n). For a random regular graph of degree d, we prove that the SDP value is 2n*sqrt(d-1) + o(n), matching a spectral upper bound. Informally, Erdős-Rényi graphs appear to behave similarly to random regular graphs for semidefinite programming. We next consider the sparse, two-group, symmetric community detection problem (also known as planted partition). Under this model, the vertex set is partitioned into two subsets of size n/2, with edge probability a/n within a group and b/n across groups. We establish that SDP achieves the information-theoretically optimal detection threshold for large (bounded) degree: we prove that SDP detects the partition with high probability provided (a-b)^2/(4d) > 1 + o_d(1), with d = (a+b)/2. By comparison, the information-theoretic threshold for detecting the hidden partition is (a-b)^2/(4d) > 1, so SDP is nearly optimal for large bounded average degree. Our proof is based on tools from different research areas: (i) a new 'higher-rank' Grothendieck inequality for symmetric matrices; (ii) an interpolation method inspired by statistical physics; (iii) an analysis of the eigenvectors of deformed Gaussian random matrices.
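The quantities in the abstract can be evaluated numerically. The following is a minimal sketch in Python, not code from the paper: the function names are hypothetical, and the SDP-value helpers return only the leading-order terms 2n*sqrt(d) and 2n*sqrt(d-1), dropping the n*o(sqrt(d)) and o(n) corrections. The threshold check uses the information-theoretic condition (a-b)^2/(4d) > 1; the SDP guarantee requires the slightly stronger (a-b)^2/(4d) > 1 + o_d(1), whose correction term is not specified in the abstract.

```python
import math


def average_degree(a: float, b: float) -> float:
    """Average degree d = (a + b) / 2 for edge probabilities a/n and b/n."""
    return (a + b) / 2.0


def signal_to_noise(a: float, b: float) -> float:
    """The ratio (a - b)^2 / (4 d) that is compared against 1."""
    return (a - b) ** 2 / (4.0 * average_degree(a, b))


def above_detection_threshold(a: float, b: float) -> bool:
    """True when (a - b)^2 / (4 d) > 1 (information-theoretic condition only)."""
    return signal_to_noise(a, b) > 1.0


def sdp_value_erdos_renyi(n: int, d: float) -> float:
    """Leading-order SDP value 2 n sqrt(d); lower-order corrections are dropped."""
    return 2.0 * n * math.sqrt(d)


def sdp_value_regular(n: int, d: int) -> float:
    """Leading-order SDP value 2 n sqrt(d - 1) for a random d-regular graph."""
    return 2.0 * n * math.sqrt(d - 1)


if __name__ == "__main__":
    # Illustrative parameters (assumptions, not taken from the paper).
    for a, b in [(5.0, 1.0), (3.0, 1.0)]:
        print(a, b, signal_to_noise(a, b), above_detection_threshold(a, b))
    print(sdp_value_erdos_renyi(1000, 4.0), sdp_value_regular(1000, 5))
```

For example, a = 5 and b = 1 give d = 3 and (a-b)^2/(4d) = 16/12, which is above 1, while a = 3 and b = 1 give d = 2 and (a-b)^2/(4d) = 4/8, which is below 1.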
