Computational Lower Bounds for Community Detection on Random Graphs

This paper studies the problem of detecting a small dense community planted in a large Erd\H{o}s-R\'enyi random graph $\mathcal{G}(N,q)$, where the edge probability within the community exceeds $q$ by a constant factor. Assuming the hardness of the planted clique detection problem, we show that the computational complexity of detecting the community exhibits the following phase transition: as the graph size $N$ grows and the graph becomes sparser according to $q=N^{-\alpha}$, there is a critical value $\alpha = \frac{2}{3}$. Below it, a computationally intensive procedure can detect far smaller communities than any computationally efficient procedure; above it, a linear-time procedure is statistically optimal. The results also yield average-case hardness results for recovering the dense community and approximating the densest $K$-subgraph.
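
To make the detection setup concrete, the following is a minimal sketch of the planted model and of one simple detector whose cost is linear in the size of the edge list. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not say which linear-time test is meant, so the total edge count used here is an assumed stand-in, and the names `sample_graph`, `total_edge_test`, and the cutoff `num_std` are hypothetical.

```python
import random
from itertools import combinations


def sample_graph(n, k, p, q, seed=0):
    """Sample a graph on n vertices with a hidden community of k vertices.

    Pairs inside the community are joined with probability p; all other
    pairs are joined with probability q. Setting k = 0 gives the null
    model G(n, q). The quadratic-time sampler is for illustration only.
    """
    rng = random.Random(seed)
    community = set(rng.sample(range(n), k))
    edges = []
    for u, v in combinations(range(n), 2):
        prob = p if (u in community and v in community) else q
        if rng.random() < prob:
            edges.append((u, v))
    return edges, community


def total_edge_test(edges, n, q, num_std=3.0):
    """Declare a community present when the edge count is unusually large.

    Assumed test (not taken from the abstract): compare the number of
    edges with its mean under G(n, q) plus num_std standard deviations.
    """
    pairs = n * (n - 1) // 2
    mean = pairs * q
    std = (pairs * q * (1 - q)) ** 0.5
    return len(edges) > mean + num_std * std
```

In the regime described above one would set `q = n ** (-alpha)` and `p` to a constant multiple of `q`. The sketch only shows the shape of the hypothesis test and says nothing about where the detection thresholds lie.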
