The Landscape of the Planted Clique Problem: Dense subgraphs and the Overlap Gap Property

In this paper we study the computational-statistical gap of the planted clique problem, where a clique of size $k$ is planted in an Erdős-Rényi graph $G\left(n,\frac{1}{2}\right)$, resulting in a graph $G\left(n,\frac{1}{2},k\right)$. The goal is to recover the planted clique vertices by observing $G\left(n,\frac{1}{2},k\right)$. It is known that the clique can be recovered as long as $k \geq \left(2+\epsilon\right)\log n$ for any $\epsilon>0$, but no polynomial-time algorithm is known for this task unless $k=\Omega\left(\sqrt{n}\right)$. Taking a statistical-physics-inspired point of view in an attempt to understand this computational-statistical gap, we study the landscape of the "sufficiently dense" subgraphs of $G\left(n,\frac{1}{2},k\right)$ and their overlap with the planted clique. Using the first moment method, we study the densest subgraph problems for subgraphs with fixed, but arbitrary, overlap size with the planted clique, and provide evidence of a phase transition for the presence of the Overlap Gap Property (OGP) at $k=\Theta\left(\sqrt{n}\right)$. OGP is a concept originally introduced in spin glass theory and is known to suggest algorithmic hardness when it appears. We establish the presence of OGP when $k$ is a small positive power of $n$ by using a conditional second moment method. As our main technical tool, we establish what are, to the best of our knowledge, the first concentration results for the $K$-densest subgraph problem for the Erdős-Rényi model $G\left(n,\frac{1}{2}\right)$ when $K=n^{0.5-\epsilon}$ for arbitrary $\epsilon>0$. Finally, to study the OGP we employ a certain form of overparametrization, which is conceptually aligned with a large body of recent work in learning theory and optimization.
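To make the objects in the abstract concrete, the following is a minimal sketch, not taken from the paper, of how one might sample $G\left(n,\frac{1}{2},k\right)$ and measure the two quantities the landscape analysis tracks for a vertex subset: the number of edges it induces and its overlap with the planted clique. The function names `sample_planted_clique`, `edge_count`, and `overlap` are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def sample_planted_clique(n, k, seed=None):
    """Sample G(n, 1/2, k): an Erdos-Renyi G(n, 1/2) graph with a planted k-clique.

    Returns (adj, clique), where adj is a symmetric boolean adjacency matrix
    and clique is the set of planted vertices. Illustrative helper only.
    """
    rng = random.Random(seed)
    adj = [[False] * n for _ in range(n)]
    # Each pair of vertices is joined independently with probability 1/2.
    for u, v in combinations(range(n), 2):
        if rng.random() < 0.5:
            adj[u][v] = adj[v][u] = True
    # Choose k vertices uniformly at random and connect all pairs among them.
    clique = set(rng.sample(range(n), k))
    for u, v in combinations(sorted(clique), 2):
        adj[u][v] = adj[v][u] = True
    return adj, clique


def edge_count(adj, subset):
    """Number of edges induced by the vertex subset."""
    return sum(1 for u, v in combinations(sorted(subset), 2) if adj[u][v])


def overlap(subset, clique):
    """Number of vertices the subset shares with the planted clique."""
    return len(set(subset) & set(clique))


if __name__ == "__main__":
    adj, clique = sample_planted_clique(n=200, k=12, seed=0)
    candidate = set(random.Random(1).sample(range(200), 12))
    print("edges in candidate:", edge_count(adj, candidate))
    print("overlap with planted clique:", overlap(candidate, clique))
```

In this sketch the planted clique itself always induces $\binom{k}{2}$ edges and has overlap $k$, whereas a uniformly random $k$-subset induces roughly half that many edges and typically has small overlap. The paper's question concerns how the densest achievable subgraphs behave at overlap values between these two extremes.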
