On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank One Perturbations of Gaussian Tensors

We consider the following detection problem: given a realization of a symmetric matrix X of dimension $n$, distinguish between the hypothesis that all upper triangular entries are independent and identically distributed (i.i.d.) Gaussian variables with mean 0 and variance 1, and the hypothesis that X is the sum of such a matrix and an independent rank-one perturbation. This setup covers the case in which, under the alternative, there is a planted principal submatrix B of size $L$ whose upper triangular entries are i.i.d. Gaussians with mean 1 and variance 1, whereas all other upper triangular entries of X, those not in B, are i.i.d. Gaussians with mean 0 and variance 1. We refer to this as the "Gaussian hidden clique problem." When $L=(1+\epsilon)\sqrt{n}$ ($\epsilon>0$), it is possible to solve this detection problem with probability $1-o_{n}(1)$ by computing the spectrum of X and considering its largest eigenvalue. We prove that this condition is tight in the following sense: when $L<(1-\epsilon)\sqrt{n}$, no algorithm that examines only the eigenvalues of X can detect the existence of a hidden Gaussian clique with error probability vanishing as $n\to\infty$. We prove this result as an immediate consequence of a more general result on rank-one perturbations of $k$-dimensional Gaussian tensors. In this context, we establish a lower bound on the critical signal-to-noise ratio below which a rank-one signal cannot be detected.
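
The largest-eigenvalue test mentioned in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function names (`sample_matrix`, `top_eigenvalue`, `detect`), the problem sizes, and the threshold of 2.5 on the rescaled eigenvalue are all assumptions chosen for illustration. Under the null hypothesis the largest eigenvalue of X divided by $\sqrt{n}$ concentrates near 2, while a planted block with $L$ well above $\sqrt{n}$ pushes it clearly above that value.

```python
import numpy as np

def sample_matrix(n, L=0, rng=None):
    """Symmetric n x n matrix with i.i.d. N(0, 1) upper-triangular entries.

    If L > 0, a principal submatrix on L randomly chosen indices has its
    mean shifted from 0 to 1 (the planted "Gaussian clique").
    """
    rng = np.random.default_rng() if rng is None else rng
    G = rng.standard_normal((n, n))
    # Keep the upper triangle (with diagonal) and mirror it below.
    X = np.triu(G) + np.triu(G, 1).T
    if L > 0:
        S = rng.choice(n, size=L, replace=False)  # hypothetical planted index set
        X[np.ix_(S, S)] += 1.0
    return X

def top_eigenvalue(X):
    """Largest eigenvalue of a symmetric matrix (eigvalsh sorts ascending)."""
    return np.linalg.eigvalsh(X)[-1]

def detect(X, threshold=2.5):
    """Declare a planted submatrix if lambda_max / sqrt(n) exceeds the threshold.

    The value 2.5 is an illustrative assumption; it only separates the two
    hypotheses when L is comfortably larger than sqrt(n).
    """
    n = X.shape[0]
    return bool(top_eigenvalue(X) / np.sqrt(n) > threshold)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, L = 400, 60  # L = 3 * sqrt(n), well above the sqrt(n) scale
    null = sample_matrix(n, 0, rng)
    planted = sample_matrix(n, L, rng)
    print("null    :", top_eigenvalue(null) / np.sqrt(n), detect(null))
    print("planted :", top_eigenvalue(planted) / np.sqrt(n), detect(planted))
```

For $L$ only slightly above $\sqrt{n}$ the outlying eigenvalue sits very close to the bulk edge, so a threshold nearer 2 and a much larger $n$ would be needed. For $L$ below $\sqrt{n}$, the abstract's result says that no test based on the eigenvalues alone can succeed with vanishing error.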
