Paper information - On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank One Perturbations of Gaussian Tensors

On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank One Perturbations of Gaussian Tensors

We consider the following detection problem: given a realization of a symmetric matrix $X$ of dimension $n$, distinguish between the hypothesis that all upper triangular variables are independent and identically distributed (i.i.d.) Gaussian variables with mean 0 and variance 1 and the hypothesis that $X$ is the sum of such a matrix and an independent rank-one perturbation. This setup covers the case where, under the alternative, there is a planted principal submatrix $B$ of size $L$ for which all upper triangular variables are i.i.d. Gaussians with mean 1 and variance 1, whereas all other upper triangular elements of $X$ not in $B$ are i.i.d. Gaussian variables with mean 0 and variance 1. We refer to this as the “Gaussian hidden clique problem.” When $L=(1+\epsilon)\sqrt{n}$ ($\epsilon>0$), it is possible to solve this detection problem with probability $1-o_{n}(1)$ by computing the spectrum of $X$ and considering its largest eigenvalue. We prove that this condition is tight in the following sense: when $L<(1-\epsilon)\sqrt{n}$, no algorithm that examines only the eigenvalues of $X$ can detect the existence of a hidden Gaussian clique with error probability vanishing as $n\to\infty$. We prove this result as an immediate consequence of a more general result on rank-one perturbations of $k$-dimensional Gaussian tensors. In this context, we establish a lower bound on the critical signal-to-noise ratio below which a rank-one signal cannot be detected.
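
The spectral test described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the paper's procedure: the function names `sample_matrix` and `top_eigenvalue_scaled` are hypothetical, the diagonal is drawn with variance 1 like the other upper triangular entries, and the planted index set is chosen uniformly at random. Planting a mean-1 block on an index set $S$ of size $L$ adds the rank-one matrix $\mathbf{1}_S\mathbf{1}_S^{T}$, whose nonzero eigenvalue is $L$, so after dividing by $\sqrt{n}$ the perturbation strength is $L/\sqrt{n}$.

```python
import numpy as np


def sample_matrix(n, L=0, rng=None):
    """Draw a symmetric n x n matrix whose upper triangular entries are i.i.d. N(0, 1).

    If L > 0, add 1 to every entry of a principal submatrix indexed by a
    uniformly random set S of size L (assumption: S is chosen uniformly).
    This is the same as adding the rank-one matrix 1_S 1_S^T.
    """
    rng = np.random.default_rng() if rng is None else rng
    G = rng.standard_normal((n, n))
    # Keep the upper triangle (with diagonal) and mirror the strict upper part.
    X = np.triu(G) + np.triu(G, 1).T
    if L > 0:
        S = rng.choice(n, size=L, replace=False)
        X[np.ix_(S, S)] += 1.0
    return X


def top_eigenvalue_scaled(X):
    """Return the largest eigenvalue of X divided by sqrt(n)."""
    n = X.shape[0]
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(X)[-1] / np.sqrt(n)
```

Under the null hypothesis the scaled largest eigenvalue concentrates near 2, the edge of the semicircle law. For $L=(1+\epsilon)\sqrt{n}$ the known results on rank-one deformations of Wigner matrices suggest it separates from the bulk, moving toward $\theta + 1/\theta$ with $\theta = L/\sqrt{n}$, so comparing `top_eigenvalue_scaled(X)` against a threshold slightly above 2 gives a simple detector for large $n$. Below the threshold, the abstract's result is that no function of the eigenvalues alone can do this reliably.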
