Detection limits in the high-dimensional spiked rectangular model

We study the problem of detecting the presence of a single unknown spike in a rectangular data matrix, in a high-dimensional regime where the spike has fixed strength and the aspect ratio of the matrix converges to a finite limit. This setup includes Johnstone's spiked covariance model. We analyze the likelihood ratio of the spiked model against a reference "all noise" null model, and show that it has asymptotically Gaussian fluctuations in a region below, but in general not up to, the so-called BBP threshold from random matrix theory. Our result parallels earlier findings of Onatski et al. (2013) and Johnstone and Onatski (2015) for spherical spikes. We present a probabilistic approach capable of treating generic product priors; in particular, sparsity in the spike is allowed. Our approach is based on Talagrand's interpretation of the cavity method from spin-glass theory. The question of the maximal parameter region where asymptotic normality is expected to hold is left open. This region is shaped by the prior in a non-trivial way. We conjecture that it is the entire paramagnetic phase of an associated spin-glass model, and that it is defined by the vanishing of the replica-symmetric solution of Lesieur et al. (2015).
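As a rough illustration of the BBP threshold mentioned above, the following sketch simulates a rank-one spike in a rectangular Gaussian matrix and compares the largest singular value with the classical outlier prediction for such perturbations. Everything here is an assumption made for illustration and not taken from the paper: the normalization X = theta * a b^T + G / sqrt(m) with unit vectors a, b drawn uniformly from the sphere (a spherical spike), the function names, and the matrix sizes. The paper's own parametrization of spike strength and aspect ratio may differ. In this normalization, with c = n/m, the top singular value separates from the bulk edge 1 + sqrt(c) only when theta exceeds c^(1/4). The sketch says nothing about the likelihood ratio itself, which is the object the paper studies below that threshold.

```python
import math
import random


def unit_vector(k, rng):
    """Draw a uniformly random unit vector in R^k (a spherical spike)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(sum(t * t for t in x))
    return [t / norm for t in x]


def spiked_matrix(n, m, theta, rng):
    """Return theta * a b^T + G / sqrt(m) as a list of n rows of length m.

    Hypothetical normalization: G has i.i.d. N(0, 1) entries, a and b are
    random unit vectors, and theta = 0 gives the all-noise null model.
    """
    a = unit_vector(n, rng)
    b = unit_vector(m, rng)
    scale = 1.0 / math.sqrt(m)
    return [
        [theta * a[i] * b[j] + scale * rng.gauss(0.0, 1.0) for j in range(m)]
        for i in range(n)
    ]


def top_singular_value(X, iters, rng):
    """Estimate the largest singular value by power iteration on X^T X.

    The returned value is ||X v|| for a unit vector v, so it never exceeds
    the true largest singular value.
    """
    m = len(X[0])
    v = unit_vector(m, rng)
    for _ in range(iters):
        w = [sum(r * t for r, t in zip(row, v)) for row in X]  # w = X v
        z = [0.0] * m
        for wi, row in zip(w, X):  # z = X^T w
            for j, r in enumerate(row):
                z[j] += wi * r
        norm = math.sqrt(sum(t * t for t in z))
        v = [t / norm for t in z]
    w = [sum(r * t for r, t in zip(row, v)) for row in X]
    return math.sqrt(sum(t * t for t in w))


def predicted_top_singular_value(theta, c):
    """Large-size limit of the top singular value in this normalization.

    Below theta = c**0.25 the limit is the bulk edge 1 + sqrt(c); above it,
    an outlier appears at sqrt((1 + theta^2) * (c + theta^2)) / theta.
    """
    if theta <= c ** 0.25:
        return 1.0 + math.sqrt(c)
    return math.sqrt((1.0 + theta ** 2) * (c + theta ** 2)) / theta


if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 100, 200
    c = n / m
    for theta in (0.0, 0.5, 2.0):
        X = spiked_matrix(n, m, theta, rng)
        estimate = top_singular_value(X, 60, rng)
        limit = predicted_top_singular_value(theta, c)
        print(f"theta={theta}: estimate={estimate:.3f}, limit={limit:.3f}")
```

Power iteration is used only to keep the sketch free of dependencies. It converges slowly when there is no outlier, so the null and weak-spike estimates are lower bounds that should be read loosely. A dense SVD routine would be the natural choice at larger sizes.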
