Detection of Signal in the Spiked Rectangular Models

We consider the problem of detecting signals in rank-one signal-plus-noise data matrix models that generalize spiked Wishart matrices. We show that principal component analysis can be improved by pre-transforming the matrix entries when the noise is non-Gaussian. As an intermediate step, we prove a sharp phase transition of the largest eigenvalues of spiked rectangular matrices, which extends the Baik–Ben Arous–Péché (BBP) transition. We also propose a computationally inexpensive hypothesis test for the presence of signal, based on linear spectral statistics, which minimizes the sum of the Type-I and Type-II errors when the noise is Gaussian.
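To make the BBP-type transition concrete, the following is a minimal simulation sketch, not the authors' procedure. It assumes an additive rank-one model Y = sqrt(snr) * u v^T + X with Gaussian noise entries of variance 1/N and unit vectors u and v; the names `top_eigenvalue`, `bulk_edge`, `outlier_location`, `snr`, and `ratio` are illustrative choices. For this Gaussian case, the classical result is that the largest eigenvalue of Y Y^T stays near the bulk edge (1 + sqrt(ratio))^2 when snr is below sqrt(ratio) and moves to about (1 + snr)(1 + ratio/snr) above it, where ratio = M/N.

```python
import numpy as np

def top_eigenvalue(M, N, snr, rng):
    """Largest eigenvalue of Y Y^T for Y = sqrt(snr) * u v^T + X (assumed model)."""
    # Gaussian noise with entry variance 1/N (assumption for this sketch).
    X = rng.standard_normal((M, N)) / np.sqrt(N)
    # Random unit vectors playing the role of the rank-one signal.
    u = rng.standard_normal(M)
    u /= np.linalg.norm(u)
    v = rng.standard_normal(N)
    v /= np.linalg.norm(v)
    Y = np.sqrt(snr) * np.outer(u, v) + X
    # eigvalsh returns eigenvalues in ascending order; take the last one.
    return np.linalg.eigvalsh(Y @ Y.T)[-1]

def bulk_edge(ratio):
    """Right edge of the Marchenko-Pastur bulk for aspect ratio M/N."""
    return (1.0 + np.sqrt(ratio)) ** 2

def outlier_location(snr, ratio):
    """Limiting position of the top eigenvalue when snr > sqrt(ratio)."""
    return (1.0 + snr) * (1.0 + ratio / snr)

rng = np.random.default_rng(0)
M, N = 200, 400
ratio = M / N

null_top = top_eigenvalue(M, N, 0.0, rng)    # no signal
spiked_top = top_eigenvalue(M, N, 4.0, rng)  # snr well above sqrt(ratio)

print("bulk edge:        ", bulk_edge(ratio))
print("top eig, no spike:", null_top)
print("top eig, snr = 4: ", spiked_top)
print("predicted outlier:", outlier_location(4.0, ratio))
```

Comparing the top eigenvalue against the bulk edge is the plain PCA detector. The sketch does not implement the entrywise pre-transformation or the linear-spectral-statistics test described in the abstract.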
