On the non-asymptotic concentration of heteroskedastic Wishart-type matrix

This paper studies the non-asymptotic concentration of heteroskedastic Wishart-type matrices. Suppose $Z$ is a $p_1$-by-$p_2$ random matrix with independent entries $Z_{ij} \sim N(0,\sigma_{ij}^2)$. We prove that \begin{equation*} \mathbb{E} \left\|ZZ^\top - \mathbb{E} ZZ^\top\right\| \leq (1+\epsilon)\left\{2\sigma_C\sigma_R + \sigma_C^2 + C\sigma_R\sigma_*\sqrt{\log(p_1 \wedge p_2)} + C\sigma_*^2\log(p_1 \wedge p_2)\right\}, \end{equation*} where $\sigma_C^2 := \max_j \sum_{i=1}^{p_1}\sigma_{ij}^2$, $\sigma_R^2 := \max_i \sum_{j=1}^{p_2}\sigma_{ij}^2$, and $\sigma_*^2 := \max_{i,j}\sigma_{ij}^2$. A minimax lower bound is developed that matches this upper bound. We then derive concentration inequalities, moment bounds, and tail bounds for heteroskedastic Wishart-type matrices under more general distributions, including sub-Gaussian and heavy-tailed distributions. Next, we consider the cases where $Z$ has homoskedastic columns or rows (i.e., $\sigma_{ij} \approx \sigma_i$ or $\sigma_{ij} \approx \sigma_j$) and derive rate-optimal Wishart-type concentration bounds. Finally, we apply these tools to identify the sharp signal-to-noise ratio threshold for consistent clustering in the heteroskedastic clustering problem.
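To make the quantities in the bound concrete, the following is a minimal numerical sketch, not the authors' code. It computes $\sigma_C$, $\sigma_R$, and $\sigma_*$ from an array of standard deviations and estimates $\mathbb{E}\|ZZ^\top - \mathbb{E}ZZ^\top\|$ by Monte Carlo. The function names (`variance_proxies`, `leading_terms`, `mc_deviation`) and the example variance profile are hypothetical choices made for illustration. Since the constant $C$ is not specified in the statement, only the leading terms $2\sigma_C\sigma_R + \sigma_C^2$ are evaluated, and the logarithmic terms are omitted.

```python
import numpy as np


def variance_proxies(sigma):
    """Return (sigma_C, sigma_R, sigma_star) for a p1-by-p2 array of standard deviations."""
    var = np.asarray(sigma, dtype=float) ** 2
    sigma_C = np.sqrt(var.sum(axis=0).max())  # max over columns j of sum_i sigma_ij^2
    sigma_R = np.sqrt(var.sum(axis=1).max())  # max over rows i of sum_j sigma_ij^2
    sigma_star = np.sqrt(var.max())           # largest entrywise standard deviation
    return sigma_C, sigma_R, sigma_star


def leading_terms(sigma):
    """Evaluate 2*sigma_C*sigma_R + sigma_C^2 (log terms omitted: C is unspecified)."""
    sigma_C, sigma_R, _ = variance_proxies(sigma)
    return 2.0 * sigma_C * sigma_R + sigma_C ** 2


def mc_deviation(sigma, n_rep=200, seed=0):
    """Monte Carlo estimate of E || Z Z^T - E Z Z^T || in spectral norm."""
    rng = np.random.default_rng(seed)
    sigma = np.asarray(sigma, dtype=float)
    # For independent centered entries, E Z Z^T is diagonal with
    # i-th diagonal entry sum_j sigma_ij^2.
    mean_zzt = np.diag((sigma ** 2).sum(axis=1))
    norms = []
    for _ in range(n_rep):
        Z = rng.standard_normal(sigma.shape) * sigma  # Z_ij ~ N(0, sigma_ij^2)
        norms.append(np.linalg.norm(Z @ Z.T - mean_zzt, ord=2))
    return float(np.mean(norms))


if __name__ == "__main__":
    # Hypothetical heteroskedastic profile: standard deviations drawn uniformly from [0.5, 1.5].
    profile = np.random.default_rng(1).uniform(0.5, 1.5, size=(50, 200))
    print("leading terms:", leading_terms(profile))
    print("Monte Carlo estimate:", mc_deviation(profile, n_rep=50))
```

Comparing the two printed numbers for different shapes and variance profiles gives a rough feel for how the leading terms track the simulated deviation. It is not a verification of the theorem, which also involves the factor $(1+\epsilon)$ and the logarithmic terms.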
