Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees.

This paper is concerned with estimating the column space of an unknown low-rank matrix $\boldsymbol{A}^{\star}\in\mathbb{R}^{d_{1}\times d_{2}}$, given noisy and partial observations of its entries. There is no shortage of scenarios where the observations --- while being too noisy to support faithful recovery of the entire matrix --- still convey sufficient information to enable reliable estimation of the column space of interest. This is particularly evident and crucial for the highly unbalanced case where the column dimension $d_{2}$ far exceeds the row dimension $d_{1}$, which is the focal point of the current paper. We investigate an efficient spectral method, which operates upon the sample Gram matrix with diagonal deletion. We establish statistical guarantees for this method in terms of both $\ell_{2}$ and $\ell_{2,\infty}$ estimation accuracy, which improve upon prior results if $d_{2}$ is substantially larger than $d_{1}$. To illustrate the effectiveness of our findings, we develop consequences of our general theory for three applications of practical importance: (1) tensor completion from noisy data, (2) covariance estimation with missing data, and (3) community recovery in bipartite graphs. Our theory leads to improved performance guarantees for all three cases.
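
As a rough illustration of the spectral method described above, the following is a minimal sketch and not the authors' exact procedure. It forms the sample Gram matrix of the zero-filled observations, deletes its diagonal, and returns the leading eigenvectors as the column-space estimate. Several details are assumptions that the abstract does not state: entries are taken to be observed uniformly at random, the sampling rate is estimated by the observed fraction, observations are rescaled by the inverse of that rate, and "leading" means largest algebraic eigenvalue. The names `estimate_column_space`, `Y`, `mask`, and `rank` are hypothetical.

```python
import numpy as np


def estimate_column_space(Y, mask, rank):
    """Sketch of a diagonal-deleted spectral estimate of a column space.

    Y    : (d1, d2) array of observations; values at unobserved entries are ignored.
    mask : (d1, d2) boolean array, True where an entry was observed.
    rank : assumed rank r of the underlying matrix.

    Returns a (d1, rank) matrix with orthonormal columns.
    """
    Y = np.asarray(Y, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    # Assumption: uniform sampling, with the rate estimated by the observed fraction.
    p_hat = mask.mean()
    if p_hat == 0:
        raise ValueError("no observed entries")

    # Zero-fill missing entries and rescale by the inverse sampling rate.
    Y_scaled = np.where(mask, Y, 0.0) / p_hat

    # Sample Gram matrix (d1 x d1), cheap when d1 is much smaller than d2.
    G = Y_scaled @ Y_scaled.T

    # Diagonal deletion: the diagonal of the Gram matrix is discarded.
    np.fill_diagonal(G, 0.0)

    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(G)
    top = np.argsort(eigvals)[::-1][:rank]
    return eigvecs[:, top]
```

Because only the $d_{1}\times d_{1}$ Gram matrix is decomposed, the eigen-decomposition cost depends on the smaller dimension $d_{1}$, which is what makes this route attractive in the unbalanced regime $d_{2}\gg d_{1}$. The global rescaling by the estimated sampling rate does not change the eigenvectors; it is kept only so that the off-diagonal entries of `G` are on the scale of the corresponding entries of $\boldsymbol{A}^{\star}\boldsymbol{A}^{\star\top}$.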
