Sparse PCA from Inaccurate and Incomplete Measurements

We consider the problem of recovering an unknown, effectively sparse, low-rank matrix from a small set of incomplete and inaccurate linear measurements of the form $y = \mathcal{A}(X) + \eta$, where $\eta$ is {\it ineliminable} noise. This problem fuses two important classes of machine learning and signal processing problems: sparse principal component analysis and compressed sensing. More specifically, we aim to recover rank-$R$ matrices that admit effectively $(s_1,s_2)$-sparse, non-orthogonal rank-$1$ decompositions. We formulate an optimization problem for matrix recovery under this model and propose a novel algorithm, called {\it \textbf{A}lternating \textbf{T}ikhonov regularization and \textbf{Las}so} (A-T-LAS$_{2,1}$), to solve it. The algorithm is based on multi-penalty regularization, which leverages both structures (low rank and sparsity) simultaneously. It is a fast first-order method and is straightforward to implement. For {\it any} linear measurement model, we prove global convergence to stationary points and local convergence to global minimizers of the multi-penalty objective functional. Global minimizers balance the effective sparsity of their rank-$1$ decompositions against fidelity to the data, up to the noise level. By adapting the restricted isometry property from compressed sensing to our novel model class, we prove bounds on the error between global minimizers and the ground truth, up to the noise level and a relative-to-diameter distortion, from a number of subgaussian measurements scaling as $R(s_1+s_2)$, up to log-factors in the dimension. Simulation results demonstrate the accuracy and efficiency of the algorithm, as well as its superiority over state-of-the-art algorithms in strong-noise regimes and for matrices whose singular vectors do not have exactly (jointly) sparse support.
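The alternation between a Tikhonov step and a Lasso step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the measurement operator is stored as a stack of matrices $A_i$ with $y_i = \langle A_i, X \rangle$, the matrix is factored as $X = UV^T$ with $R$ columns, a squared $\ell_2$ penalty is placed on $U$ and an $\ell_1$ penalty on $V$, and the Lasso step is a single soft-thresholding (proximal gradient) update. The names `measure`, `objective`, `alternating_sketch`, `alpha`, and `beta` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal map of t * ||.||_1 (entrywise soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def measure(A, X):
    """Linear measurements y_i = <A_i, X> for a stack A of shape (m, n1, n2)."""
    return np.tensordot(A, X, axes=([1, 2], [0, 1]))

def objective(A, y, U, V, alpha, beta):
    """Assumed multi-penalty functional: data fit + alpha*||U||_F^2 + beta*||V||_1."""
    r = y - measure(A, U @ V.T)
    return r @ r + alpha * np.sum(U ** 2) + beta * np.sum(np.abs(V))

def alternating_sketch(A, y, U, V, alpha, beta, n_iter=50):
    """Alternate an exact ridge solve in U with one soft-thresholding step in V."""
    m, n1, n2 = A.shape
    R = U.shape[1]
    At = np.transpose(A, (0, 2, 1))
    for _ in range(n_iter):
        # Tikhonov step: with V fixed, y is linear in vec(U), so solve a ridge system.
        M = (A @ V).reshape(m, n1 * R)            # row i is vec(A_i V)
        u = np.linalg.solve(M.T @ M + alpha * np.eye(n1 * R), M.T @ y)
        U = u.reshape(n1, R)
        # Lasso step: with U fixed, take one proximal gradient step in vec(V).
        N = (At @ U).reshape(m, n2 * R)           # row i is vec(A_i^T U)
        L = max(2.0 * np.linalg.norm(N, 2) ** 2, 1e-12)  # gradient Lipschitz constant
        v = V.reshape(-1)
        grad = 2.0 * N.T @ (N @ v - y)
        V = soft_threshold(v - grad / L, beta / L).reshape(n2, R)
    return U, V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n1, n2, R, m = 8, 10, 2, 60
    U_true = rng.standard_normal((n1, R))
    V_true = np.zeros((n2, R))
    V_true[:3] = rng.standard_normal((3, R))      # sparse right factors
    A = rng.standard_normal((m, n1, n2)) / np.sqrt(m)
    y = measure(A, U_true @ V_true.T) + 0.01 * rng.standard_normal(m)
    U0, V0 = rng.standard_normal((n1, R)), rng.standard_normal((n2, R))
    U, V = alternating_sketch(A, y, U0, V0, alpha=1e-3, beta=1e-3)
    print(objective(A, y, U0, V0, 1e-3, 1e-3), objective(A, y, U, V, 1e-3, 1e-3))
```

In this sketch the first half-step minimizes the assumed functional exactly in $U$, and the second is a proximal gradient step with step size $1/L$, so the objective value does not increase from one iteration to the next. The sketch says nothing about the convergence or recovery guarantees stated in the abstract.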
