Sparse and Low-Rank Tensor Estimation via Cubic Sketchings

In this paper, we propose a general framework for sparse and low-rank tensor estimation from cubic sketchings. A two-stage non-convex implementation is developed based on sparse tensor decomposition and thresholded gradient descent, which ensures exact recovery in the noiseless case and stable recovery in the noisy case with high probability. The non-asymptotic analysis sheds light on an interplay between optimization error and statistical error. The proposed procedure is shown to be rate-optimal under certain conditions. As a technical by-product, novel high-order concentration inequalities are derived for studying high-moment sub-Gaussian tensors. An interesting tensor formulation illustrates the potential application to high-order interaction pursuit in high-dimensional linear regression.
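
To make the abstract's two ingredients concrete, the following is a minimal sketch under stated assumptions, not the paper's actual algorithm. It assumes a symmetric CP form T = sum_k eta_k * b_k ⊗ b_k ⊗ b_k, reads a cubic sketching as the inner product of T with x_i ⊗ x_i ⊗ x_i, and uses a plain top-s hard threshold after each gradient step. The function names (`cubic_sketch`, `hard_threshold`, `thresholded_gd_step`), the least-squares loss, and the thresholding rule are all hypothetical choices made for illustration; the initialization stage via sparse tensor decomposition is not shown.

```python
import numpy as np

def cubic_sketch(X, B, eta):
    """Cubic sketchings y_i = <T, x_i (x) x_i (x) x_i>.

    Assumes T = sum_k eta_k * b_k^{(x)3}, so y_i = sum_k eta_k * (x_i^T b_k)^3.
    X: (n, p) sketching vectors, B: (p, K) factors, eta: (K,) weights.
    """
    return ((X @ B) ** 3) @ eta

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v (s >= 1), zero the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def thresholded_gd_step(X, y, B, eta, s, step):
    """One gradient step on (1/2n) * sum_i (y_i - yhat_i)^2, then sparsify.

    Illustrative only: the top-s rule is an assumed stand-in for the
    paper's thresholding scheme.
    """
    n = X.shape[0]
    Z = X @ B                                   # (n, K) projections x_i^T b_k
    resid = (Z ** 3) @ eta - y                  # (n,) fitted minus observed
    # Gradient w.r.t. b_k: (3 * eta_k / n) * sum_i resid_i * z_ik^2 * x_i
    grad = (3.0 / n) * (X.T @ (resid[:, None] * Z ** 2)) * eta[None, :]
    B_new = B - step * grad
    cols = [hard_threshold(B_new[:, k], s) for k in range(B.shape[1])]
    return np.column_stack(cols)
```

In the noiseless case the residual vanishes at the true factors, so an s-sparse truth is a fixed point of this step; that is the property the exact-recovery claim relies on, although the sketch says nothing about convergence from a given initialization.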