Unified Low-Rank Matrix Estimate via Penalized Matrix Least Squares Approximation

Low-rank matrix estimation arises in a number of statistical and machine learning tasks. In particular, in multivariate linear regression and multivariate quantile regression, the coefficient matrix is often assumed to have a low-rank structure. In this paper, we propose a method called penalized matrix least squares approximation (PMLSA) that yields a unified yet simple low-rank matrix estimate. Specifically, PMLSA transforms many different types of low-rank matrix estimation problems into asymptotically equivalent least-squares forms, which can be solved efficiently by the popular matrix fast iterative shrinkage-thresholding algorithm (FISTA). Furthermore, we derive the analytic degrees of freedom of PMLSA, with which a Bayesian information criterion (BIC)-type criterion is developed for selecting the tuning parameters. Under mild conditions, the rank estimated by the BIC-type criterion is shown to be asymptotically consistent for the true rank. Extensive experimental studies confirm these assertions.
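The abstract does not spell out the implementation, so the following is a minimal illustrative sketch rather than the paper's exact PMLSA procedure: it solves a nuclear-norm penalized least-squares problem, min_B 0.5*||Y - XB||_F^2 + lam*||B||_*, with the matrix FISTA the abstract refers to. The function names `svt` and `matrix_fista`, the squared Frobenius loss, and the fixed penalty level `lam` are assumptions made for illustration.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def matrix_fista(X, Y, lam, n_iter=500):
    """Minimize 0.5 * ||Y - X @ B||_F**2 + lam * ||B||_* by matrix FISTA."""
    L = np.linalg.norm(X, 2) ** 2            # Lipschitz constant of the gradient (squared spectral norm)
    B = np.zeros((X.shape[1], Y.shape[1]))
    Z, t = B.copy(), 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ Z - Y)             # gradient of the smooth least-squares part at Z
        B_new = svt(Z - grad / L, lam / L)   # gradient step followed by the SVT proximal step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = B_new + ((t - 1.0) / t_new) * (B_new - B)  # Nesterov momentum extrapolation
        B, t = B_new, t_new
    return B
```

In the method the abstract describes, a least-squares objective of this form would stand in for the original loss (for example, a quantile loss) through an asymptotically equivalent quadratic approximation, and `lam` would be chosen by the derived BIC-type criterion rather than fixed in advance.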
