Trace regression model with simultaneously low rank and row(column) sparse parameter

Abstract: We consider the trace regression model with matrix covariates, where the parameter is a matrix that is simultaneously low rank and row (column) sparse. To estimate the parameter, we formulate a convex optimization problem with nuclear norm and group Lasso penalties, and propose an alternating direction method of multipliers (ADMM) algorithm to solve it. We establish the asymptotic properties of the estimator. Simulation results confirm the effectiveness of the method.
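
To make the two penalties concrete, here is a minimal sketch of their proximal operators, the building blocks an ADMM scheme for a nuclear norm plus group Lasso objective would typically call in its subproblems. This is an illustration under assumptions, not the paper's algorithm: the function names, the use of NumPy, and the choice of rows as the groups are ours, and the abstract does not specify the variable splitting or the update order.

```python
import numpy as np

def prox_nuclear(M, tau):
    """Prox of tau * (nuclear norm) at M: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # small singular values become 0 (lower rank)
    return (U * s_shrunk) @ Vt            # rebuild U diag(s_shrunk) V^T

def prox_row_group_lasso(M, tau):
    """Prox of tau * sum_i ||M[i, :]||_2 at M: shrink each row toward zero."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    # Rows with Euclidean norm <= tau are zeroed entirely (row sparsity);
    # the small floor on the norm only avoids division by zero.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * M

# Hypothetical usage on a small matrix.
B = np.array([[3.0, 4.0], [0.3, 0.4]])
B_sparse = prox_row_group_lasso(B, 1.0)   # second row has norm 0.5 and is zeroed
B_lowrank = prox_nuclear(B, 1.0)          # singular values are each reduced by 1.0
```

For column sparsity, the same row operator can be applied to the transpose. The first operator promotes low rank and the second promotes whole-row zeros, which is the combination of structures the model assumes.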
