STRUCTURED LASSO FOR REGRESSION WITH MATRIX COVARIATES

High-dimensional matrix data are common in modern data analysis. Simply applying the Lasso after vectorizing the observations ignores the row and column information inherent in such data, which makes the variable selection results less useful. In this paper, we propose a new approach that takes advantage of this structural information. The estimate is easy to compute and possesses favorable theoretical properties. Compared with the Lasso, the new estimate can recover the sparse structure in both rows and columns under weaker assumptions. Simulations demonstrate that it outperforms methods that ignore such information, in both variable selection and convergence rate. An application to a medical dataset illustrates the usefulness of the proposal.
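
For concreteness, the vectorized baseline that the abstract contrasts against can be sketched as follows. The notation is introduced here for illustration only and is not taken from the paper: it assumes a scalar response y_i, a matrix covariate X_i of size p x q for i = 1, ..., n, and a tuning parameter lambda > 0.

```latex
% Standard Lasso applied to vectorized matrix covariates (illustrative notation).
% vec(X_i) stacks the columns of the p x q matrix X_i into a vector of length pq.
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{pq}}
    \frac{1}{2n} \sum_{i=1}^{n}
      \bigl( y_i - \mathrm{vec}(X_i)^{\top} \beta \bigr)^{2}
    + \lambda \, \lVert \beta \rVert_{1}
```

In this formulation the penalty treats all pq coordinates of beta interchangeably, so nothing ties together coefficients that belong to the same row or the same column of X_i. This is the structural information that the vectorized approach discards.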
