Parametric and semiparametric reduced-rank regression with flexible sparsity

We consider joint rank and variable selection in multivariate regression. Previously proposed approaches to joint rank and variable selection assume that all responses are related to the same set of predictors, which motivates a group penalty on the rows of the coefficient matrix. This assumption may not hold in practice, however, which motivates the usual lasso (ℓ1) penalty on the entries of the coefficient matrix instead. We propose to solve the resulting optimization problem with a gradient-proximal algorithm, a recent development in optimization, and we establish theoretical results for the proposed estimator under the ℓ1 penalty. We then consider several extensions, including the adaptive lasso penalty, the sparse group penalty, and additive models. The proposed methodology thus offers a considerably more complete set of tools for high-dimensional multivariate regression. Finally, we present numerical illustrations based on simulated and real data sets.
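
As a rough illustration, the following Python sketch solves one plausible form of the underlying optimization problem, minimizing (1/2)||Y − XB||_F² + λ||B||_1 subject to rank(B) ≤ r, by taking a gradient step on the smooth loss and then applying the soft-thresholding proximal step and a rank-r projection incrementally. The objective, the ordering of the two proximal/projection steps, the step-size choice, and the function names (sparse_rrr, soft_threshold, rank_project) are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def soft_threshold(A, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def rank_project(A, r):
    """Project A onto the set of matrices of rank at most r via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def sparse_rrr(X, Y, r, lam, n_iter=500):
    """Sketch of a gradient-proximal scheme (assumed form, not the paper's) for
        min_B 0.5 * ||Y - X B||_F^2 + lam * ||B||_1   s.t.   rank(B) <= r,
    applying the l1 prox and the rank projection incrementally after each
    gradient step."""
    p, q = X.shape[1], Y.shape[1]
    B = np.zeros((p, q))
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ B - Y)                          # gradient of the smooth loss
        B = soft_threshold(B - step * grad, step * lam)   # l1 proximal step
        B = rank_project(B, r)                            # enforce the rank constraint
    return B
```

In practice one would add a convergence check on successive iterates and tune λ and r, e.g., by cross-validation, as is standard for penalized reduced-rank estimators.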
