Simultaneous Variable Selection

We propose a new method for selecting a common subset of explanatory variables where the aim is to model several response variables. The idea is a natural extension of the LASSO technique proposed by Tibshirani (1996) and is based on the (joint) residual sum of squares while constraining the parameter estimates to lie within a suitable polyhedral region. The properties of the resulting convex programming problem are analyzed for the special case of an orthonormal design. For the general case, we develop an efficient interior point algorithm. The method is illustrated on a dataset with infrared spectrometry measurements on 14 qualitatively different but correlated responses using 770 wavelengths. The aim is to select a subset of the wavelengths suitable for use as predictors for as many of the responses as possible.
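As a rough illustration of the orthonormal-design case, the sketch below assumes the polyhedral region bounds the sum, over explanatory variables, of the largest absolute coefficient across responses. The abstract does not spell out the constraint, so this form is an assumption. The sketch also uses the penalized (Lagrangian) form with a hypothetical tuning parameter `lam` rather than the constrained form, and it is not the interior point algorithm mentioned above. With an orthonormal design, the joint residual sum of squares differs from the squared distance between the coefficient matrix and Z = X'Y only by a constant, so the problem separates into one small problem per explanatory variable. Each can be solved by subtracting from that variable's row of Z its projection onto an L1 ball of radius `lam`. All function and variable names are illustrative.

```python
import math


def project_l1_ball(z, radius):
    """Euclidean projection of the vector z onto the L1 ball of the given radius."""
    abs_z = [abs(v) for v in z]
    if sum(abs_z) <= radius:
        return list(z)  # already inside the ball
    # Sort magnitudes in decreasing order and find the soft-threshold level theta.
    u = sorted(abs_z, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - radius) / j
        if u_j - candidate > 0:
            theta = candidate
    return [math.copysign(max(a - theta, 0.0), v) for a, v in zip(abs_z, z)]


def prox_linf_row(z, lam):
    """Minimise 0.5 * ||z - b||^2 + lam * max_k |b_k| over b (Moreau decomposition)."""
    projection = project_l1_ball(z, lam)
    return [v - p for v, p in zip(z, projection)]


def simultaneous_select(Z, lam):
    """Row j of Z holds the least-squares coefficients of variable j for all responses.

    Assumes an orthonormal design, so that Z = X'Y (an assumption of this sketch).
    Returns the shrunken coefficient matrix and the indices of retained variables.
    """
    B = [prox_linf_row(row, lam) for row in Z]
    selected = [j for j, row in enumerate(B) if any(v != 0.0 for v in row)]
    return B, selected


# Three explanatory variables, two responses (made-up numbers for illustration).
Z = [[3.0, 1.0], [0.3, -0.4], [-2.0, 2.5]]
B, selected = simultaneous_select(Z, lam=1.0)
print(B)
print(selected)
```

In this toy input the second variable has small coefficients for both responses, so its whole row is set to zero and it is dropped for every response at once. The retained rows have their largest coefficients pulled in toward a common magnitude. This all-or-nothing behaviour per variable is what makes such a penalty select a common subset.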

[1]  N. Draper,et al.  Applied Regression Analysis , 1966 .

[2]  P. McCullagh,et al.  Generalized Linear Models , 1972, Predictive Analytics.

[3]  Edward Leamer Regression Selection Strategies and Revealed Priors , 1978 .

[4]  F. Clarke Optimization And Nonsmooth Analysis , 1983 .

[5]  M. R. Osborne Finite Algorithms in Optimization and Data Analysis , 1985 .

[6]  F. Santosa,et al.  Linear inversion of band-limited reflection seismograms , 1986 .

[7]  M. R. Osborne,et al.  On Linear Restricted and Interval Least-Squares Problems , 1988 .

[8]  L. Dixon,et al.  Finite Algorithms in Optimization and Data Analysis. , 1988 .

[9]  P. McCullagh,et al.  Generalized Linear Models, 2nd Edn. , 1990 .

[10]  A. Atkinson Subset Selection in Regression , 1992 .

[11]  M. R. Osborne An effective method for computing regression quantiles , 1992 .

[12]  Sanjay Mehrotra,et al.  On the Implementation of a Primal-Dual Interior Point Method , 1992, SIAM J. Optim..

[13]  J. H. Schuenemeyer,et al.  Generalized Linear Models (2nd ed.) , 1992 .

[14]  T. Hastie,et al.  [A Statistical View of Some Chemometrics Regression Tools]: Discussion , 1993 .

[15]  J. Friedman,et al.  A Statistical View of Some Chemometrics Regression Tools , 1993 .

[16]  R. Carroll Measurement, Regression, and Calibration , 1994 .

[17]  L. Breiman Better subset regression using the nonnegative garrote , 1995 .

[18]  L. Gleser Measurement, Regression, and Calibration , 1996 .

[19]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[20]  Trevor Hastie,et al.  Predicting multivariate responses in multiple linear regression - Discussion , 1997 .

[21]  Stephen J. Wright Primal-Dual Interior-Point Methods , 1997, Other Titles in Applied Mathematics.

[22]  J. Friedman,et al.  Predicting Multivariate Responses in Multiple Linear Regression , 1997 .

[23]  C. Braak Discussion to 'Predicting multivariate responses in multiple linear regression' by L. Breiman & J.H. Friedman , 1997 .

[24]  Jean-Philippe Vial,et al.  Theory and algorithms for linear optimization - an interior point approach , 1998, Wiley-Interscience series in discrete mathematics and optimization.

[25]  Yinyu Ye,et al.  Interior point algorithms: theory and analysis , 1997 .

[26]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[27]  T. Fearn,et al.  Multivariate Bayesian variable selection and prediction , 1998 .

[28]  T. Fearn,et al.  The choice of variables in multivariate regression: a non-conjugate Bayesian decision theory approach , 1999 .

[29]  David R. Anderson,et al.  Model Selection and Inference: A Practical Information-Theoretic Approach , 2001 .

[30]  Sergey Bakin,et al.  Adaptive regression and model selection in data mining problems , 1999 .

[31]  Jos F. Sturm,et al.  A Matlab toolbox for optimization over symmetric cones , 1999 .

[32]  Wenjiang J. Fu,et al.  Asymptotics for lasso-type estimators , 2000 .

[33]  M. R. Osborne,et al.  On the LASSO and its Dual , 2000 .

[34]  David R. Anderson,et al.  Model selection and inference : a practical information-theoretic approach , 2000 .

[35]  M. R. Osborne,et al.  A new approach to variable selection in least squares problems , 2000 .

[36]  T. Fearn,et al.  Bayesian Wavelet Regression on Curves With Application to a Spectroscopic Calibration Problem , 2001 .

[37]  Jianqing Fan,et al.  Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties , 2001 .

[38]  T. Fearn,et al.  Bayes model averaging with selection of regressors , 2002 .

[39]  David R. Anderson,et al.  Model selection and multimodel inference : a practical information-theoretic approach , 2003 .

[40]  R. R. Hocking Methods and Applications of Linear Models: Regression and the Analysis of Variance , 2003 .

[41]  F. Huang Prediction Error Property of the Lasso Estimator and its Generalization , 2003 .

[42]  R. Tibshirani,et al.  Least angle regression , 2004, math/0406456.

[43]  D. Madigan,et al.  [Least Angle Regression]: Discussion , 2004 .

[44]  B. Turlach Discussion of "Least Angle Regression" by Efron, Hastie, Johnstone and Tibshirani , 2004 .

[45]  S. Rosset,et al.  Corrected proof of the result of ‘A prediction error property of the Lasso estimator and its generalization’ by Huang (2003) , 2004 .

[46]  R. Tibshirani,et al.  Sparsity and smoothness via the fused lasso , 2005 .

[47]  H. Zou,et al.  Regularization and variable selection via the elastic net , 2005 .

[48]  Berwin A. Turlach,et al.  On algorithms for solving least squares problems under an L1 penalty or an L1 constraint , 2005 .

[49]  H. Zou The Adaptive Lasso and Its Oracle Properties , 2006 .

[50]  M. Forina,et al.  Multivariate calibration. , 2007, Journal of chromatography. A.