Efficient Global Approximation of Generalized Nonlinear ℓ1-Regularized Solution Paths and Its Applications

We consider the efficient construction of nonlinear solution paths for general ℓ1-regularization. Unlike existing methods, which build the solution path incrementally through a combination of local linear approximation and recalibration, we propose an efficient global approximation to the whole solution path. With the loss function approximated by a quadratic spline, we show that the solution path can be computed using a generalized Lars algorithm. The proposed methodology avoids high-dimensional numerical optimization and thus provides faster and more stable computation. It also extends easily to more general regularization frameworks. We illustrate this flexibility with several examples, including a generalization of the elastic net and a new method that effectively exploits the so-called "support vectors" in kernel logistic regression.
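The approach hinges on the quadratic-spline surrogate tracking the original loss closely: once the loss is piecewise quadratic, the ℓ1-regularized solution path becomes piecewise linear and can be traced without repeated high-dimensional optimization. A minimal numerical sketch, assuming a hypothetical uniform knot grid and a simple midpoint-interpolation scheme (not necessarily the paper's actual spline construction), checks how accurately such a piecewise-quadratic surrogate can approximate the logistic loss:

```python
import numpy as np

def logistic_loss(u):
    """Logistic loss as a function of the margin u = y * f(x)."""
    return np.log1p(np.exp(-u))

# Hypothetical knot grid for illustration: 65 equally spaced knots on [-8, 8].
knots = np.linspace(-8.0, 8.0, 65)

def spline_eval(u):
    """Piecewise-quadratic interpolant of the logistic loss.

    On each knot interval [a, b] we use the quadratic that interpolates
    the loss at a, the midpoint m, and b (Lagrange form).
    """
    u = np.clip(u, knots[0], knots[-1])
    idx = np.minimum(np.searchsorted(knots, u, side="right") - 1,
                     len(knots) - 2)
    a, b = knots[idx], knots[idx + 1]
    m = 0.5 * (a + b)
    fa, fm, fb = logistic_loss(a), logistic_loss(m), logistic_loss(b)
    la = (u - m) * (u - b) / ((a - m) * (a - b))
    lm = (u - a) * (u - b) / ((m - a) * (m - b))
    lb = (u - a) * (u - m) / ((b - a) * (b - m))
    return fa * la + fm * lm + fb * lb

# Measure the worst-case approximation error on a fine grid.
grid = np.linspace(-8.0, 8.0, 10001)
err = np.max(np.abs(spline_eval(grid) - logistic_loss(grid)))
print(f"max |spline - loss| on [-8, 8]: {err:.2e}")
```

With knot spacing h = 0.25, the standard interpolation bound (error of order max|ℓ'''| · h³, and |ℓ'''| ≤ 0.1 for the logistic loss) keeps the worst-case gap below 1e-4, which is why a modest number of knots suffices before the piecewise-linear path machinery takes over.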
