Boosting ridge regression

There are several possible approaches to combining ridge regression with boosting techniques. In the simple, or naive, approach the ridge estimator is fitted iteratively to the current residuals, yielding an alternative to the usual ridge estimator. In partial boosting, only part of the regression parameters are reestimated within each step of the iterative procedure. This technique makes it possible to distinguish between mandatory variables, which are always included in the analysis, and optional variables, which are chosen only if relevant. The resulting procedure selects optional variables in a manner similar to the Lasso, yielding a reduced set of influential variables while allowing regularized estimation of the mandatory parameters. The suggested procedures are investigated within the classical framework of continuous response variables as well as for generalized linear models. Performance in terms of prediction and the identification of relevant variables is compared with several competitors, such as the Lasso and the more recently proposed elastic net. For evaluating the identification of relevant variables, pseudo ROC curves are introduced.
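The two procedures described above lend themselves to a compact sketch. The Python code below is a minimal illustration under simplifying assumptions, not the authors' implementation: `naive_ridge_boost` refits the full ridge estimator to the running residuals and accumulates the updates, while `partial_ridge_boost` refits a set of mandatory columns at every step and selects the single optional column that most reduces the residual sum of squares, giving the Lasso-like selection behavior the abstract describes. The function names, the penalty `lam`, and the fixed step count `n_steps` are hypothetical choices for illustration; predictors and response are assumed centered so no intercept is fitted.

```python
import numpy as np

def naive_ridge_boost(X, y, lam=1.0, n_steps=50):
    """Naive ridge boosting: at each step the full ridge estimator is
    refit to the current residuals and the updates are accumulated."""
    n, p = X.shape
    # (X'X + lam I)^{-1} X' is fixed across steps, so precompute it.
    ridge_solve = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    beta = np.zeros(p)
    residual = y.astype(float)
    for _ in range(n_steps):
        update = ridge_solve @ residual   # ridge fit to current residuals
        beta += update
        residual -= X @ update
    return beta

def partial_ridge_boost(X, y, mandatory, lam=1.0, n_steps=50):
    """Partial ridge boosting (illustrative): mandatory columns are refit
    at every step, and only the one optional column whose inclusion most
    reduces the residual sum of squares is updated alongside them."""
    n, p = X.shape
    optional = [j for j in range(p) if j not in mandatory]
    beta = np.zeros(p)
    residual = y.astype(float)
    for _ in range(n_steps):
        best = None  # (rss, columns, update) of the best candidate
        for j in optional:
            cols = list(mandatory) + [j]  # candidate: mandatory + one optional
            Xs = X[:, cols]
            upd = np.linalg.solve(Xs.T @ Xs + lam * np.eye(len(cols)),
                                  Xs.T @ residual)
            rss = np.sum((residual - Xs @ upd) ** 2)
            if best is None or rss < best[0]:
                best = (rss, cols, upd)
        _, cols, upd = best
        beta[cols] += upd
        residual -= X[:, cols] @ upd
    return beta
```

Called as, e.g., `partial_ridge_boost(X, y, mandatory=[0, 1])`. In both variants early stopping of the boosting iterations acts as the effective tuning parameter: it controls the amount of shrinkage and, in the partial variant, how many optional variables enter the model; in practice `n_steps` would be chosen by cross-validation or an information criterion rather than fixed in advance.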
