Stochastic Stepwise Ensembles for Variable Selection

In this article, we advocate the ensemble approach for variable selection. We point out that the stochastic mechanism used to generate the variable-selection ensemble (VSE) must be chosen with care. We construct a VSE using a stochastic stepwise algorithm and compare its performance with that of numerous state-of-the-art algorithms. Supplemental materials for the article are available online.
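
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions, and not the article's actual procedure. It assumes a linear model scored by AIC, a randomized forward/backward search that tries a few random groups of variables at each step, and an ensemble that averages the 0/1 inclusion vectors of many such searches. The function names (`aic`, `stochastic_stepwise`, `variable_selection_ensemble`), the parameters `n_tries`, `max_sweeps`, and `n_members`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def aic(X, y, subset):
    """AIC (up to an additive constant) of a least-squares fit on `subset`."""
    n = len(y)
    # Design matrix: intercept plus the selected predictors (assumes n > p + 1).
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in sorted(subset)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def stochastic_stepwise(X, y, rng, n_tries=10, max_sweeps=50):
    """One randomized forward/backward search; returns a 0/1 inclusion vector.

    Assumed mechanism (illustrative only): at each step, draw `n_tries`
    random groups of random size, evaluate each by AIC, and accept the
    best group only if it improves on the current model.
    """
    p = X.shape[1]
    current = set()
    best = aic(X, y, current)
    for _ in range(max_sweeps):
        improved = False
        for direction in ("forward", "backward"):
            if direction == "forward":
                pool = [j for j in range(p) if j not in current]
            else:
                pool = sorted(current)
            if not pool:
                continue
            candidates = []
            for _ in range(n_tries):
                k = int(rng.integers(1, len(pool) + 1))
                group = set(rng.choice(pool, size=k, replace=False).tolist())
                if direction == "forward":
                    trial = current | group
                else:
                    trial = current - group
                candidates.append((aic(X, y, trial), trial))
            score, trial = min(candidates, key=lambda c: c[0])
            if score < best:
                best, current, improved = score, trial, True
        if not improved:
            break  # neither direction found an improving move
    included = np.zeros(p, dtype=int)
    for j in current:
        included[j] = 1
    return included


def variable_selection_ensemble(X, y, n_members=30, seed=0):
    """Average the inclusion vectors of `n_members` stochastic searches."""
    rng = np.random.default_rng(seed)
    votes = np.array([stochastic_stepwise(X, y, rng) for _ in range(n_members)])
    return votes.mean(axis=0)  # per-variable inclusion frequency in [0, 1]


# Simulated data (hypothetical): 10 predictors, only the first 3 matter.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + 3.0 * X[:, 2] + data_rng.normal(size=100)

freq = variable_selection_ensemble(X, y)
print(np.round(freq, 2))
```

Variables can then be ranked by their inclusion frequency, or selected when the frequency exceeds a chosen threshold. How the randomness is injected (here, random group sizes and a small number of tries per step) is exactly the kind of design choice the abstract says must be made with care, and other choices would give a different ensemble.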
