Fast Variable Selection by Block Addition and Block Deletion

We propose a threshold updating method for terminating variable selection, together with two variable selection methods. In the threshold updating method, the threshold value is updated whenever an approximation error smaller than the current threshold is obtained. The first variable selection method combines forward selection by block addition with backward selection by block deletion: starting from the empty set of input variables, several input variables are added at a time until the approximation error falls below the threshold value, and deletable variables are then searched for by block deletion. The second method combines the first method with variable selection by linear programming support vector regressors (LPSVRs): by training an LPSVR with linear kernels, we examine the weights of the decision function, delete the input variables whose associated weights are zero, and then carry out block addition and block deletion. Computer experiments on benchmark data sets show that the proposed methods perform variable selection faster than the method using block deletion alone, and that the threshold updating method yields a lower approximation error than the fixed threshold method. We also compare our method with an embedded method, which determines the optimal variables during training, and show that our method gives comparable or better variable selection performance.
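
To make the procedure concrete, the following is a minimal sketch of block addition and block deletion with threshold updating. It is not the paper's exact formulation: it assumes an RBF-kernel SVR with five-fold cross-validated mean squared error as the approximation error, the error obtained with all variables as the initial threshold, a simple add-one-variable ranking criterion, and a fixed block size. The helper names `approximation_error` and `block_addition_deletion` are hypothetical.

```python
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score


def approximation_error(X, y, variables):
    """Cross-validated MSE of an SVR trained on the selected variables (illustrative criterion)."""
    scores = cross_val_score(SVR(kernel="rbf"), X[:, variables], y,
                             scoring="neg_mean_squared_error", cv=5)
    return -scores.mean()


def block_addition_deletion(X, y, block_size=2):
    n_vars = X.shape[1]
    # The error with all variables serves as the initial threshold; the
    # threshold updating method lowers it whenever a smaller error appears.
    threshold = approximation_error(X, y, list(range(n_vars)))

    selected, remaining = [], list(range(n_vars))

    # Forward selection by block addition: add the block_size best-ranked
    # variables at a time until the error is at or below the threshold.
    while remaining:
        ranked = sorted(remaining,
                        key=lambda v: approximation_error(X, y, selected + [v]))
        block = ranked[:block_size]
        selected += block
        remaining = [v for v in remaining if v not in block]
        err = approximation_error(X, y, selected)
        if err <= threshold:
            threshold = min(threshold, err)  # threshold updating
            break

    # Backward selection by block deletion: variables whose individual removal
    # keeps the error within the threshold are candidates, deleted block-wise.
    candidates = [v for v in selected
                  if len(selected) > 1 and
                  approximation_error(X, y, [u for u in selected if u != v])
                  <= threshold]
    while candidates:
        block, candidates = candidates[:block_size], candidates[block_size:]
        trial = [v for v in selected if v not in block]
        if not trial:
            continue
        err = approximation_error(X, y, trial)
        if err <= threshold:
            selected = trial
            threshold = min(threshold, err)  # threshold updating

    return selected, threshold


if __name__ == "__main__":
    # Usage on a synthetic regression problem with a few informative variables.
    from sklearn.datasets import make_regression
    X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                           noise=0.1, random_state=0)
    print(block_addition_deletion(X, y, block_size=2))
```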
