Boosting Feature Selection

It is possible to reduce the error rate of a single classifier by using a classifier ensemble. However, any gain in performance is often offset by the extra computation of performing classification several times. Here the AdaboostFS algorithm is proposed, which builds on two popular areas of ensemble research: Adaboost and Ensemble Feature Selection (EFS). The aim of AdaboostFS is to reduce the number of features used by each base classifier and hence the overall computation required by the ensemble. To do this, the algorithm combines a regularised version of Boosting, AdaboostReg [12], with a floating feature search for each base classifier. AdaboostFS is compared on four benchmark data sets with AdaboostAll, which uses all features, and with AdaboostRSM, which uses a random selection of features. Performance is assessed by error rate, ensemble error and diversity, and the total number of features used for classification. Results show that AdaboostFS achieves a lower error rate and higher diversity than AdaboostAll, and a lower error rate and comparable diversity to AdaboostRSM. More importantly, compared with the other methods AdaboostFS produces a significant reduction in the number of features required for classification, both in each base classifier and across the entire ensemble.
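The core idea, an AdaBoost-style reweighting loop in which each base classifier is trained on its own feature subset chosen under the current sample weights, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the paper's exact AdaboostFS procedure: it uses decision stumps as base classifiers, substitutes a plain greedy forward search for the floating search of [9], and omits the soft-margin regularisation of AdaboostReg [12]. The names adaboost_fs, forward_select and predict are illustrative, not from the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier


def weighted_error(clf, X, y, w):
    """Weighted 0/1 error of a fitted classifier."""
    return np.sum(w * (clf.predict(X) != y)) / np.sum(w)


def forward_select(X, y, w, max_features):
    """Greedy forward feature search (a simplified stand-in for the
    floating search [9]): add one feature at a time, keeping the subset
    with the lowest weighted training error of a decision stump."""
    selected, best_err = [], np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        errs = []
        for f in remaining:
            clf = DecisionTreeClassifier(max_depth=1)
            clf.fit(X[:, selected + [f]], y, sample_weight=w)
            errs.append(weighted_error(clf, X[:, selected + [f]], y, w))
        best = int(np.argmin(errs))
        if errs[best] >= best_err:          # no improvement: stop early
            break
        best_err = errs[best]
        selected.append(remaining.pop(best))
    return selected


def adaboost_fs(X, y, n_rounds=10, max_features=3):
    """AdaBoost-style ensemble in which each base learner is trained on
    its own feature subset, selected under the current sample weights.
    Labels y are assumed to be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # uniform initial distribution
    ensemble = []                            # (alpha, feature subset, classifier)
    for _ in range(n_rounds):
        feats = forward_select(X, y, w, max_features)
        clf = DecisionTreeClassifier(max_depth=1)
        clf.fit(X[:, feats], y, sample_weight=w)
        err = weighted_error(clf, X[:, feats], y, w)
        if err >= 0.5:                       # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * clf.predict(X[:, feats]))
        w /= w.sum()                         # re-normalise the distribution
        ensemble.append((alpha, feats, clf))
    return ensemble


def predict(ensemble, X):
    """Weighted vote of the base classifiers, each on its own features."""
    score = np.zeros(len(X))
    for alpha, feats, clf in ensemble:
        score += alpha * clf.predict(X[:, feats])
    return np.sign(score)

In this sketch each base classifier stores only its own feature subset, so the features consulted at prediction time are the union of the per-round subsets rather than the full feature set; that reduction is the quantity the paper measures across the ensemble.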

[1] Darrell Whitley, et al. Feature Selection Mechanisms for Ensemble Creation: A Genetic Search Perspective, 2003.

[2] J. Fleiss. Statistical Methods for Rates and Proportions, 1974.

[3] Paul A. Viola, et al. Boosting Image Retrieval, 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2000).

[4] Yoram Singer, et al. Improved Boosting Algorithms Using Confidence-rated Predictions, 1998, COLT '98.

[5] Mineichi Kudo, et al. Comparison of algorithms that select features for pattern classifiers, 2000, Pattern Recognit.

[6] Yoav Freund, et al. Boosting the margin: A new explanation for the effectiveness of voting methods, 1997, ICML.

[7] Enric Plaza, et al. Machine Learning: ECML 2000, 2003, Lecture Notes in Computer Science.

[8] Francis K. H. Quek, et al. Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets, 2003, Pattern Recognit.

[9] Josef Kittler, et al. Floating search methods in feature selection, 1994, Pattern Recognit. Lett.

[10] Mykola Pechenizkiy, et al. Diversity in search strategies for ensemble feature selection, 2005, Inf. Fusion.

[11] Yoav Freund, et al. Experiments with a New Boosting Algorithm, 1996, ICML.

[12] Gunnar Rätsch, et al. Soft Margins for AdaBoost, 2001, Machine Learning.

[13] Xin Yao, et al. Diversity creation methods: a survey and categorisation, 2004, Inf. Fusion.

[14] Padraig Cunningham, et al. Diversity versus Quality in Classification Ensembles Based on Feature Selection, 2000, ECML.

[15] Catherine Blake, et al. UCI Repository of machine learning databases, 1998.

[16] Tin Kam Ho, et al. The Random Subspace Method for Constructing Decision Forests, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[17] Horst Bunke, et al. Feature selection algorithms for the generation of multiple classifier systems and their application to handwritten word recognition, 2004.

[18] B. Everitt, et al. Statistical Methods for Rates and Proportions, 1973.

[19] J. Ross Quinlan, et al. Bagging, Boosting, and C4.5, 1996, AAAI/IAAI, Vol. 1.