Comparison of Adaboost and ADTboost for Feature Subset Selection

This paper addresses the problem of feature selection within classification processes. We compare feature subset selection based on two boosting methods, Adaboost and ADTboost. Our evaluation focuses on three criteria: the classification error, and the efficiency of the process as it depends on the number of most appropriate features and on the number of training samples. To this end, we discuss both techniques and sketch how they work, restricting both boosting approaches to linear weak classifiers. We propose a feature subset selection method, which we evaluate on synthetic and on benchmark data sets.
