Experimental study for the comparison of classifier combination methods

In this paper, we compare the performance of four classifier combination methods (bagging, a modified random subspace method, classifier selection, and parametric fusion) with that of logistic regression under various characteristics of the input data. Four factors are used to simulate the logistic model: (a) the combination function among input variables, (b) the correlation between input variables, (c) the variance of observations, and (d) the training data set size. Because the combination function among input variables is typically unknown in practice, we treat it as an uncontrollable (noise) factor in a Taguchi design, which improves the practical relevance of our results. Our experimental results indicate the following. When the training set is large, the performances of logistic regression and bagging do not differ significantly; however, when the training set is small, logistic regression performs worse than bagging. When the training set is small and the correlation is strong, both the modified random subspace method and bagging outperform the other three methods. When the correlation is weak and the variance is small, parametric fusion and the classifier selection algorithm are, disappointingly, the worst performers.
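As a rough illustration of the kind of comparison described above, the following NumPy sketch simulates data from a logistic model with correlated inputs, then contrasts a single logistic regression fit against a bagged ensemble of logistic fits on a small training set. This is only a minimal sketch under assumed settings (equicorrelated Gaussian inputs, a hypothetical coefficient vector, gradient-descent fitting), not the paper's exact simulation protocol or factor levels.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n, beta, rho, rng):
    # Equicorrelated Gaussian inputs with pairwise correlation rho,
    # labels drawn from the logistic model P(y=1|x) = sigmoid(x' beta).
    p = len(beta)
    cov = np.full((p, p), rho)
    np.fill_diagonal(cov, 1.0)
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    y = rng.binomial(1, sigmoid(X @ beta))
    return X, y

def fit_logistic(X, y, lr=0.1, iters=500):
    # Plain gradient descent on the logistic log-likelihood (no intercept).
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def bagged_predict(X_tr, y_tr, X_te, B, rng):
    # Bagging: fit B classifiers on bootstrap resamples, average their
    # predicted probabilities, then threshold at 0.5.
    probs = np.zeros(len(X_te))
    for _ in range(B):
        idx = rng.integers(0, len(y_tr), len(y_tr))  # bootstrap sample
        w = fit_logistic(X_tr[idx], y_tr[idx])
        probs += sigmoid(X_te @ w)
    return (probs / B >= 0.5).astype(int)

rng = np.random.default_rng(0)
beta = np.array([2.0, -1.5, 1.0, 0.0, 0.0])      # hypothetical true coefficients
X_tr, y_tr = simulate(30, beta, 0.6, rng)        # small training set, strong correlation
X_te, y_te = simulate(1000, beta, 0.6, rng)      # large test set for accuracy estimates

w = fit_logistic(X_tr, y_tr)
acc_lr = float(np.mean((sigmoid(X_te @ w) >= 0.5) == y_te))
acc_bag = float(np.mean(bagged_predict(X_tr, y_tr, X_te, 25, rng) == y_te))
print(acc_lr, acc_bag)
```

In the full study, such accuracy comparisons would be repeated over the designed factor combinations (correlation, variance, training size, combination function) rather than a single setting as here.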

[1] Y. N. Sun, et al. A self-learning segmentation framework: the Taguchi approach, Computerized Medical Imaging and Graphics, 2000.

[2] Linda M. Haines, et al. Optimal Design for Neural Networks, 1998.

[3] Fabio Roli, et al. Dynamic classifier selection based on multiple classifier behaviour, Pattern Recognition, 2001.

[4] Baozong Yuan, et al. Multiple classifiers combination by clustering and selection, Information Fusion, 2001.

[5] Kevin W. Bowyer, et al. Combination of Multiple Classifiers Using Local Accuracy Estimates, IEEE Trans. Pattern Anal. Mach. Intell., 1997.

[6] Robert P. W. Duin, et al. Bagging for linear classifiers, Pattern Recognition, 1998.

[7] Leo Breiman, et al. Bagging Predictors, Machine Learning, 1996.

[8] Robert P. W. Duin, et al. Bagging, Boosting and the Random Subspace Method for Linear Classifiers, Pattern Analysis & Applications, 2002.

[9] B. S. Lim, et al. Optimal design of neural networks using the Taguchi method, Neurocomputing, 1995.

[10] So Young Sohn, et al. Data fusion, ensemble and clustering to improve the classification accuracy for the severity of road traffic accidents in Korea, 2003.

[12] Ludmila I. Kuncheva, et al. Clustering-and-selection model for classifier combination, KES 2000, Fourth International Conference on Knowledge-Based Intelligent Engineering Systems and Allied Technologies, 2000.

[13] Eric Bauer, et al. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants, Machine Learning, 1999.

[14] So Young Sohn, et al. Meta Analysis of Classification Algorithms for Pattern Recognition, IEEE Trans. Pattern Anal. Mach. Intell., 1999.

[15] Majid Ahmadi, et al. Recognition of handwritten numerals with multiple feature and multistage classifier, Pattern Recognition, 1995.

[16] W. Shannon, et al. Combining classification trees using MLE, Statistics in Medicine, 1999.

[17] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization, Machine Learning, 2000.

[18] Geoffrey I. Webb, et al. MultiBoosting: A Technique for Combining Boosting and Wagging, Machine Learning, 2000.

[19] Jiri Matas, et al. On Combining Classifiers, IEEE Trans. Pattern Anal. Mach. Intell., 1998.

[20] K. L. McFadden, et al. Predicting pilot-error incidents of US airline pilots using logistic regression, Applied Ergonomics, 1997.

[21] Anil K. Jain, et al. Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners, IEEE Trans. Pattern Anal. Mach. Intell., 1991.

[22] L. Breiman. Arcing Classifiers, 1998.

[23] Tin Kam Ho, et al. The Random Subspace Method for Constructing Decision Forests, IEEE Trans. Pattern Anal. Mach. Intell., 1998.

[24] Leo Breiman, et al. Bias, Variance, and Arcing Classifiers, 1996.