Combining Different Methods and Numbers of Weak Decision Trees

Abstract: Several ways of manipulating a training set have shown that combining weakened classifiers can improve prediction accuracy. In this paper, we focus on learning-set sampling (Breiman’s Bagging) and random feature-subset selection (Ho’s Random Subspaces). We present a combination scheme labelled ‘Bagfs’, in which new learning sets are generated from both bootstrap replicates and random subspaces. The performance of the three methods (Bagging, Random Subspaces, and Bagfs) is compared with that of the standard AdaBoost algorithm. All four methods are assessed by means of a decision-tree inducer (C4.5). In addition, we study whether the number of weak classifiers and the way in which they are created have a significant influence on the performance of their combination. To answer these two questions, we applied the McNemar test of significance and the Kappa degree-of-agreement measure. The results, obtained on 23 benchmark databases, show that, on average, Bagfs exhibits the best agreement between prediction and supervision.
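For concreteness, the sketch below illustrates the Bagfs idea the abstract describes: each base tree is trained on a bootstrap replicate (Bagging) restricted to a randomly drawn feature subspace (Random Subspaces), and the trees are combined by majority vote. This is a minimal Python sketch, not the authors' implementation: scikit-learn's CART tree stands in for C4.5, and the ensemble size, subspace fraction, and unweighted vote are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_bagfs(X, y, n_trees=50, subspace_frac=0.5, seed=0):
    """Train trees, each on a bootstrap replicate of a random feature subspace."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    k = max(1, int(subspace_frac * n_features))  # subspace size (assumed fraction)
    ensemble = []
    for _ in range(n_trees):
        rows = rng.integers(0, n_samples, size=n_samples)     # Bagging: bootstrap replicate
        cols = rng.choice(n_features, size=k, replace=False)  # Random Subspaces: feature subset
        tree = DecisionTreeClassifier().fit(X[np.ix_(rows, cols)], y[rows])
        ensemble.append((tree, cols))  # remember which features each tree saw
    return ensemble

def predict_bagfs(ensemble, X):
    """Combine the trees by unweighted majority vote (assumes integer class labels)."""
    votes = np.stack([tree.predict(X[:, cols]) for tree, cols in ensemble]).astype(int)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```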
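The two statistical tools named in the abstract can likewise be sketched. Below are the standard textbook forms of the McNemar test (with continuity correction, as in Dietterich [14]) and Cohen's Kappa degree-of-agreement; function and variable names are ours, and this is an illustrative sketch rather than the paper's code. Kappa can be computed between two classifiers (to measure diversity) or between predictions and the true labels (agreement with supervision).

```python
import numpy as np

def mcnemar_statistic(y_true, pred_a, pred_b):
    """McNemar chi-square statistic for two classifiers on the same test set."""
    n01 = int(np.sum((pred_a != y_true) & (pred_b == y_true)))  # A wrong, B right
    n10 = int(np.sum((pred_a == y_true) & (pred_b != y_true)))  # A right, B wrong
    if n01 + n10 == 0:
        return 0.0  # the classifiers never disagree
    # Under the null hypothesis of equal error rates this is approximately
    # chi-square with 1 df; values above 3.84 are significant at the 0.05 level.
    return (abs(n01 - n10) - 1) ** 2 / (n01 + n10)

def kappa(labels_a, labels_b, n_classes):
    """Cohen's Kappa: agreement beyond chance between two integer label vectors."""
    table = np.zeros((n_classes, n_classes))
    for a, b in zip(labels_a, labels_b):
        table[a, b] += 1
    table /= table.sum()
    p_observed = np.trace(table)                               # observed agreement
    p_chance = float(table.sum(axis=0) @ table.sum(axis=1))    # agreement expected by chance
    return (p_observed - p_chance) / (1 - p_chance)
```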

[1]  Thomas G. Dietterich,et al.  Applying the Weak Learning Framework to Understand and Improve C4.5 , 1996, ICML.

[2]  Adam Krzyżak,et al.  Methods of combining multiple classifiers and their applications to handwriting recognition , 1992, IEEE Trans. Syst. Man Cybern..

[3]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[4]  Zijian Zheng,et al.  Generating Classifier Committees by Stochastically Selecting both Attributes and Training Examples , 1998, PRICAI.

[5]  Wenxin Jiang,et al.  Some Theoretical Aspects of Boosting in the Presence of Noisy Data , 2001, ICML.

[6]  Ching Y. Suen,et al.  Application of majority voting to pattern recognition: an analysis of its behavior and performance , 1997, IEEE Trans. Syst. Man Cybern. Part A.

[7]  L. Breiman Random Forests--Random Features , 1999 .

[8]  Michael Perrone,et al.  Putting It All Together: Methods for Combining Neural Networks , 1993, NIPS.

[9]  Fabio Roli,et al.  Dynamic classifier selection based on multiple classifier behaviour , 2001, Pattern Recognit..

[10]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, J. Comput. Syst. Sci..

[11]  G. H. Rosenfield,et al.  A coefficient of agreement as a measure of thematic classification accuracy , 1986 .

[12]  Chuanyi Ji,et al.  Combinations of Weak Classifiers , 1996, NIPS.

[13]  Robert P. W. Duin,et al.  Experiments with Classifier Combining Rules , 2000, Multiple Classifier Systems.

[14]  Thomas G. Dietterich Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms , 1998, Neural Computation.

[15]  J. Ross Quinlan,et al.  Bagging, Boosting, and C4.5 , 1996, AAAI/IAAI, Vol. 1.

[16]  Kagan Tumer,et al.  Classifier Combining: Analytical Results and Implications , 1995 .

[17]  Alberto Maria Segre,et al.  Programs for Machine Learning , 1994 .

[18]  Tin Kam Ho,et al.  The Random Subspace Method for Constructing Decision Forests , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[19]  Sargur N. Srihari,et al.  Decision Combination in Multiple Classifier Systems , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[20]  Stephen D. Bay Nearest neighbor classification from multiple feature subsets , 1999, Intell. Data Anal..

[21]  Thomas G. Dietterich An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization , 2000, Machine Learning.

[22]  Ron Kohavi,et al.  Option Decision Trees with Majority Votes , 1997, ICML.

[23]  Roberto Battiti,et al.  Democracy in neural nets: Voting schemes for classification , 1994, Neural Networks.

[24]  Leo Breiman,et al.  Bagging Predictors , 1996, Machine Learning.

[25]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[26]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[27]  Bernard R. Rosner,et al.  Fundamentals of Biostatistics , 1992 .

[28]  Steven Salzberg,et al.  On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach , 1997, Data Mining and Knowledge Discovery.

[29]  Louisa Lam,et al.  Classifier Combinations: Implementations and Theoretical Issues , 2000, Multiple Classifier Systems.

[30]  S. Siegel,et al.  Nonparametric Statistics for the Behavioral Sciences , 1988 .

[31]  Tin Kam Ho Data Complexity Analysis for Classifier Combination , 2001, Multiple Classifier Systems.

[32]  Ching Y. Suen,et al.  A Method of Combining Multiple Experts for the Recognition of Unconstrained Handwritten Numerals , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[33]  A. Berger Fundamentals of Biostatistics , 1969 .

[34]  Michael J. Pazzani,et al.  Error reduction through learning multiple descriptions , 1996, Machine Learning.