Switching class labels to generate classification ensembles

Ensembles that combine the decisions of classifiers trained on perturbed versions of the training set, in which the class labels of the training examples are randomly switched, can produce significant error reductions, provided that a large number of units and a high class-switching rate are used. The classifiers generated by this procedure have statistically uncorrelated errors on the training set. Hence, the ensembles they form exhibit a similar dependence of the training error on ensemble size, independently of the classification problem. In particular, for binary classification problems, the classification performance of the ensemble on the training data can be analysed in terms of a Bernoulli process. Experiments on several UCI datasets demonstrate the improvements in classification accuracy that can be obtained with these class-switching ensembles.
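A rough sketch of how such a class-switching ensemble might be built is given below (a minimal illustration assuming scikit-learn decision trees as base learners; names such as switch_labels, fit_class_switching_ensemble, n_classifiers and switch_rate are illustrative and not taken from the paper's implementation). The Bernoulli-process view mentioned above amounts to this: if every unit misclassifies a given training example independently with probability p, a majority vote of T units misclassifies it with probability sum over k > T/2 of C(T, k) p^k (1 - p)^(T - k), which depends only on p and T and not on the particular problem.

```python
# Minimal sketch of a class-switching ensemble, assuming scikit-learn decision
# trees as base learners.  Names such as switch_labels, n_classifiers and
# switch_rate are illustrative, not taken from the paper's code.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def switch_labels(y, rate, classes, rng):
    """Copy y, replacing each label with probability `rate` by a different
    class chosen uniformly at random."""
    y_switched = y.copy()
    for i in np.where(rng.random(len(y)) < rate)[0]:
        others = classes[classes != y[i]]
        y_switched[i] = rng.choice(others)
    return y_switched

def fit_class_switching_ensemble(X, y, n_classifiers=200, switch_rate=0.3, seed=0):
    """Train n_classifiers unpruned trees, each on a label-switched copy of y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    classes = np.unique(y)
    ensemble = []
    for _ in range(n_classifiers):
        y_sw = switch_labels(y, switch_rate, classes, rng)
        # Unpruned tree, so each unit fits its perturbed labels closely.
        tree = DecisionTreeClassifier(random_state=int(rng.integers(1 << 31)))
        ensemble.append(tree.fit(X, y_sw))
    return ensemble, classes

def predict_majority(ensemble, classes, X):
    """Combine the units by unweighted majority vote."""
    votes = np.stack([clf.predict(X) for clf in ensemble])          # (T, n_samples)
    counts = np.stack([(votes == c).sum(axis=0) for c in classes])  # (n_classes, n_samples)
    return classes[counts.argmax(axis=0)]
```

Comparing the test error of such an ensemble against a single unpruned tree on a held-out split is one way to check for the kind of accuracy gains reported on the UCI benchmarks.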
