Repechage Bootstrap Aggregating for Misclassification Cost Reduction

This paper examines the use of bootstrap aggregating (bagging) with classifier learning methods that rely on hold-out pruning (or growing) to reduce misclassification cost. Both decision tree and rule set classifiers are used. The paper introduces a “repechage” variation of bagging that uses, as the hold-out data for cost reduction, the “out-of-bag” items, which would go unused in standard bagging. It presents experimental evidence that, when combined with a hold-out cost reduction method, the repechage method can achieve better misclassification cost results than standard bagging combined with the same method. Superior results for the repechage method are shown on some problems with previously defined cost matrices, for one cost reduction decision tree method and two cost reduction rule set methods.
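The repechage idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `bootstrap_with_oob` and `repechage_bagging`, and the `grow` and `prune` callables, are hypothetical placeholders for whichever hold-out cost reduction learner is used. The sketch assumes each bootstrap sample has the same size as the training set, as is usual in bagging, and it leaves out how the resulting classifiers are combined.

```python
import random
from typing import Any, Callable, List, Sequence, Tuple


def bootstrap_with_oob(n: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    # Out-of-bag items are those never drawn into this bootstrap sample.
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag


def repechage_bagging(
    data: Sequence[Any],
    grow: Callable[[List[Any]], Any],        # hypothetical: builds a classifier
    prune: Callable[[Any, List[Any]], Any],  # hypothetical: cost-based hold-out step
    rounds: int,
    seed: int = 0,
) -> List[Any]:
    """Build one classifier per round, pruning each on its out-of-bag items."""
    rng = random.Random(seed)
    models = []
    for _ in range(rounds):
        in_bag, oob = bootstrap_with_oob(len(data), rng)
        model = grow([data[i] for i in in_bag])
        # Repechage step: the otherwise unused out-of-bag items serve as
        # the hold-out set for the cost reduction stage.
        models.append(prune(model, [data[i] for i in oob]))
    return models
```

For large training sets, roughly 37% of the items are out of bag in any one round, so each classifier gets a sizeable hold-out set without reducing the data available for growing.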
