Stacking for Misclassification Cost Performance

This paper investigates the application of the multiple classifier technique known as "stacking" [23] to the task of classifier learning for misclassification cost performance, by straightforwardly adapting a technique that Ting and Witten [19, 20] developed successfully for classifier learning for accuracy performance. Experiments compare the performance of the stacked classifier with that of its component classifiers and with that of other cost-sensitive multiple classifier methods that have been proposed: a variation of "bagging" and two "boosting" style methods. These experiments confirm that stacking is competitive with the previously proposed methods. Further experiments examine the performance of stacking with different numbers of component classifiers, including the case of stacking a single classifier, and provide the first demonstration that stacking a single classifier can be beneficial for many data sets.
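The abstract does not spell out how the stacked outputs are turned into cost-sensitive predictions, so the following is only an illustrative sketch of one common approach: combine the component classifiers' class-probability estimates and then predict the class with the minimum expected misclassification cost. The function names, the fixed combination weights (a hypothetical stand-in for a learned level-1 model), and the cost matrix are all assumptions made for illustration, not the paper's method.

```python
from typing import List, Sequence


def combine_probabilities(level0_probs: Sequence[Sequence[float]],
                          weights: Sequence[float]) -> List[float]:
    """Weighted sum of component class-probability vectors, renormalised.

    The weights are a hypothetical stand-in for whatever a learned
    level-1 model would produce; they are not taken from the paper.
    """
    n_classes = len(level0_probs[0])
    combined = [0.0] * n_classes
    for probs, w in zip(level0_probs, weights):
        for k in range(n_classes):
            combined[k] += w * probs[k]
    total = sum(combined)
    if total <= 0.0:
        raise ValueError("combined scores must sum to a positive value")
    return [p / total for p in combined]


def min_expected_cost_class(probs: Sequence[float],
                            cost: Sequence[Sequence[float]]) -> int:
    """Return the class index with the lowest expected cost.

    cost[i][j] is the (assumed) cost of predicting class i
    when the true class is j.
    """
    expected = [
        sum(cost[i][j] * probs[j] for j in range(len(probs)))
        for i in range(len(cost))
    ]
    return min(range(len(expected)), key=expected.__getitem__)


# Two hypothetical component classifiers, two classes.
level0 = [[0.7, 0.3], [0.6, 0.4]]
combined = combine_probabilities(level0, weights=[0.5, 0.5])

# Assumed costs: missing class 1 (predict 0, true 1) is five times
# as costly as the opposite error.
cost_matrix = [[0.0, 5.0],
               [1.0, 0.0]]
prediction = min_expected_cost_class(combined, cost_matrix)
```

With these assumed numbers the combined estimate favours class 0, yet the asymmetric costs shift the prediction to class 1. This is the kind of difference between accuracy performance and misclassification cost performance that the abstract refers to.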

[1]  Kai Ming Ting,et al.  Boosting Trees for Cost-Sensitive Classifications , 1998, ECML.

[2]  Thomas G. Dietterich.  An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees , 2000 .

[3]  Ron Kohavi,et al.  The Case against Accuracy Estimation for Comparing Induction Algorithms , 1998, ICML.

[4]  J. Ross Quinlan,et al.  Bagging, Boosting, and C4.5 , 1996, AAAI/IAAI, Vol. 1.

[5]  Kai Ming Ting,et al.  An Empirical Study of MetaCost Using Boosting Algorithms , 2000, ECML.

[6]  Ian H. Witten,et al.  Issues in Stacked Generalization , 1999, J. Artif. Intell. Res.

[7]  Salvatore J. Stolfo,et al.  AdaCost: Misclassification Cost-Sensitive Boosting , 1999, ICML.

[8]  Kai Ming Ting,et al.  A Comparative Study of Cost-Sensitive Boosting Algorithms , 2000, ICML.

[9]  Catherine Blake,et al.  UCI Repository of machine learning databases , 1998 .

[10]  Charles L. Lawson,et al.  Solving least squares problems , 1976, Classics in applied mathematics.

[11]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.

[12]  David H. Wolpert,et al.  Stacked generalization , 1992, Neural Networks.

[13]  Peter D. Turney.  Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm , 1994, J. Artif. Intell. Res..

[14]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[15]  Ian H. Witten,et al.  Stacked generalization: when does it work? , 1997, IJCAI 1997.

[16]  R. Mike Cameron-Jones,et al.  Repechage Bootstrap Aggregating for Misclassification Cost Reduction , 1998, PRICAI.

[17]  Robert C. Holte,et al.  Exploiting the Cost (In)sensitivity of Decision Tree Splitting Criteria , 2000, ICML.