Bagging with Asymmetric Costs for Misclassified and Correctly Classified Examples

Diversity is a key property for obtaining the benefits of combining predictors. In this paper, we propose a modification of bagging that explicitly trades off diversity against individual accuracy. At each iteration of the algorithm, the procedure divides the bootstrap replicate into two subsets: one containing the examples misclassified by the ensemble built up to the previous iteration, and the other containing the examples that ensemble classifies correctly. High individual accuracy of a new classifier on the first subset increases diversity, measured by the Q statistic between the new classifier and the existing ensemble; high accuracy on the second subset, in contrast, decreases diversity. We trade off the two components of individual accuracy with a parameter λ ∈ [0, 1] that changes the cost of a misclassification on the second subset. Experiments are reported on well-known classification problems from the UCI repository, and results are compared with boosting and bagging.
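For reference, the Q statistic mentioned above is Yule's Q computed from the joint correct/incorrect counts of two classifiers D_i and D_k: with N^{ab} the number of examples on which D_i is correct (a = 1) or wrong (a = 0) and D_k correct (b = 1) or wrong (b = 0),

    Q_{i,k} = (N^{11} N^{00} - N^{01} N^{10}) / (N^{11} N^{00} + N^{01} N^{10}),

so Q ranges over [-1, 1], with statistically independent classifiers near 0 and classifiers that tend to err on the same examples approaching 1.

As a concrete illustration, the following is a minimal Python sketch of the procedure under stated assumptions: λ is applied as a per-example training weight (lam on the correctly classified subset, 1 on the misclassified subset), decision trees serve as base classifiers, and class labels are integer-encoded. The names asymmetric_bagging, majority_vote, and q_statistic are ours for illustration and not taken from the paper.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def majority_vote(ensemble, X):
        # Plurality vote over integer-encoded class labels.
        preds = np.stack([clf.predict(X) for clf in ensemble]).astype(int)
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)

    def q_statistic(pred_a, pred_b, y):
        # Yule's Q between two classifiers from their correct/incorrect patterns.
        a, b = pred_a == y, pred_b == y
        n11 = float(np.sum(a & b)); n00 = float(np.sum(~a & ~b))
        n10 = float(np.sum(a & ~b)); n01 = float(np.sum(~a & b))
        denom = n11 * n00 + n01 * n10
        return (n11 * n00 - n01 * n10) / denom if denom else 0.0

    def asymmetric_bagging(X, y, n_estimators=11, lam=0.5, random_state=0):
        # lam = 1 recovers plain bagging; lam < 1 down-weights examples the
        # current ensemble already classifies correctly, pushing each new
        # classifier toward the ensemble's mistakes (higher diversity).
        rng = np.random.default_rng(random_state)
        n = len(X)
        ensemble = []
        for t in range(n_estimators):
            idx = rng.integers(0, n, size=n)  # bootstrap replicate
            Xb, yb = X[idx], y[idx]
            if ensemble:
                correct = majority_vote(ensemble, Xb) == yb
                # Asymmetric costs: full weight on the misclassified subset,
                # weight lam on the correctly classified subset.
                w = np.where(correct, lam, 1.0)
            else:
                w = np.ones(n)
            clf = DecisionTreeClassifier(random_state=t)
            clf.fit(Xb, yb, sample_weight=w)
            ensemble.append(clf)
        return ensemble

On a UCI-style dataset with integer labels, one would call asymmetric_bagging(X_train, y_train, lam=0.5) and predict with majority_vote(ensemble, X_test); q_statistic can be averaged over classifier pairs to track how diversity changes as λ varies.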
