Boosting is a strong ensemble-based learning algorithm that promises to iteratively improve classification accuracy using any base learner, as long as the base learner satisfies the condition of yielding weighted accuracy > 0.5. In this paper, we analyze boosting with respect to this basic condition on the base learner, to see whether boosting ensures prediction of rarely occurring events with high recall and precision. First, we show that a base learner can satisfy the required condition even at poor recall or precision levels, especially for very rare classes. Furthermore, we show that the intelligent weight-updating mechanism in boosting, even in its strong cost-sensitive form, does not prevent cases where the base learner always achieves high precision but poor recall, or high recall but poor precision, when mapped to the original distribution. In either of these cases, we show that the voting mechanism of boosting fails to achieve good overall recall and precision for the ensemble. In effect, our analysis indicates that one cannot be blind to the base learner's performance and simply rely on the boosting mechanism to compensate for its weaknesses. We validate our arguments empirically on a variety of real and synthetic rare-class problems. In particular, using AdaCost as the boosting algorithm, and variations of PNrule and RIPPER as the base learners, we show that if algorithm A achieves a better recall-precision balance than algorithm B, then using A as the base learner in AdaCost yields significantly better performance than using B as the base learner.
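The first claim above can be illustrated with a minimal sketch (not the paper's own code): on a highly imbalanced dataset, a degenerate base learner that always predicts the majority class already satisfies boosting's weighted-accuracy > 0.5 condition on AdaBoost's initial uniform weights, while achieving zero recall on the rare class. The data sizes and class ratio here are illustrative assumptions.

```python
import numpy as np

# Illustrative rare-class setup: 1% positives among 1000 examples.
n = 1000
y = np.zeros(n, dtype=int)
y[:10] = 1                     # rare positive class

# AdaBoost starts with uniform example weights.
w = np.full(n, 1.0 / n)

# Degenerate base learner: always predict the majority (negative) class.
pred = np.zeros(n, dtype=int)

# Weighted accuracy under the initial weights: 0.99, comfortably > 0.5,
# so the base-learner condition is satisfied...
weighted_acc = float(np.sum(w * (pred == y)))

# ...yet recall on the rare class is 0: no positive is ever found.
recall = float(np.sum((pred == 1) & (y == 1)) / np.sum(y == 1))
```

This is exactly the failure mode the analysis targets: the condition that drives boosting's convergence guarantees says nothing about rare-class recall or precision on its own.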