An Empirical Evaluation of CostBoost Extensions for Cost-Sensitive Classification

Classification, a core data-mining technique, is used to predict group membership for data samples. Ensemble learning, which combines multiple classifiers through bagging, boosting, or stacking, is a proven data-mining method; in this paper we use boosting to combine multiple classifiers. Cost-sensitive classification performs classification under the Cost-Based Model (CBM) rather than the Error-Based Model (EBM): EBM does not incorporate the cost of misclassifying a sample into the model-building phase, while CBM does. CBM techniques typically modify the weight-update equation to incorporate the misclassification cost taken from a cost matrix. We study cost-sensitive boosters, propose three new extensions of the CostBoost algorithm, CBE1, CBE2, and CBE3, and compare them with existing cost-based boosting classifiers. CBE1, CBE2, and CBE3 outperformed the original CostBoost by 5%, 4%, and 4% respectively, in terms of misclassification cost.
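To illustrate the kind of weight-update modification the abstract describes, the sketch below scales a standard boosting-style exponential update by a per-example misclassification cost. The cost matrix, the `1 + cost` scaling factor, and all names here are illustrative assumptions in the spirit of cost-sensitive boosters such as CSB and AdaCost; the exact CostBoost/CBE update rules are not reproduced from the paper.

```python
import math

# Hypothetical cost matrix: COST[true_label][predicted_label].
# Here a false negative (missing class 1) is assumed costlier than
# a false positive -- the values are purely illustrative.
COST = {1: {1: 0.0, 0: 5.0},
        0: {0: 0.0, 1: 1.0}}

def cost_sensitive_update(weights, y_true, y_pred, alpha):
    """One boosting-round weight update that multiplies the usual
    exponential update by a cost-derived factor, so that costly
    mistakes gain weight faster than cheap ones (a sketch, not the
    CostBoost rule itself)."""
    new_w = []
    for w, y, p in zip(weights, y_true, y_pred):
        # Map {0, 1} labels to {-1, +1} for the exponential margin term.
        margin = (2 * y - 1) * (2 * p - 1)
        cost = COST[y][p]
        factor = (1.0 + cost) if cost > 0 else 1.0  # upweight costly errors
        new_w.append(w * factor * math.exp(-alpha * margin))
    total = sum(new_w)
    return [w / total for w in new_w]  # renormalise to a distribution

weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 1]  # one false negative (index 1), one false positive (index 3)
weights = cost_sensitive_update(weights, y_true, y_pred, alpha=0.5)
# After the update, the false-negative example carries the largest weight.
```

With the assumed cost matrix, the false negative ends up with a larger share of the weight distribution than the false positive, which is the behaviour that distinguishes CBM-style updates from a plain error-based boosting round.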
