Cost-Sensitive Learning

Cost-sensitive learning is a form of algorithm-level modification for handling class imbalance. Instead of a standard error-driven evaluation (the 0–1 loss function), a misclassification cost is introduced, and the classifier is trained to minimize the conditional risk. Strongly penalizing mistakes on selected classes increases the importance of those classes during classifier training. This pushes decision boundaries away from their instances, which leads to improved generalization on these classes. In this chapter we discuss the basics of cost-sensitive methods, introduce their taxonomy, and describe how to deal with scenarios in which the misclassification cost is not given beforehand by an expert. We then describe the most popular cost-sensitive classifiers and discuss the potential for hybridization with other techniques.

Section 4.1 provides background and a taxonomy of cost-sensitive classification algorithms. The important issue of how to obtain the cost matrix is discussed in Sect. 4.2. Section 4.3 describes MetaCost, a popular wrapper approach for adapting any classifier to a cost-sensitive setting, while Sect. 4.4 discusses various aspects of cost-sensitive decision trees. Other cost-sensitive classification models are described in Sect. 4.5, and Sect. 4.6 shows the potential advantages of using hybrid cost-sensitive algorithms. Finally, Sect. 4.7 concludes the chapter and presents future challenges in the field of cost-sensitive solutions to class imbalance.
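To make the idea of replacing the 0–1 loss with a misclassification cost more concrete, the following is a minimal sketch of a minimum-conditional-risk decision rule. It is an illustration only and is not taken from the chapter: the function name `min_risk_predict`, the cost values, and the posterior estimates are assumptions, as is the convention that `cost[i, j]` is the cost of predicting class `i` when the true class is `j`.

```python
import numpy as np


def min_risk_predict(proba, cost):
    """Return, per instance, the class with the lowest conditional risk.

    proba: (n_samples, n_classes) estimated posteriors P(j | x).
    cost:  (n_classes, n_classes) matrix; cost[i, j] is the (assumed)
           cost of predicting class i when the true class is j.
    """
    proba = np.asarray(proba, dtype=float)
    cost = np.asarray(cost, dtype=float)
    # risk[n, i] = sum_j P(j | x_n) * cost[i, j]
    risk = proba @ cost.T
    return risk.argmin(axis=1), risk


# Hypothetical costs: class 1 is the minority class, and missing it
# is assumed to be five times as costly as a false alarm.
cost = np.array([[0.0, 5.0],
                 [1.0, 0.0]])

# Hypothetical posterior estimates from some probabilistic classifier.
proba = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.4, 0.6]])

pred, risk = min_risk_predict(proba, cost)
zero_one = proba.argmax(axis=1)  # decision under the standard 0-1 loss

print("cost-sensitive:", pred)
print("0-1 loss:      ", zero_one)
```

With these assumed costs, the second instance is assigned to the minority class even though its estimated posterior for that class is only 0.2, which is one way to see how a higher penalty shifts the decision boundary away from the costly class.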
