Cost-sensitive learning for defect escalation

While most software defects (i.e., bugs) are corrected and tested as part of the lengthy software development cycle, enterprise software vendors often have to release software products before all reported defects are corrected, due to deadlines and limited resources. A small number of these reported defects will be escalated by customers whose businesses are seriously impacted. Escalated defects must be resolved immediately and individually by the software vendor, at very high cost, and the total cost can be even greater once loss of reputation, customer satisfaction, loyalty, and repeat revenue is taken into account. In this paper, we develop a Software defecT Escalation Prediction (STEP) system that mines historical defect report data and predicts the escalation risk of current defect reports for maximum net profit. More specifically, we first describe a simple and general framework for converting the maximum net profit problem into a cost-sensitive learning problem. We then apply and compare four well-known cost-sensitive learning approaches within STEP. Our experiments suggest that the cost-sensitive decision tree (CSTree) is the best method, producing the highest positive net profit.
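To make the profit-to-cost conversion concrete, the Python sketch below shows one standard way such a reduction can work: start from a 2x2 net-profit matrix over (predicted, actual) outcomes and derive the probability threshold above which predicting "will escalate" maximizes expected net profit. This is an illustrative sketch in the spirit of classical cost-sensitive thresholding, not the paper's actual STEP implementation; the dollar figures, the profit matrix, and the helper functions are all assumed placeholders.

# Minimal sketch (illustrative, not the paper's STEP system): reduce
# "maximize net profit" to a cost-sensitive decision rule by deriving
# the optimal probability threshold from a 2x2 net-profit matrix.
# All dollar figures below are hypothetical placeholders.

# profit[(predicted, actual)]; classes: 0 = no escalation, 1 = escalation
profit = {
    (0, 0): 0.0,       # correctly leave a benign defect alone
    (0, 1): -1000.0,   # miss an escalation: customer impact, emergency fix
    (1, 0): -100.0,    # proactively fix a defect that would not escalate
    (1, 1): 900.0,     # fix a would-be escalation early: large net saving
}

def optimal_threshold(profit: dict) -> float:
    """Probability p* of escalation above which predicting class 1
    yields higher expected net profit than predicting class 0.

    Derived from: p*profit[1,1] + (1-p)*profit[1,0]
                  >= p*profit[0,1] + (1-p)*profit[0,0]
    """
    gain_on_negatives = profit[(0, 0)] - profit[(1, 0)]  # advantage of predicting 0 when actual is 0
    gain_on_positives = profit[(1, 1)] - profit[(0, 1)]  # advantage of predicting 1 when actual is 1
    return gain_on_negatives / (gain_on_negatives + gain_on_positives)

def decide(p_escalate: float, threshold: float) -> int:
    """Cost-sensitive decision rule on a calibrated escalation probability."""
    return 1 if p_escalate >= threshold else 0

if __name__ == "__main__":
    p_star = optimal_threshold(profit)
    print(f"decision threshold p* = {p_star:.3f}")
    for p in (0.02, 0.10, 0.40):
        print(f"P(escalate)={p:.2f} -> predict {decide(p, p_star)}")

With these placeholder numbers the threshold comes out to 0.05, reflecting the typical situation in escalation prediction: escalations are rare but so costly to miss that a low escalation probability already justifies proactive fixing.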
