Explainable Software Defect Prediction: Are We There Yet?

Explaining the results of defect prediction models is of practical value but challenging to achieve. Recently, Jiarpakdee et al. [1] proposed using two state-of-the-art model-agnostic techniques (i.e., LIME and BreakDown) to explain prediction results. Their study showed that model-agnostic techniques can achieve remarkable performance and that the generated explanations can help developers understand the prediction results. However, they examined LIME and BreakDown in only a single defect prediction setting, which calls into question the consistency and reliability of model-agnostic techniques on defect prediction models under various

[1] Tao Zhang, et al. Cross-version defect prediction via hybrid active learning with kernel principal component analysis, 2018, 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER).

[2] Ilya Sutskever, et al. Learning to Generate Reviews and Discovering Sentiment, 2017, ArXiv.

[3] Andreas Zeller, et al. Predicting faults from cached history, 2008, ISEC '08.

[4] Hoh Peter In, et al. Micro interaction metrics for defect prediction, 2011, ESEC/FSE '11.

[5] Zhi-Hua Zhou, et al. Sample-based software defect prediction with active and semi-supervised learning, 2012, Automated Software Engineering.

[6] Sinno Jialin Pan, et al. Transfer defect learning, 2013, 2013 35th International Conference on Software Engineering (ICSE).

[7] Witold Pedrycz, et al. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction, 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[8] Jane Cleland-Huang, et al. Cold-Start Software Analytics, 2016, 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR).

[9] Charles X. Ling, et al. Data Mining for Direct Marketing: Problems and Solutions, 1998, KDD.

[10] Przemyslaw Biecek, et al. Explanations of model predictions with live and breakDown packages, 2018, R J.

[11] Bart Baesens, et al. Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings, 2008, IEEE Transactions on Software Engineering.

[12] Carlos Guestrin, et al. Anchors: High-Precision Model-Agnostic Explanations, 2018, AAAI.

[13] José Javier Dolado, et al. Preliminary comparison of techniques for dealing with imbalance in software defect prediction, 2014, EASE '14.

[14] Harald C. Gall, et al. Cross-project defect prediction: a large scale experiment on data vs. domain vs. process, 2009, ESEC/SIGSOFT FSE.

[15] Reem Aleithan. Explainable Just-In-Time Bug Prediction: Are We There Yet?, 2021, 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion).

[16] Hideaki Hata, et al. Predicting Defective Lines Using a Model-Agnostic Technique, 2020, IEEE Transactions on Software Engineering.

[17] Morakot Choetkiertikul, et al. JITBot: An Explainable Just-In-Time Defect Prediction Bot, 2020, 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[18] John Grundy, et al. SQAPlanner: Generating Data-Informed Software Quality Improvement Plans, 2021, IEEE Transactions on Software Engineering.

[19] Jacky W. Keung, et al. Investigation on the stability of SMOTE-based oversampling techniques in software defect prediction, 2021, Inf. Softw. Technol.

[20] Yan Gao, et al. Software Defect Prediction based on Adaboost algorithm under Imbalance Distribution, 2016.

[21] I. Tomek. An Experiment with the Edited Nearest-Neighbor Rule, 1976.

[22] David Lo, et al. File-Level Defect Prediction: Unsupervised vs. Supervised Models, 2017, 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM).

[23] Hoa Khanh Dam, et al. An Empirical Study of Model-Agnostic Techniques for Defect Prediction Models, 2020, IEEE Transactions on Software Engineering.

[24] Khanh Hoa Dam, et al. An Explainable Deep Model for Defect Prediction, 2019, 2019 IEEE/ACM 7th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE).

[25] Sousuke Amasaki. Cross-version defect prediction: use historical data, cross-project data, or both?, 2020, Empirical Software Engineering.

[26] Nachiappan Nagappan, et al. Continuous Software Bug Prediction, 2021, ESEM.

[27] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.

[28] Zhaowei Shang, et al. Tackling class overlap and imbalance problems in software defect prediction, 2018, Software Quality Journal.

[29] Shane McIntosh, et al. Revisiting the Impact of Classification Techniques on the Performance of Defect Prediction Models, 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[30] Ye Yang, et al. An investigation on the feasibility of cross-project defect prediction, 2012, Automated Software Engineering.

[31] Tian Jiang, et al. Personalized defect prediction, 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[32] Xiaohong Su, et al. An Empirical Study on Software Defect Prediction Using Over-Sampling by SMOTE, 2018, Int. J. Softw. Eng. Knowl. Eng.

[33] Audris Mockus, et al. A large-scale empirical study of just-in-time quality assurance, 2013, IEEE Transactions on Software Engineering.

[34] Xin Yao, et al. Using Class Imbalance Learning for Software Defect Prediction, 2013, IEEE Transactions on Reliability.

[35] Song Wang, et al. Automatically Learning Semantic Features for Defect Prediction, 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[36] Carlos Guestrin, et al. Model-Agnostic Interpretability of Machine Learning, 2016, ArXiv.

[37] Ayse Basar Bener, et al. On the relative value of cross-company and within-company data for defect prediction, 2009, Empirical Software Engineering.

[38] Cem Ergün, et al. Clustering Based Under-Sampling for Improving Speaker Verification Decisions Using AdaBoost, 2004, SSPR/SPR.

[39] Fernando Nogueira, et al. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning, 2016, J. Mach. Learn. Res.

[40] Xiaoyan Zhu, et al. Does bug prediction support human developers? Findings from a Google case study, 2013, 2013 35th International Conference on Software Engineering (ICSE).

[41] Mohammad Alshayeb, et al. Software defect prediction using ensemble learning on selected features, 2015, Inf. Softw. Technol.

[42] Chakkrit Tantithamthavorn, et al. JITLine: A Simpler, Better, Faster, Finer-grained Just-In-Time Defect Prediction, 2021, 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR).

[43] Scott Lundberg, et al. A Unified Approach to Interpreting Model Predictions, 2017, NIPS.

[44] Carlos Guestrin, et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016, ArXiv.

[45] Wushao Wen, et al. Ridge and Lasso Regression Models for Cross-Version Defect Prediction, 2018, IEEE Transactions on Reliability.

[46] Yutao Ma, et al. A Top-k Learning to Rank Approach to Cross-Project Software Defect Prediction, 2018, 2018 25th Asia-Pacific Software Engineering Conference (APSEC).

[47] Christoph Treude, et al. AutoSpearman: Automatically Mitigating Correlated Software Metrics for Interpreting Defect Models, 2018, 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[48] Aleksandra Werner, et al. The Proposal of Undersampling Method for Learning from Imbalanced Datasets, 2019, KES.

[49] Ahmed E. Hassan, et al. Predicting faults using the complexity of code changes, 2009, 2009 IEEE 31st International Conference on Software Engineering.

[50] Premkumar T. Devanbu, et al. How, and why, process metrics are better, 2013, 2013 35th International Conference on Software Engineering (ICSE).

[51] A. Zeller, et al. Predicting Defects for Eclipse, 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[52] Nitesh V. Chawla, et al. SMOTE: Synthetic Minority Over-sampling Technique, 2002, J. Artif. Intell. Res.

[53] Tim Menzies, et al. Software Analytics: So What?, 2013, IEEE Softw.

[54] Sashank Dara, et al. Online Defect Prediction for Imbalanced Data, 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.