File-Level Defect Prediction: Unsupervised vs. Supervised Models

Background: Software defect prediction models can help software quality assurance teams allocate testing and code review resources. A variety of techniques have been used to build such models, including both supervised and unsupervised methods. Recently, Yang et al. [1] reported the surprising finding that unsupervised models can perform statistically significantly better than supervised models in effort-aware change-level defect prediction. However, little is known about the relative performance of unsupervised and supervised models for effort-aware file-level defect prediction. Goal: Inspired by their work, we aim to investigate whether a similar finding holds for effort-aware file-level defect prediction. Method: We replicate Yang et al.'s study on the PROMISE dataset, which contains ten projects in total, and compare the effectiveness of unsupervised and supervised prediction models for effort-aware file-level defect prediction. Results: We find that the conclusion of Yang et al. [1] does not hold under the within-project setting but does hold under the cross-project setting for file-level defect prediction. In addition, when following the recommendations of the best unsupervised model, developers need to inspect statistically significantly more files than with supervised models for the same inspection effort (i.e., LOC). Conclusions: (a) Unsupervised models do not perform statistically significantly better than the state-of-the-art supervised model under the within-project setting; (b) unsupervised models can perform statistically significantly better than the state-of-the-art supervised model under the cross-project setting; and (c) we suggest that not only LOC but also the number of files to be inspected should be considered when evaluating effort-aware file-level defect prediction models.
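To make the effort-aware comparison concrete, the sketch below shows one common way such evaluations are scored: files are ranked by a model's suspiciousness score and inspected in order until a fixed fraction of the total LOC is consumed (20% is a typical budget in this literature), recording both the recall of defective files and how many files a developer had to read. This is a minimal illustration, not the paper's actual setup: the data, the 1/LOC ranking standing in for a simple unsupervised model, and the classifier probabilities are all hypothetical.

```python
import numpy as np

def effort_aware_eval(loc, defective, score, budget=0.20):
    """Inspect files in descending order of `score` until `budget` of the
    total LOC is consumed; return recall of defective files and the
    number of files inspected."""
    order = np.argsort(-score)                   # most suspicious files first
    cum_loc = np.cumsum(loc[order])              # effort spent after each file
    inspected = cum_loc <= budget * loc.sum()    # files that fit in the budget
    n_files = int(inspected.sum())
    recall = defective[order][inspected].sum() / max(defective.sum(), 1)
    return recall, n_files

# Toy data: six files with sizes and defect labels (hypothetical).
loc = np.array([500, 120, 80, 300, 60, 40])
defective = np.array([1, 0, 1, 0, 1, 0])

# A "simple unsupervised" ranking in the spirit of Yang et al.:
# smaller files first, i.e. score = 1 / LOC.
recall_u, files_u = effort_aware_eval(loc, defective, 1.0 / loc)

# A hypothetical supervised ranking: predicted defect probabilities.
prob = np.array([0.2, 0.3, 0.9, 0.1, 0.8, 0.4])
recall_s, files_s = effort_aware_eval(loc, defective, prob)

print(f"unsupervised: recall={recall_u:.2f}, files inspected={files_u}")
print(f"supervised:   recall={recall_s:.2f}, files inspected={files_s}")
```

Note that two rankings can achieve the same recall at the same LOC budget while differing in the number of files inspected, which is exactly the gap the paper's second result measures.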

References

[1] Koichiro Ochimizu et al. Towards logistic regression models for predicting fault-prone code across software projects. 3rd International Symposium on Empirical Software Engineering and Measurement (ESEM), 2009.

[2] Bernd Fritzke et al. A Growing Neural Gas Network Learns Topologies. NIPS, 1994.

[3] Rainer Koschke et al. Effort-Aware Defect Prediction Models. 14th European Conference on Software Maintenance and Reengineering (CSMR), 2010.

[4] Shane McIntosh et al. Revisiting the Impact of Classification Techniques on the Performance of Defect Prediction Models. 37th IEEE/ACM International Conference on Software Engineering (ICSE), 2015.

[5] Barry W. Boehm et al. What we have learned about fighting defects. 8th IEEE Symposium on Software Metrics, 2002.

[6] Xinli Yang et al. Deep Learning for Just-in-Time Defect Prediction. IEEE International Conference on Software Quality, Reliability and Security (QRS), 2015.

[7] Xinli Yang et al. TLEL: A two-layer ensemble learning approach for just-in-time defect prediction. Information and Software Technology, 2017.

[8] Witold Pedrycz et al. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. 30th ACM/IEEE International Conference on Software Engineering (ICSE), 2008.

[9] Taghi M. Khoshgoftaar et al. Unsupervised learning for expert-based software quality estimation. 8th IEEE International Symposium on High Assurance Systems Engineering (HASE), 2004.

[10] Ayse Basar Bener et al. Defect prediction from static code features: current results, limitations, new approaches. Automated Software Engineering, 2010.

[11] Victor R. Basili et al. A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Transactions on Software Engineering, 1996.

[12] Qian Yin et al. Software quality prediction using Affinity Propagation algorithm. IEEE International Joint Conference on Neural Networks (IJCNN), 2008.

[13] Hoh Peter In et al. Micro interaction metrics for defect prediction. ESEC/FSE, 2011.

[14] Gregory Tassey. The Economic Impacts of Inadequate Infrastructure for Software Testing. NIST Planning Report, 2002.

[15] Hongfang Liu et al. Theory of relative defect proneness. Empirical Software Engineering, 2008.

[16] Ying Zou et al. Cross-Project Defect Prediction Using a Connectivity-Based Unsupervised Classifier. 38th IEEE/ACM International Conference on Software Engineering (ICSE), 2016.

[17] Lech Madeyski et al. Towards identifying software project clusters with regard to defect prediction. PROMISE, 2010.

[18] David Lo et al. Collective Personalized Change Classification With Multiobjective Search. IEEE Transactions on Reliability, 2016.

[19] Michael E. Fagan. Design and Code Inspections to Reduce Errors in Program Development. IBM Systems Journal, 1976.

[20] Bojan Cukic et al. An adaptive approach with active learning in software fault prediction. PROMISE, 2012.

[21] Claes Wohlin et al. Proceedings of the 2006 ACM/IEEE International Symposium on Empirical Software Engineering. 2006.

[22] Ömer Faruk Arar et al. Software defect prediction using cost-sensitive neural network. Applied Soft Computing, 2015.

[23] Tracy Hall et al. A Systematic Literature Review on Fault Prediction Performance in Software Engineering. IEEE Transactions on Software Engineering, 2012.

[24] David Lo et al. Supervised vs Unsupervised Models: A Holistic Look at Effort-Aware Just-in-Time Defect Prediction. IEEE International Conference on Software Maintenance and Evolution (ICSME), 2017.

[25] Ayse Basar Bener et al. Software Defect Identification Using Machine Learning Techniques. 32nd EUROMICRO Conference on Software Engineering and Advanced Applications, 2006.

[26] Vandana Bhattacherjee et al. Software Fault Prediction Using Quad Tree-Based K-Means Clustering Algorithm. IEEE Transactions on Knowledge and Data Engineering, 2012.

[27] Enio G. Jelihovschi et al. ScottKnott: A Package for Performing the Scott-Knott Clustering Algorithm in R. 2014.

[28] David Lo et al. An Empirical Study of Classifier Combination for Cross-Project Defect Prediction. 39th Annual IEEE Computer Software and Applications Conference (COMPSAC), 2015.

[29] Ayse Basar Bener et al. On the relative value of cross-company and within-company data for defect prediction. Empirical Software Engineering, 2009.

[30] Taghi M. Khoshgoftaar et al. Ordering Fault-Prone Software Modules. Software Quality Journal, 2003.

[31] Lionel C. Briand et al. Predicting fault-prone components in a Java legacy system. ISESE, 2006.

[32] Zhi-Hua Zhou et al. Sample-based software defect prediction with active and semi-supervised learning. Automated Software Engineering, 2012.

[33] Sinno Jialin Pan et al. Transfer defect learning. 35th International Conference on Software Engineering (ICSE), 2013.

[34] Tim Menzies et al. Heterogeneous Defect Prediction. IEEE Transactions on Software Engineering, 2015.

[35] Jaechang Nam et al. CLAMI: Defect Prediction on Unlabeled Datasets. 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2015.

[36] Akito Monden et al. Revisiting common bug prediction findings using effort-aware models. IEEE International Conference on Software Maintenance (ICSM), 2010.

[37] Yuming Zhou et al. Effort-aware just-in-time defect prediction: simple unsupervised models could be better than supervised models. SIGSOFT FSE, 2016.

[38] Harald C. Gall et al. Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. ESEC/FSE, 2009.

[39] Audris Mockus et al. A large-scale empirical study of just-in-time quality assurance. IEEE Transactions on Software Engineering, 2013.

[40] Tim Menzies et al. Data Mining Static Code Attributes to Learn Defect Predictors. IEEE Transactions on Software Engineering, 2007.

[42] Ayse Basar Bener et al. Practical considerations in deploying statistical methods for defect prediction: A case study within the Turkish telecommunications industry. Information and Software Technology, 2010.

[43] Tim Menzies et al. Revisiting unsupervised learning for defect prediction. ESEC/FSE, 2017.

[44] Bart Baesens et al. Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings. IEEE Transactions on Software Engineering, 2008.

[45] Ying Zou et al. Local versus global models for effort-aware defect prediction. CASCON, 2016.

[46] David Lo et al. HYDRA: Massively Compositional Model for Cross-Project Defect Prediction. IEEE Transactions on Software Engineering, 2016.

[47] Hongfang Liu et al. Testing the theory of relative defect proneness for closed-source software. Empirical Software Engineering, 2010.

[48] Niclas Ohlsson et al. Predicting Fault-Prone Software Modules in Telephone Switches. IEEE Transactions on Software Engineering, 1996.

[49] Rainer Koschke et al. Revisiting the evaluation of defect prediction models. PROMISE, 2009.

[50] Ahmed E. Hassan et al. Predicting faults using the complexity of code changes. 31st International Conference on Software Engineering (ICSE), 2009.

[51] Michele Lanza et al. Evaluating defect prediction approaches: a benchmark and an extensive comparison. Empirical Software Engineering, 2011.

[52] J. A. Ferreira et al. On the Benjamini-Hochberg method. arXiv:math/0611265, 2006.

[53] Yasutaka Kamei et al. Is lines of code a good measure of effort in effort-aware models? Information and Software Technology, 2013.