Evaluating Software Metrics for Sorting Software Modules in Order of Defect Count

Sorting software modules by defect count helps testers focus on the modules that contain the most defects. Many approaches have been proposed for this task, and researchers have published data sets so that approaches can be compared more fairly. In this paper, we propose a new metric selection approach and evaluate the usefulness of the software metrics in eleven publicly available data sets, both to assess the quality of these data sets and to identify the metrics that are most effective for sorting modules by defect count. Unexpectedly, the experimental results show that only one metric performs well across most of these data sets, which suggests that more effective metrics are needed. We also report additional findings from these data sets that may guide the design of new metrics for sorting software modules by defect count.
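The paper does not include code, but the core experimental setup it describes can be sketched: rank modules by a single metric's value and measure how closely that ranking matches the ranking by actual defect count. The sketch below is illustrative only, not the authors' implementation; the metric names (`loc`, `cyclomatic`), the data layout, and the use of Spearman's rank correlation as the ranking-quality measure are all assumptions.

```python
# Illustrative sketch (not the authors' code): evaluate how well a single
# software metric sorts modules in order of defect count.
# Assumptions: each module record carries metric values plus a 'defects'
# field, and Spearman's rank correlation serves as the quality measure.
from scipy.stats import spearmanr

modules = [
    {"loc": 120, "cyclomatic": 8,  "defects": 1},
    {"loc": 950, "cyclomatic": 34, "defects": 7},
    {"loc": 40,  "cyclomatic": 3,  "defects": 0},
    {"loc": 610, "cyclomatic": 21, "defects": 4},
]

def ranking_quality(metric_name: str) -> float:
    """Spearman correlation between a metric's values and defect counts.

    Values near 1.0 mean that sorting modules by this metric closely
    matches sorting them by their true defect count.
    """
    metric_values = [m[metric_name] for m in modules]
    defect_counts = [m["defects"] for m in modules]
    rho, _p_value = spearmanr(metric_values, defect_counts)
    return rho

for metric in ("loc", "cyclomatic"):
    print(f"{metric}: rho = {ranking_quality(metric):.3f}")
```

Under this setup, a metric that "works well" for sorting would yield a high rank correlation across most of the eleven data sets, while most metrics would not.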
