Design evolution metrics for defect prediction in object oriented systems

Testing is the most widely adopted practice to ensure software quality. However, this activity is often a compromise between the available resources and software quality. In object-oriented development, testing effort should be focused on defective classes. Unfortunately, identifying those classes is a challenging and difficult activity on which many metrics, techniques, and models have been tried. In this paper, we investigate the usefulness of elementary design evolution metrics to identify defective classes. The metrics include the numbers of added, deleted, and modified attributes, methods, and relations. The metrics are used to recommend a ranked list of classes likely to contain defects for a system. They are compared to Chidamber and Kemerer’s metrics on several versions of Rhino and of ArgoUML. Further comparison is conducted with the complexity metrics computed by Zimmermann et al. on several releases of Eclipse. The comparisons are made according to three criteria: presence of defects, number of defects, and defect density in the top-ranked classes. They show that the design evolution metrics, when used in conjunction with known metrics, improve the identification of defective classes. In addition, they show that the design evolution metrics make significantly better predictions of defect density than other metrics and, thus, can help in reducing the testing effort by focusing test activity on a reduced volume of code.

[1]  P. Lachenbruch Statistical Power Analysis for the Behavioral Sciences (2nd ed.) , 1989 .

[2]  Premkumar T. Devanbu,et al.  Fair and balanced?: bias in bug-fix datasets , 2009, ESEC/FSE '09.

[3]  Fred W. Glover,et al.  Future paths for integer programming and links to artificial intelligence , 1986, Comput. Oper. Res..

[4]  Elaine J. Weyuker,et al.  Predicting the location and number of faults in large software systems , 2005, IEEE Transactions on Software Engineering.

[5]  Giuliano Antoniol,et al.  Error Correcting Graph Matching Application to Software Evolution , 2008, 2008 15th Working Conference on Reverse Engineering.

[6]  Lionel C. Briand,et al.  A Unified Framework for Cohesion Measurement , 1997, IEEE METRICS.

[7]  Sebastian G. Elbaum,et al.  Code churn: a measure for estimating the impact of code change , 1998, Proceedings. International Conference on Software Maintenance (Cat. No. 98CB36272).

[8]  Khaled El Emam,et al.  The Confounding Effect of Class Size on the Validity of Object-Oriented Metrics , 2001, IEEE Trans. Software Eng..

[9]  Alfred V. Aho,et al.  Do Crosscutting Concerns Cause Defects? , 2008, IEEE Transactions on Software Engineering.

[10]  N. Nagappan,et al.  Use of relative code churn measures to predict system defect density , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[11]  Filippo Ricca,et al.  Analysis, testing and re-structuring of Web applications , 2004, 20th IEEE International Conference on Software Maintenance, 2004. Proceedings..

[12]  Chris F. Kemerer,et al.  A Metrics Suite for Object Oriented Design , 2015, IEEE Trans. Software Eng..

[13]  A. Zeller,et al.  Predicting Defects for Eclipse , 2007, Third International Workshop on Predictor Models in Software Engineering (PROMISE'07: ICSE Workshops 2007).

[14]  Lionel C. Briand,et al.  An Investigation of Graph-Based Class Integration Test Order Strategies , 2003, IEEE Trans. Software Eng..

[15]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[16]  Giuliano Antoniol,et al.  Evolution and Search Based Metrics to Improve Defects Prediction , 2009, 2009 1st International Symposium on Search Based Software Engineering.

[17]  Harvey P. Siy,et al.  Predicting Fault Incidence Using Software Change History , 2000, IEEE Trans. Software Eng..

[18]  Shari Lawrence Pfleeger,et al.  Software Metrics : A Rigorous and Practical Approach , 1998 .

[19]  King-Sun Fu,et al.  Error-Correcting Isomorphisms of Attributed Relational Graphs for Pattern Analysis , 1979, IEEE Transactions on Systems, Man, and Cybernetics.

[20]  Ahmed E. Hassan,et al.  Predicting faults using the complexity of code changes , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[21]  D. Hosmer,et al.  Applied Logistic Regression , 1991 .

[22]  Victor R. Basili,et al.  A Validation of Object-Oriented Design Metrics as Quality Indicators , 1996, IEEE Trans. Software Eng..

[23]  Yann-Gaël Guéhéneuc,et al.  DeMIMA: A Multilayered Approach for Design Pattern Identification , 2008, IEEE Transactions on Software Engineering.

[24]  Jacob Cohen Statistical Power Analysis for the Behavioral Sciences , 1969, The SAGE Encyclopedia of Research Design.

[25]  Tibor Gyimóthy,et al.  Empirical validation of object-oriented metrics on open source software for fault prediction , 2005, IEEE Transactions on Software Engineering.

[26]  Michelle Cartwright,et al.  An Empirical Investigation of an Object-Oriented Software System , 2000, IEEE Trans. Software Eng..

[27]  Giuliano Antoniol,et al.  Threats on building models from CVS and Bugzilla repositories: the Mozilla case study , 2007, CASCON.

[28]  William M. Evanco,et al.  Poisson analyses of defects for small software components , 1997, J. Syst. Softw..

[29]  Shari Lawrence Pfleeger,et al.  Software metrics (2nd ed.): a rigorous and practical approach , 1997 .

[30]  Jonas Mellin,et al.  On the Testing Maturity of Software Producing Organizations , 2006, Testing: Academic & Industrial Conference - Practice And Research Techniques (TAIC PART'06).

[31]  Stanley Lemeshow,et al.  Applied Logistic Regression, Second Edition , 1989 .

[32]  Lionel C. Briand,et al.  Assessing the Applicability of Fault-Proneness Models Across Object-Oriented Software Projects , 2002, IEEE Trans. Software Eng..

[33]  Claes Wohlin,et al.  Experimentation in software engineering: an introduction , 2000 .

[34]  Witold Pedrycz,et al.  A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.