Investigating the effect of "defect co-fix" on quality assurance resource allocation: A search-based approach

- Introducing the concept of "remaining defects" in QA resource prioritization.
- Introducing "co-fix-awareness" to dynamically rank source code files.
- Proposing two co-fix-aware ranking algorithms.
- Empirically comparing several QA resource prioritization algorithms.
- Investigating the applicability of search algorithms to QA resource prioritization.

Allocating resources to pre-release quality assurance (QA) tasks, such as source code analysis, peer review, and testing, is one of the challenges faced by a software project manager. The goal is to find as many defects as possible with the available QA resources before the release. This can be achieved by assigning more resources to the more defect-prone artifacts, e.g., components, classes, and methods. State-of-the-art QA resource allocation approaches predict the defect-proneness of an artifact using historical data on different software metrics, e.g., the number of previous defects and the changes in the artifact. Given a QA budget, an allocation technique selects the most defect-prone artifacts for further investigation by the QA team.

While there have been many research efforts on discovering more predictive software metrics and more effective defect prediction algorithms, the cost-effectiveness of QA resource allocation approaches has always been evaluated by counting the number of defects per selected artifact. The problem with such an evaluation is that it ignores the fact that, in practice, fixing a software issue is not bounded to the artifact under investigation. In other words, one may start reviewing a file that is identified as defect-prone and detect a defect, but to fix that defect one may modify not only the defective part of the file under review but also several other artifacts that are related to the defective code (e.g., a method that calls the defective code). Such co-fixes (fixing several defects together) while analyzing/reviewing/testing an artifact under investigation change the number of remaining defects in the other artifacts. Therefore, a QA resource allocation approach is more effective if it prioritizes the artifacts that would lead to the smallest number of remaining defects.

Investigating six medium-to-large releases of open source systems (Mylyn, Eclipse, and NetBeans; two releases each), we found that co-fixes happen quite often in software projects (30-42% of the fixes modify more than one artifact). Therefore, in this paper, we first introduce a new cost-effectiveness measure to evaluate QA resource allocation, based on the concept of "remaining defects" per file. We then propose several co-fix-aware prioritization approaches that dynamically optimize the new measure based on historical defect co-fixes. The evaluation of these approaches on the six releases shows that (a) co-fix-aware QA prioritization approaches improve on the traditional defect prediction-based ones in terms of the density of remaining defects per file, and (b) co-fix-aware QA prioritization can potentially benefit from search-based software engineering techniques.
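To make the co-fix-aware idea concrete, below is a minimal sketch of a greedy, dynamically re-ranking prioritizer in Python. It is an illustration of the concept only, not the paper's proposed algorithms: the names (rank_cofix_aware, defect_estimates, co_fix_counts) are hypothetical, and the proportional "spillover" credit is an assumed model of how often inspecting one file leads to fixes in its historically co-fixed neighbors. The paper's actual approaches, including the search-based ones, may differ substantially.

```python
# Minimal sketch (assumptions flagged below) of a co-fix-aware, greedy
# QA prioritizer. NOT the paper's algorithm: the proportional "spillover"
# model of co-fixing and all names here are hypothetical.

def rank_cofix_aware(defect_estimates, co_fix_counts, budget):
    """Pick up to `budget` files, crediting each pick with the defects
    history suggests would be co-fixed in related files, and re-ranking
    after every pick against the updated remaining-defect counts.

    defect_estimates: dict file -> predicted number of defects
    co_fix_counts:    dict (file_a, file_b) sorted pair -> number of past
                      fixes that modified both files together
    budget:           number of files the QA team can inspect
    """
    def pair(a, b):
        return tuple(sorted((a, b)))

    # Total historical fixes touching each file (floor of 1 avoids
    # division by zero for files with no co-fix history).
    total_fixes = {
        f: max(1, sum(c for (a, b), c in co_fix_counts.items() if f in (a, b)))
        for f in defect_estimates
    }

    remaining = dict(defect_estimates)  # defects still unaddressed per file
    ranking = []

    for _ in range(budget):
        candidates = [f for f in remaining if f not in ranking]
        if not candidates:
            break

        def gain(f):
            # Defects found directly in f, plus a proportional share of the
            # remaining defects in every file g that was co-fixed with f.
            spillover = sum(
                remaining[g] * co_fix_counts.get(pair(f, g), 0) / total_fixes[f]
                for g in candidates if g != f
            )
            return remaining[f] + spillover

        best = max(candidates, key=gain)
        ranking.append(best)

        # Dynamic step: inspecting `best` clears its own defects and shrinks
        # the remaining defects of files historically co-fixed with it.
        remaining[best] = 0.0
        for g in remaining:
            if g != best:
                share = co_fix_counts.get(pair(best, g), 0) / total_fixes[best]
                remaining[g] *= (1.0 - share)

    return ranking


# Hypothetical toy input: A and B are frequently fixed together, so
# inspecting A.java also credits (and removes) most of B.java's defects.
estimates = {"A.java": 3.0, "B.java": 2.0, "C.java": 1.0}
cofix = {("A.java", "B.java"): 4, ("B.java", "C.java"): 1}
print(rank_cofix_aware(estimates, cofix, budget=2))  # e.g. ['A.java', 'C.java']
```

Under the evaluation proposed in the abstract, such a ranking would then be scored by the density of remaining defects per file once the budget is spent, rather than by the raw count of defects in the selected files alone.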
