Test coverage of impacted code elements for detecting refactoring faults: An exploratory study

Abstract Validating refactorings through testing is critical for quality in agile development. However, this activity can be misleading when the test suite is not robust enough to reveal faults; refactoring faults in particular can be subtle and hard to detect. Coverage analysis is a standard practice for assessing the fault detection capability of a test suite, yet coverage usually correlates only weakly with fault detection. In this paper, we present an exploratory study on using coverage data of the most heavily impacted code elements to identify shortcomings in a test suite. We consider three real open-source projects and their original test suites. The results show that a test suite that does not directly call the refactored method and/or its callers is more likely to miss the fault. Additional analysis of branch coverage at the test-case level shows that the chances of detecting a refactoring fault are higher when branch coverage is high. These results provide evidence that combining impact analysis with branch coverage can be highly effective in detecting faults introduced by refactoring edits. Furthermore, we propose a statistical model that captures the correlation between coverage of certain code elements and a suite's capability of revealing refactoring faults.
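As a rough illustration of the heuristic the abstract describes, the Java sketch below checks whether any test in a suite directly reaches the refactored method or one of its direct callers, and reports branch coverage for the tests that do. All class, method, and test names, the call graph, and the coverage numbers are hypothetical, invented for illustration; this is not the authors' tooling, only a minimal mirror of the idea under study.

import java.util.*;

public class RefactoringCoverageCheck {

    // Per-test coverage record: which methods a test executes, plus its branch coverage.
    record TestCoverage(String testName, Set<String> coveredMethods, double branchCoverage) {}

    // Impacted set = the refactored method plus its direct callers (from a call graph).
    static Set<String> impactedSet(String refactoredMethod, Map<String, Set<String>> callers) {
        Set<String> impacted = new HashSet<>();
        impacted.add(refactoredMethod);
        impacted.addAll(callers.getOrDefault(refactoredMethod, Set.of()));
        return impacted;
    }

    // A suite is flagged "at risk" when no test covers any element of the impacted set.
    static boolean suiteAtRisk(List<TestCoverage> suite, Set<String> impacted) {
        return suite.stream()
                    .noneMatch(t -> !Collections.disjoint(t.coveredMethods(), impacted));
    }

    public static void main(String[] args) {
        // Hypothetical call graph: Billing.total was refactored (e.g., Extract Method);
        // Invoice.print and Report.summary call it directly.
        Map<String, Set<String>> callers = Map.of(
            "Billing.total", Set.of("Invoice.print", "Report.summary"));

        List<TestCoverage> suite = List.of(
            new TestCoverage("testPrint",  Set.of("Invoice.print", "Billing.total"), 0.85),
            new TestCoverage("testHeader", Set.of("Report.header"),                  0.40));

        Set<String> impacted = impactedSet("Billing.total", callers);
        System.out.println("Impacted elements: " + impacted);
        System.out.println("Suite at risk of missing the fault: " + suiteAtRisk(suite, impacted));

        // Per the study's findings, high branch coverage on the tests that do reach
        // the impacted set further raises the chance of detecting the fault.
        suite.stream()
             .filter(t -> !Collections.disjoint(t.coveredMethods(), impacted))
             .forEach(t -> System.out.printf("%s reaches the impacted set (branch coverage %.0f%%)%n",
                                             t.testName(), t.branchCoverage() * 100));
    }
}

A real analysis would derive the caller map from a change impact analysis tool and the per-test coverage from an instrumented run; the threshold for "at risk" here is the simplest possible one (no direct call at all), matching the condition the abstract highlights.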
