Reflection-aware static regression test selection

Regression test selection (RTS) aims to speed up regression testing by rerunning only the tests that are affected by code changes. RTS can be performed using static or dynamic analysis techniques. Our prior study showed that static and dynamic RTS perform similarly for medium-sized Java projects. However, that study also showed that static RTS can be unsafe, failing to select tests that dynamic RTS selects, and that reflection was the only cause of unsafety observed among the evaluated projects. In this paper, we investigate five techniques (three purely static and two hybrid static-dynamic) that aim to make static RTS safe with respect to reflection. We implement these reflection-aware (RA) techniques by extending the reflection-unaware (RU) class-level static RTS technique in a tool called STARTS. To evaluate the RA techniques, we compare their end-to-end times with those of RU and of RetestAll, which reruns all tests after every code change. We also compare the safety and precision of the RA techniques with those of Ekstazi, a state-of-the-art dynamic RTS technique; precision is a measure of unaffected tests selected. Our evaluation on 1173 versions of 24 open-source Java projects shows negative results: the RA techniques improve the safety of RU, but at very high cost. The purely static techniques are safe in our experiments but decrease the precision of RU, with end-to-end time at best 85.8% of RetestAll time, versus 69.1% for RU. One hybrid static-dynamic technique improves the safety of RU but at high cost, with an end-to-end time that is 91.2% of RetestAll time. The other hybrid static-dynamic technique provides better precision, is safer than RU, and incurs a lower end-to-end time (75.8% of RetestAll time), but it can still be unsafe in the presence of test-order dependencies. Our study highlights the challenges involved in making static RTS safe with respect to reflection.
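To make the safety problem concrete, the following is a minimal sketch of class-level static selection over a class dependency graph. It is an illustration under stated assumptions, not the STARTS implementation: the class name `RtsSketch`, the methods `reachable` and `select`, and the example classes (`ParserTest`, `PluginTest`, `PluginLoader`, `JsonPlugin`) are all hypothetical. The sketch assumes that `PluginLoader` loads `JsonPlugin` only through reflection, so a reflection-unaware graph has no edge between them, and that a reflection-aware analysis has somehow recovered that edge.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of class-level static RTS; not the STARTS code.
final class RtsSketch {

    /** Returns all classes reachable from start along dependency edges, including start. */
    static Set<String> reachable(String start, Map<String, Set<String>> deps) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String current = work.pop();
            if (seen.add(current)) {
                for (String dep : deps.getOrDefault(current, Set.of())) {
                    work.push(dep);
                }
            }
        }
        return seen;
    }

    /** Selects every test class whose transitive dependencies include a changed class. */
    static Set<String> select(List<String> tests,
                              Map<String, Set<String>> deps,
                              Set<String> changed) {
        Set<String> selected = new TreeSet<>();
        for (String test : tests) {
            Set<String> closure = reachable(test, deps);
            for (String c : changed) {
                if (closure.contains(c)) {
                    selected.add(test);
                    break;
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> tests = List.of("ParserTest", "PluginTest");
        Set<String> changed = Set.of("JsonPlugin");

        // Reflection-unaware graph: only edges visible in compiled code.
        Map<String, Set<String>> ruGraph = Map.of(
                "ParserTest", Set.of("Parser"),
                "PluginTest", Set.of("PluginLoader"));

        // Reflection-aware graph: assumes the reflective edge was recovered.
        Map<String, Set<String>> raGraph = Map.of(
                "ParserTest", Set.of("Parser"),
                "PluginTest", Set.of("PluginLoader"),
                "PluginLoader", Set.of("JsonPlugin"));

        System.out.println("RU selects: " + select(tests, ruGraph, changed)); // prints "RU selects: []"
        System.out.println("RA selects: " + select(tests, raGraph, changed)); // prints "RA selects: [PluginTest]"
    }
}
```

In this sketch the reflection-unaware graph misses `PluginTest` even though the test exercises the changed class at run time, which is the kind of unsafety the abstract describes. Adding reflective edges restores safety here. If the recovered edges over-approximate the real reflective targets, more unaffected tests are selected, which is consistent with the precision loss reported for the purely static techniques.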
