Regression test selection across JVM boundaries

Modern software development processes recommend integrating changes into a project's main development line multiple times a day. Before a new revision is integrated, developers run regression tests to ensure that the latest changes do not break previously established functionality. Regression testing is costly because of the increase in both the number of revisions introduced per day and the number of tests developers write per revision. Regression test selection (RTS) optimizes regression testing by skipping tests that are not affected by recent project changes. Existing dynamic RTS techniques support only projects written in a single programming language, a significant limitation given that an open-source project is, on average, written in several programming languages. We present the first dynamic RTS technique that does not stop at predefined language boundaries. Our technique dynamically detects, at the operating system level, all file artifacts a test depends on. It is therefore oblivious to how the test actually accesses those files: by spawning a new process, invoking a system call, invoking a library written in a different language, invoking a library that spawns a process which makes a system call, and so on. We also provide a set of extension points that allow smooth integration with testing frameworks and build systems. We implemented our technique as a loadable Linux kernel module in a tool called RTSLinux and evaluated it on 21 Java projects that escape the JVM by spawning new processes or invoking native code, totaling 2,050,791 lines of code. Our results show that RTSLinux, on average, skips 74.17% of tests and saves 52.83% of test execution time compared to executing all tests.
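The selection step that file-level dynamic RTS relies on can be illustrated with a minimal sketch. This is not RTSLinux's implementation: the kernel-level dependency collection is replaced here by a hand-written dependency map, and comparing SHA-256 checksums to detect changed files is an assumption made for illustration. All names (`checksum`, `snapshot`, `select_tests`, `demo`, and the file names) are hypothetical.

```python
import hashlib
import os
import tempfile


def checksum(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    if not os.path.isfile(path):
        return None
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def snapshot(paths):
    """Record the current checksum of every file a test was observed to access."""
    return {path: checksum(path) for path in paths}


def select_tests(recorded):
    """Select tests with at least one dependency whose checksum has changed.

    `recorded` maps a test name to the {path: checksum} snapshot taken when
    the test last ran. Tests whose dependencies are all unchanged are skipped.
    """
    selected = []
    for test, deps in recorded.items():
        if any(checksum(path) != old for path, old in deps.items()):
            selected.append(test)
    return sorted(selected)


def demo():
    """Build two fake artifacts, record dependencies, change one, and select."""
    with tempfile.TemporaryDirectory() as root:
        class_file = os.path.join(root, "Parser.class")   # hypothetical JVM artifact
        native_lib = os.path.join(root, "libcodec.so")    # hypothetical native artifact
        for path in (class_file, native_lib):
            with open(path, "wb") as f:
                f.write(b"v1")

        # Assumed output of dependency collection: which files each test touched.
        recorded = {
            "test_parser": snapshot([class_file]),
            "test_native": snapshot([class_file, native_lib]),
        }

        # Simulate a new revision that changes only the native library.
        with open(native_lib, "wb") as f:
            f.write(b"v2")

        return select_tests(recorded)


if __name__ == "__main__":
    print(demo())
```

In this sketch only the test that depends on the modified native library is selected. A single-language technique that tracks only JVM class files would miss that dependency, which is the gap that collecting dependencies at the operating system level is meant to close.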
