Defects4J as a Challenge Case for the Search-Based Software Engineering Community

Defects4J is a collection of reproducible bugs, extracted from real-world Java software systems, together with a supporting infrastructure for using these bugs. It has been widely used to evaluate software engineering research, including work on automated test generation, program repair, and fault localization, and it has recently grown substantially in both the number of software systems and the number of bugs. This report proposes that Defects4J can serve as a benchmark for Search-Based Software Engineering (SBSE) research as well as a catalyst for new innovations. Specifically, it outlines the current Defects4J dataset and infrastructure, and details how they can serve as a challenge case that supports SBSE research and expands Defects4J itself.
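To make the mention of the supporting infrastructure concrete, the sketch below (a minimal illustration, not part of the report itself) drives the Defects4J command-line tool from Python to reproduce a single bug. It assumes a local Defects4J installation with the defects4j command on the PATH; the project name "Lang", bug id 1, and the working directory are illustrative choices.

```python
import subprocess
from pathlib import Path

# Assumes Defects4J is installed and its `defects4j` command is on the PATH.
# "Lang" and bug id 1 are illustrative; Defects4J version ids end in "b"
# (buggy) or "f" (fixed).
WORKDIR = Path("/tmp/lang_1_buggy")

def run(args, cwd=None):
    """Run a command, echoing it first, and fail loudly on a non-zero exit."""
    print("$", " ".join(args))
    subprocess.run(args, cwd=cwd, check=True)

# Check out the buggy version of Lang bug #1 into a working directory.
run(["defects4j", "checkout", "-p", "Lang", "-v", "1b", "-w", str(WORKDIR)])

# Compile the checked-out sources, then run the developer-written test
# suite; the failing tests expose the defect under study.
run(["defects4j", "compile"], cwd=WORKDIR)
run(["defects4j", "test"], cwd=WORKDIR)
```

Wrapping the CLI in a small script like this is how SBSE experiments typically iterate over many Defects4J bugs; swapping the test step for a generated suite or a repair tool's output follows the same pattern.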
