Longitudinal Analysis of the Applicability of Program Repair on Past Commits

The applicability of program repair in the real world is a little-researched topic. Existing program repair systems tend to be tested only on small bug datasets, such as Defects4J, that are not fully representative of real-world projects. In this paper, we report on a longitudinal analysis of software repositories to investigate whether past commits are amenable to program repair. Our key insight is to compute whether or not a commit lies in the search space of program repair systems. For this purpose, we present RSCommitDetector, which takes a Git repository as input and, after performing a series of static analyses, outputs a list of commits whose corresponding source code changes could likely be generated by notable repair systems. We call these commits the ``repair-space commits'', meaning that they are considered to be in the search space of a repair system. Using RSCommitDetector, we conduct a study on $41,612$ commits from the history of $72$ GitHub repositories. The results of this study show that $1.77\%$ of these commits are repair-space commits: they lie in the search space of at least one of the eight repair systems we consider. We use an original methodology to validate our approach and show that the precision and recall of RSCommitDetector are $77\%$ and $92\%$, respectively. To our knowledge, this is the first study of the applicability of program repair with search space analysis.
