Automatically finding patches using genetic programming

Automatic program repair has been a longstanding goal in software engineering, yet debugging remains a largely manual process. We introduce a fully automated method for locating and repairing bugs in software. The approach works on off-the-shelf legacy applications and does not require formal specifications, program annotations, or special coding practices. Once a program fault is discovered, an extended form of genetic programming is used to evolve program variants until one is found that both retains the required functionality and avoids the defect in question. Standard test cases are used to exercise the fault and to encode program requirements. After a successful repair has been discovered, it is minimized using structural differencing algorithms and delta debugging. We describe the proposed method and report experimental results demonstrating that it can successfully repair ten different C programs totaling 63,000 lines in under 200 seconds, on average.
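
The pipeline described above (evolve variants, score them against test cases, then minimize the winning repair) can be illustrated with a toy sketch in Python. This is not the authors' implementation. Every name, the three-line "program", the test suite, the edit-list representation, the fitness weighting, and the one-at-a-time minimization (a simplified stand-in for delta debugging) are assumptions made for illustration, and the search uses mutation and selection only, with no crossover.

```python
import random

# Toy "program": statements executed in order. The last line is the defect.
ORIGINAL = [
    "result = a",
    "if b > a: result = b",
    "result = a",  # buggy leftover that overwrites the correct answer
]

# Hypothetical test suite for max(a, b), stored as ((a, b), expected).
POSITIVE = [((3, 1), 3), ((2, 2), 2)]  # required behavior, already passing
NEGATIVE = [((1, 5), 5)]               # exercises the fault
NEG_WEIGHT = 10                        # assumed weighting, not from the source
MAX_FITNESS = len(POSITIVE) + NEG_WEIGHT * len(NEGATIVE)


def apply_edits(edits):
    """Build a variant by applying (op, i, j) edits to a copy of ORIGINAL."""
    program = list(ORIGINAL)
    for op, i, j in edits:
        if op == "insert":  # reuse an existing statement of the program
            program.insert(i % (len(program) + 1), ORIGINAL[j])
        elif program:
            i %= len(program)
            if op == "delete":
                del program[i]
            else:  # swap two statements
                k = j % len(program)
                program[i], program[k] = program[k], program[i]
    return program


def run(program, a, b):
    """Execute a variant; return its result, or None if it crashes."""
    env = {"a": a, "b": b}
    try:
        exec("\n".join(program), {}, env)
    except Exception:
        return None
    return env.get("result")


def fitness(edits):
    """Weighted count of passing tests for the variant built from edits."""
    program = apply_edits(edits)
    passed_pos = sum(run(program, *args) == want for args, want in POSITIVE)
    passed_neg = sum(run(program, *args) == want for args, want in NEGATIVE)
    return passed_pos + NEG_WEIGHT * passed_neg


def mutate(edits, rng):
    """Append one random edit to an existing edit list."""
    n = len(ORIGINAL)
    op = rng.choice(["delete", "insert", "swap"])
    return edits + [(op, rng.randrange(n), rng.randrange(n))]


def repair(rng, pop_size=20, generations=50):
    """Evolve edit lists until one passes every test, or give up."""
    population = [mutate([], rng) for _ in range(pop_size)]
    for _ in range(generations):
        for candidate in population:
            if fitness(candidate) == MAX_FITNESS:
                return candidate
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return None


def minimize(edits):
    """Drop edits one at a time while the variant still passes all tests."""
    edits = list(edits)
    i = 0
    while i < len(edits):
        trial = edits[:i] + edits[i + 1:]
        if fitness(trial) == MAX_FITNESS:
            edits, i = trial, 0  # restart: other edits may now be removable
        else:
            i += 1
    return edits


if __name__ == "__main__":
    patch = repair(random.Random(0))
    print("raw patch:", patch)
    print("minimized:", minimize(patch) if patch else None)
```

In this sketch the positive tests play the role of encoding required functionality and the negative test plays the role of exercising the fault. Representing a variant as a list of edits against the original is a convenience chosen here so that minimization reduces to removing edits; the abstract itself only states that structural differencing and delta debugging are used.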
