Repairing Programs with Semantic Code Search

Automated program repair can potentially reduce debugging costs and improve software quality, but recent studies have drawn attention to shortcomings in the quality of automatically generated repairs. We propose a new kind of repair that uses the large body of existing open-source code to find potential fixes. The key challenges lie in efficiently finding code that is semantically similar (but not identical) to the defective code and then appropriately integrating that code into a buggy program. We present SearchRepair, a repair technique that addresses these challenges by (1) encoding a large database of human-written code fragments as SMT constraints on input-output behavior, (2) localizing a given defect to likely buggy program fragments and deriving the desired input-output behavior for code to replace those fragments, (3) using state-of-the-art constraint solvers to search the database for fragments that satisfy that desired behavior and replacing the likely buggy code with these potential patches, and (4) validating against program test suites that the patches repair the bug. We find that SearchRepair repairs 150 (19%) of 778 benchmark C defects written by novice students, 20 of which are not repaired by GenProg, TrpAutoRepair, and AE. We compare the quality of the patches generated by the four techniques by measuring how many independent tests, not used during repair, the patched programs pass, and find that SearchRepair-repaired programs pass 97.3% of the tests, on average, whereas GenProg-, TrpAutoRepair-, and AE-repaired programs pass 68.7%, 72.1%, and 64.2% of the tests, respectively. We conclude that SearchRepair produces higher-quality repairs than GenProg, TrpAutoRepair, and AE, and repairs some defects that those techniques do not.
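The search-and-validate portion of this pipeline (steps 3 and 4) can be illustrated with a minimal sketch. It is not the SearchRepair implementation: direct execution of candidate fragments stands in for the SMT encoding and constraint solving described above, and the fragment database, the function names (`search`, `validate`, `repair`), and the example inputs are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A fragment here is simply a function of three ints (an assumption made
# for illustration; the abstract describes fragments encoded as SMT
# constraints on input-output behavior).
Fragment = Callable[[int, int, int], int]
IOExample = Tuple[Tuple[int, int, int], int]

# Hypothetical database of human-written fragments.
FRAGMENTS: Dict[str, Fragment] = {
    "max_of_three": lambda a, b, c: max(a, b, c),
    "min_of_three": lambda a, b, c: min(a, b, c),
    "median_of_three": lambda a, b, c: sorted((a, b, c))[1],
    "first": lambda a, b, c: a,
}


def search(examples: List[IOExample]) -> List[str]:
    """Return the fragments consistent with every desired input-output pair.

    Brute-force evaluation replaces the constraint-solver query here.
    """
    return [
        name
        for name, frag in FRAGMENTS.items()
        if all(frag(*inputs) == output for inputs, output in examples)
    ]


def validate(name: str, tests: List[IOExample]) -> bool:
    """Check a candidate patch against the program's test suite."""
    return all(FRAGMENTS[name](*inputs) == output for inputs, output in tests)


def repair(examples: List[IOExample], tests: List[IOExample]) -> Optional[str]:
    """Return the first searched candidate that passes all tests, if any."""
    for name in search(examples):
        if validate(name, tests):
            return name
    return None
```

For instance, a single desired pair such as `((2, 2, 2), 2)` is satisfied by every fragment in this toy database, so the search alone is not enough; the validation step against a test like `((1, 2, 3), 2)` is what narrows the candidates to `median_of_three`.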
