The effectiveness of context-based change application on automatic program repair

An Automatic Program Repair (APR) technique implements a repair model that fixes a given bug by modifying program behavior. Recently, repair models that collect source code and code changes from software history and use these resources for patch generation have become more popular. The collected resources expand the patch search space and increase the probability that it contains correct patches for bugs. However, navigating such an expanded search space has proven difficult because correct patches are sparse within it. In this study, we evaluate the effectiveness of the Context-based Change Application (CCA) technique on change selection, fix location selection, and change concretization, which are the key aspects of navigating the patch search space. CCA collects abstract subtree changes together with their AST contexts, and applies a change to a fix location only if their contexts match. The CCA repair model can therefore address both search space expansion and navigation: it expands the search space with collected changes while using contexts to narrow down the areas to search. Because a change is applied to a fix location only when the contexts match, CCA needs to consider only the changes that share a fix location's context. Moreover, if no change shares the context of a fix location, that location can be ignored, since past patches did not modify such locations. In addition, CCA uses fine-grained changes that preserve the changed code structure but normalize user-defined names. Change concretization can hence be done simply by replacing normalized names with concrete names available in the buggy code. We evaluated CCA's effectiveness with over 54K unique collected changes (221K in total) from about 5K human-written patches. Results show that, using contexts, CCA correctly found 90.1% of the changes required for test set patches, while fewer than 5% of the changes were found without contexts. We discovered that collecting more changes is helpful only if it is supported by contexts for effective search space navigation. In addition, the CCA repair model found 44-70% of the actual fix locations of Defects4J patches more quickly than spectrum-based fault localization (SBFL) techniques alone. We also found that about 48% of the patches can be fully concretized using concrete names from the buggy code.
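The context-matching and concretization steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the context representation (a tuple of parent, left sibling, and right sibling node types), the `var0`-style placeholder naming, and the names `ChangePool` and `concretize` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import permutations
import re

PLACEHOLDER = re.compile(r"\bvar\d+\b")  # assumed form of normalized names


class ChangePool:
    """Toy index of normalized changes, keyed by a hypothetical AST context."""

    def __init__(self):
        self._by_context = defaultdict(list)

    def add(self, context, template):
        # Keep each unique change once per context.
        if template not in self._by_context[context]:
            self._by_context[context].append(template)

    def candidates(self, context):
        # Only same-context changes are returned; an empty list means
        # the fix location can be skipped.
        return list(self._by_context.get(context, []))


def concretize(template, names_in_scope):
    """Replace normalized names with every assignment of concrete names."""
    placeholders = sorted(set(PLACEHOLDER.findall(template)))
    results = []
    for combo in permutations(names_in_scope, len(placeholders)):
        mapping = dict(zip(placeholders, combo))
        results.append(PLACEHOLDER.sub(lambda m: mapping[m.group(0)], template))
    return results


# Hypothetical context: (parent type, left sibling type, right sibling type).
if_condition = ("IfStatement", None, "Block")
pool = ChangePool()
pool.add(if_condition, "var0 != null && var0.isEmpty()")

for template in pool.candidates(if_condition):
    patches = concretize(template, ["name", "items"])
```

In this sketch, a fix location whose context has no entry in the pool yields no candidates, and a template that needs more distinct names than the buggy code offers yields no concrete patches, which mirrors the case where a change cannot be fully concretized.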
