Validation of Automatically Generated Patches: An Appetizer

In test-case-based automated program repair (APR), patches that pass all the test cases but fail to actually fix the bug are called test-case-overfitted patches. Currently, overfitted patches have to be identified through manual inspection by users. Because this labor-intensive activity hinders widespread adoption of APR tools, automatic validation of APR-generated patches has been a topic of research in recent years. In this paper, we point out limitations of the existing techniques and methodologies that call for further research, and introduce two promising directions toward effective automatic patch validation. (1) Motivated by the relative effectiveness of anti-patterns, we propose to use statistical techniques to avoid the uncomputability of applying some of the anti-pattern rules and thereby automate the technique. Our results show that we achieve at least 57% precision. (2) We present a proposal for a semi-automatic technique that helps programmers find properties of the patched methods and stress-test the patches based on those properties, so as to filter out as many overfitted patches as possible.
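
To make the notion of an overfitted patch and of property-based stress testing concrete, the following is a minimal illustrative sketch, not the technique proposed in the paper. Every name in it (`clamp_overfitted`, `clamp_correct`, `TESTS`, `find_property_violation`) is hypothetical, and the two properties are assumed to have been supplied by a programmer. It shows a patch that passes the whole test suite yet is rejected once random inputs are checked against the properties.

```python
import random

# Hypothetical method under repair: clamp x into the range [lo, hi].
def clamp_overfitted(x, lo, hi):
    # Overfitted patch (illustrative): special-cases the one failing test input.
    if x == 11:
        return 10
    return lo if x < lo else x

def clamp_correct(x, lo, hi):
    # A patch that fixes the bug in general.
    return max(lo, min(x, hi))

# The test suite available to the repair tool (assumed for this sketch).
TESTS = [((5, 0, 10), 5), ((-1, 0, 10), 0), ((11, 0, 10), 10)]

def passes_tests(patch):
    """Return True if the patch passes every test case in TESTS."""
    return all(patch(*args) == expected for args, expected in TESTS)

def find_property_violation(patch, trials=1000, seed=0):
    """Stress-test a patch on random inputs; return a violating input or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        lo = rng.randint(-50, 50)
        hi = rng.randint(lo, lo + 50)
        x = rng.randint(-200, 200)
        y = patch(x, lo, hi)
        # Property 1: the result always lies within [lo, hi].
        if not (lo <= y <= hi):
            return (x, lo, hi)
        # Property 2: inputs already inside the range are returned unchanged.
        if lo <= x <= hi and y != x:
            return (x, lo, hi)
    return None

if __name__ == "__main__":
    for patch in (clamp_overfitted, clamp_correct):
        print(patch.__name__, passes_tests(patch), find_property_violation(patch))
```

Both patches pass the test suite, so the tests alone cannot tell them apart; only the property check separates the overfitted patch from the correct one.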
