Fix-Filter-Fix: Intuitively Connect Any Models for Effective Bug Fixing

Locating and fixing bugs is a time-consuming task. Most neural machine translation (NMT) based approaches for automatically bug fixing lack generality and do not make full use of the rich information in the source code. In NMT-based bug fixing, we find some predicted code identical to the input buggy code (called unchanged fix) in NMT-based approaches due to high similarity between buggy and fixed code (e.g., the difference may only appear in one particular line). Obviously, unchanged fix is not the correct fix because it is the same as the buggy code that needs to be fixed. Based on these, we propose an intuitive yet effective general framework (called Fix-Filter-Fix or Fˆ3) for bug fixing. Fˆ3 connects models with our filter mechanism to filter out the last model’s unchanged fix to the next. We propose an Fˆ3 theory that can quantitatively and accurately calculate the Fˆ3 lifting effect. To evaluate, we implement the Seq2Seq Transformer (ST) and the AST2Seq Transformer (AT) to form some basic Fˆ3 instances, called Fˆ3_ST+AT and Fˆ3_AT+ST. Comparing them with single model approaches and many model connection baselines across four datasets validates the effectiveness and generality of Fˆ3 and corroborates our findings and methodology.

[1]  Baishakhi Ray,et al.  CODIT: Code Editing With Tree-Based Neural Models , 2018, IEEE Transactions on Software Engineering.

[2]  Denys Poshyvanyk,et al.  SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair , 2018, IEEE Transactions on Software Engineering.

[3]  Yitong Li,et al.  CoCoNuT: combining context-aware neural translation models using ensemble for program repair , 2020, ISSTA.

[4]  Andrew Rice,et al.  Learning to Fix Build Errors with Graph2Diff Neural Networks , 2019, ICSE.

[5]  Ali Mesbah,et al.  DeepDelta: learning to repair compilation errors , 2019, ESEC/SIGSOFT FSE.

[6]  Myle Ott,et al.  fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.

[7]  Gabriele Bavota,et al.  An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation , 2018, ACM Trans. Softw. Eng. Methodol..

[8]  Li Li,et al.  Watch out for this commit! A study of influential software changes , 2016, J. Softw. Evol. Process..

[9]  Hongyu Zhang,et al.  Shaping program repair space with existing patches and similar code , 2018, ISSTA.

[10]  Martin Monperrus,et al.  The CodRep Machine Learning on Source Code Competition , 2018, ArXiv.

[11]  Sumit Gulwani,et al.  Compilation Error Repair: For the Student Programs, From the Student Programs , 2018, 2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET).

[12]  Rahul Gupta,et al.  DeepFix: Fixing Common C Language Errors by Deep Learning , 2017, AAAI.

[13]  Gabriele Bavota,et al.  An empirical study on developer‐related factors characterizing fix‐inducing commits , 2017, J. Softw. Evol. Process..

[14]  F. Schweitzer,et al.  From Aristotle to Ringelmann: a large-scale analysis of team productivity and coordination in Open Source Software projects , 2016, Empirical Software Engineering.

[15]  Zhendong Su,et al.  An Empirical Study on Real Bug Fixes , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[16]  Matias Martinez,et al.  CVS-Vintage: A Dataset of 14 CVS Repositories of Java Software , 2012 .

[17]  Osamu Mizuno,et al.  Bug prediction based on fine-grained module histories , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[18]  Jian Zhou,et al.  Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports , 2012, 2012 34th International Conference on Software Engineering (ICSE).