Automatic Code Review by Learning the Revision of Source Code

Code review is the manual inspection of a revision to source code in order to determine whether the revised code meets the revision requirements. However, manual code review is time-consuming, and automating the code review process would alleviate the burden on code reviewers and speed up software maintenance. To construct a model for automatic code review, the characteristics of source code revisions (i.e., the difference between the two pieces of source code) must be properly captured and modeled. Unfortunately, most existing techniques can easily model the overall correlation between two pieces of source code, but not the “difference” between them. In this paper, we propose a novel deep model named DACE for automatic code review. The model learns revision features by contrasting the revised hunks from the original and revised source code with respect to the code context containing the hunks. Experimental results on six open source software projects indicate that, by learning the revision features, DACE can outperform the competing approaches in automatic code review.
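To make the terms "hunk" and "code context" concrete, the following is a minimal sketch of how the changed lines of a revision and their surrounding lines could be paired up before being fed to any model. It is not DACE's preprocessing, which the abstract does not describe; the function name `extract_hunks`, the `context` parameter, and the toy `area` snippet are all hypothetical, and the line-based diff comes from Python's standard `difflib` module.

```python
import difflib


def extract_hunks(original, revised, context=1):
    """Pair each changed region of `original` with its counterpart in `revised`.

    Hypothetical helper for illustration only: returns one dict per changed
    region, holding the original lines, the revised lines, and up to
    `context` unchanged lines on either side (taken from the original file).
    """
    a = original.splitlines()
    b = revised.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    hunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged regions are context, not hunks
        hunks.append({
            "context_before": a[max(0, i1 - context):i1],
            "original_hunk": a[i1:i2],
            "revised_hunk": b[j1:j2],
            "context_after": a[i2:i2 + context],
        })
    return hunks


# Toy revision (assumed example): a single operator is changed.
original = """def area(w, h):
    result = w + h
    return result
"""
revised = """def area(w, h):
    result = w * h
    return result
"""

hunks = extract_hunks(original, revised)
for hunk in hunks:
    print(hunk)
```

Each resulting record keeps the original hunk, the revised hunk, and the shared context side by side, which is the kind of paired input a model would need in order to contrast the two versions rather than only measure their overall similarity.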
