LSTM-based argument recommendation for non-API methods

Automatic code completion is one of the most useful features of modern IDEs, and argument recommendation, a special kind of code completion, is widely used as well. Existing approaches focus on argument recommendation for popular APIs, yet a large number of non-API invocations call for accurate argument recommendation too. To this end, we propose an LSTM-based approach that recommends arguments for non-API methods as soon as a method call is typed. Using data collected from a large corpus of open-source applications, we train an LSTM neural network to recommend actual arguments based on the identifiers of the invoked method, the corresponding formal parameter, and a list of syntactically correct candidate arguments. To feed these identifiers into the network, we convert them into fixed-length vectors with Paragraph Vector, an unsupervised neural-network-based learning algorithm. Given a call site, the trained network predicts which of the candidate arguments is most likely to be the correct one. We evaluate the proposed approach with tenfold validation on 85 open-source C applications. The results suggest that the proposed approach outperforms state-of-the-art approaches in recommending non-API arguments, improving precision significantly, from 71.46% to 83.37%.
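The ranking step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: a bag-of-subtokens vector stands in for Paragraph Vector, and cosine similarity stands in for the trained LSTM scorer. The function names (`subtokens`, `build_vocab`, `embed`, `rank_candidates`) and the sample identifiers are hypothetical and chosen only for illustration.

```python
import math
import re


def subtokens(identifier):
    """Split an identifier on underscores and camelCase boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def build_vocab(identifiers):
    """Assign each distinct subtoken a fixed position in the vector."""
    vocab = {}
    for ident in identifiers:
        for tok in subtokens(ident):
            vocab.setdefault(tok, len(vocab))
    return vocab


def embed(identifier, vocab):
    """Fixed-length bag-of-subtokens vector (assumption: a stand-in for
    Paragraph Vector, which would be learned from a corpus)."""
    vec = [0.0] * len(vocab)
    for tok in subtokens(identifier):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(method_name, parameter_name, candidates):
    """Order candidate arguments by similarity to the call-site context.

    Assumption: cosine similarity replaces the trained LSTM, which would
    instead output a learned likelihood for each candidate.
    """
    vocab = build_vocab([method_name, parameter_name] + list(candidates))
    method_vec = embed(method_name, vocab)
    param_vec = embed(parameter_name, vocab)
    query = [m + p for m, p in zip(method_vec, param_vec)]
    return sorted(
        candidates,
        key=lambda c: cosine(query, embed(c, vocab)),
        reverse=True,
    )


# Hypothetical call site: copy_buffer(..., <length>) with three candidates.
ranked = rank_candidates("copy_buffer", "length", ["src_ptr", "buf_length", "flags"])
print(ranked[0])  # buf_length
```

In this sketch the candidate sharing the subtoken `length` with the formal parameter ranks first. A learned model would be expected to capture weaker cues as well, such as abbreviations or naming habits that share no literal subtoken.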
