DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

Computer programs written in one language are often required to be ported to other languages to support multiple devices and environments. When programs use language specific APIs (Application Programming Interfaces), it is very challenging to migrate these APIs to the corresponding APIs written in other languages. Existing approaches mine API mappings from projects that have corresponding versions in two languages. They rely on the sparse availability of bilingual projects, thus producing a limited number of API mappings. In this paper, we propose an intelligent system called DeepAM for automatically mining API mappings from a large-scale code corpus without bilingual projects. The key component of DeepAM is based on the multimodal sequence to sequence learning architecture that aims to learn joint semantic representations of bilingual API sequences from big source code data. Experimental results indicate that DeepAM significantly increases the accuracy of API mappings as well as the number of API mappings, when compared with the state-of-the-art approaches.

[1]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[2]  Xiaodong Gu,et al.  Deep API learning , 2016, SIGSOFT FSE.

[3]  Eva Ceulemans,et al.  Proceedings of COMPSTAT'2010 , 2010 .

[4]  Zhi-Hua Zhou,et al.  Learning Unified Features from Natural and Programming Languages for Locating Buggy Source Code , 2016, IJCAI.

[5]  Anh Tuan Nguyen,et al.  Statistical learning approach for mining API usage mappings for code migration , 2014, ASE.

[6]  Ali Selamat,et al.  Information and Software Technology , 2014 .

[7]  Maxim Mossienko Automated Cobol to Java recycling , 2003, Seventh European Conference onSoftware Maintenance and Reengineering, 2003. Proceedings..

[8]  Laurie A. Williams,et al.  Discovering likely mappings between APIs using text mining , 2015, 2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM).

[9]  Lukás Burget,et al.  Recurrent neural network based language model , 2010, INTERSPEECH.

[10]  Jane Cleland-Huang,et al.  Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering , 2016, FSE 2016.

[11]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[12]  Wei Chen,et al.  Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework , 2015, AAAI.

[13]  Qing Wang,et al.  Mining API mapping for language migration , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[14]  Dongmei Zhang,et al.  CodeHow: Effective Code Search Based on API Understanding and Extended Boolean Model (E) , 2015, 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[15]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[16]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.

[17]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[18]  Ralf Lämmel,et al.  Swing to SWT and back: Patterns for API migration by wrapping , 2010, 2010 IEEE International Conference on Software Maintenance.

[19]  Richard C. Holt,et al.  A lightweight approach for migrating web frameworks , 2005, Inf. Softw. Technol..

[20]  Matthew D. Zeiler ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.

[21]  Michael I. Jordan,et al.  Advances in Neural Information Processing Systems 30 , 1995 .

[22]  Fei-Fei Li,et al.  Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.