Polyglot Semantic Parsing in APIs

Traditional approaches to semantic parsing (SP) train a separate model for each available parallel dataset of text-meaning pairs. In this paper, we explore the idea of polyglot semantic translation, that is, learning semantic parsing models that are trained on multiple datasets and natural languages. In particular, we focus on translating text to code signature representations using the software component datasets of Richardson and Kuhn (2017a,b). The advantage of such models is that a single unified model can parse a wide variety of input natural languages into a variety of output programming languages, and can also handle mixed-language input. To facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and we apply this method to two other benchmark SP tasks.

[1] Satoshi Nakamura, et al. Incorporating Discrete Translation Lexicons into Neural Machine Translation, 2016, EMNLP.

[2] Jayant Krishnamurthy, et al. Neural Semantic Parsing with Type Constraints for Semi-Structured Tables, 2017, EMNLP.

[3] Donald B. Johnson, et al. Efficient Algorithms for Shortest Paths in Sparse Networks, 1977, J. ACM.

[4] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.

[5] Mans Hulden, et al. Foma: a Finite-State Compiler and Library, 2009, EACL.

[6] Premkumar T. Devanbu, et al. A Survey of Machine Learning for Big Code and Naturalness, 2017, ACM Comput. Surv.

[7] Liang Huang, et al. Advanced Dynamic Programming in Semiring and Hypergraph Frameworks, 2008, COLING.

[8] Mark Johnson, et al. Semantic Parsing with Bayesian Tree Transducers, 2012, ACL.

[9] Deniz Yuret, et al. Transfer Learning for Low-Resource Neural Machine Translation, 2016, EMNLP.

[10] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.

[11] Mark Steedman, et al. Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification, 2010, EMNLP.

[12] Guillaume Lample, et al. Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning, 2016, NAACL.

[13] Jonas Kuhn, et al. Function Assistant: A Tool for NL Querying of APIs, 2017, EMNLP.

[14] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[15] Kyle Richardson. A Language for Function Signature Representations, 2018, ArXiv.

[16] Martin Wattenberg, et al. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, 2016, TACL.

[17] Percy Liang, et al. Data Recombination for Neural Semantic Parsing, 2016, ACL.

[18] Luke S. Zettlemoyer, et al. Learning Context-Dependent Mappings from Sentences to Logical Form, 2009, ACL.

[19] Jonathan Berant, et al. Neural Semantic Parsing over Multiple Knowledge-bases, 2017, ACL.

[20] Jörg Tiedemann, et al. Neural machine translation for low-resource languages, 2017, ArXiv.

[21] Mirella Lapata, et al. Learning an Executable Neural Semantic Parser, 2017, CL.

[22] Percy Liang, et al. Lambda Dependency-Based Compositional Semantics, 2013, ArXiv.

[23] Raymond J. Mooney, et al. Learning for Semantic Parsing with Statistical Machine Translation, 2006, NAACL.

[24] Raymond J. Mooney, et al. Learning for Semantic Parsing, 2009, CICLing.

[25] Mirella Lapata, et al. Language to Logical Form with Neural Attention, 2016, ACL.

[26] Andrew Chou, et al. Semantic Parsing on Freebase from Question-Answer Pairs, 2013, EMNLP.

[27] Raymond J. Mooney, et al. Learning to Parse Database Queries Using Inductive Logic Programming, 1996, AAAI/IAAI, Vol. 2.

[28] Yoav Artzi, et al. Neural Shift-Reduce CCG Semantic Parsing, 2016, EMNLP.

[29] Alvin Cheung, et al. Summarizing Source Code using a Neural Attention Model, 2016, ACL.

[30] Mehryar Mohri, et al. On some applications of finite-state automata theory to natural language processing, 1996, Nat. Lang. Eng.

[31] Mark Johnson, et al. Reducing Grounded Learning Tasks To Grammatical Inference, 2011, EMNLP.

[32] Dan Klein, et al. Abstract Syntax Networks for Code Generation and Semantic Parsing, 2017, ACL.

[33] Dominique Estival, et al. Multilingual Semantic Parsing And Code-Switching, 2017, CoNLL.

[34] Grzegorz Chrupala, et al. Semantic approaches to software component retrieval with English queries, 2014, LREC.

[35] Chris Dyer, et al. Semantic Parsing with Semi-Supervised Sequential Autoencoders, 2016, EMNLP.

[36] Rico Sennrich, et al. A Parallel Corpus of Python Functions and Documentation Strings for Automated Code Documentation and Code Generation, 2017, IJCNLP.

[37] Luke S. Zettlemoyer, et al. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars, 2005, UAI.

[38] Kevin Knight, et al. Decoding Complexity in Word-Replacement Translation Models, 1999, Comput. Linguistics.

[39] Xiaochang Peng, et al. Addressing the Data Sparsity Issue in Neural AMR Parsing, 2017, EACL.

[40] R. K. Shyamasundar, et al. Introduction to algorithms, 1996.

[41] Jonas Kuhn, et al. The Code2Text Challenge: Text Generation in Source Libraries, 2017, INLG.

[42] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[43] Raymond J. Mooney, et al. Training a Multilingual Sportscaster: Using Perceptual Context to Learn Language, 2014, J. Artif. Intell. Res.

[44] Ruifang Ge, et al. Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques, 2010.

[45] Mirella Lapata, et al. Learning Structured Natural Language Representations for Semantic Parsing, 2017, ACL.

[46] Xiaodong Gu, et al. Deep API learning, 2016, SIGSOFT FSE.

[47] Wei Lu, et al. Multilingual Semantic Parsing: Parsing Multiple Languages into Semantic Representations, 2014, COLING.

[48] J. Y. Yen, et al. Finding the K Shortest Loopless Paths in a Network, 2007.

[49] M. C. Sinclair, et al. A Comparative Study of k-Shortest Path Algorithms, 1996.

[50] Wei Lu, et al. Semantic Parsing with Neural Hybrid Trees, 2017, AAAI.

[51] Raymond J. Mooney, et al. Learning to sportscast: a test of grounded language acquisition, 2008, ICML '08.

[52] Jacob Andreas, et al. Semantic Parsing as Machine Translation, 2013, ACL.

[53] Yaser Al-Onaizan, et al. Translation with Finite-State Devices, 1998, AMTA.

[54] Hermann Ney, et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.

[55] Jonas Kuhn, et al. Learning Semantic Correspondences in Technical Documentation, 2017, ACL.

[56] Kevin Duh, et al. DyNet: The Dynamic Neural Network Toolkit, 2017, ArXiv.