Multilingual Neural Semantic Parsing for Low-Resourced Languages

Multilingual semantic parsing is a cost-effective method that allows a single model to understand different languages. However, researchers face a large imbalance in the availability of training data: English is resource-rich, while other languages have far less data. To tackle the data limitation problem, we propose using machine translation to bootstrap multilingual training data from the more abundant English data. To compensate for the lower quality of machine-translated training data, we use transfer learning from pretrained multilingual encoders to further improve the model. To evaluate our multilingual models on human-written sentences rather than machine-translated ones, we introduce a new multilingual semantic parsing dataset in English, Italian and Japanese based on the Facebook Task Oriented Parsing (TOP) dataset. We show that joint multilingual training with pretrained encoders substantially outperforms our baselines on the TOP dataset and outperforms the state-of-the-art model on the public NLMaps dataset. We also establish a new baseline for zero-shot learning on the TOP dataset: a semantic parser trained only on English data achieves a zero-shot exact-match accuracy of 44.9% on Italian sentences.
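
As an illustration of the exact-match accuracy metric mentioned above, the following is a minimal sketch, not the evaluation code used in this work. The function name `exact_match_accuracy`, the whitespace normalization, and the bracketed example parses are assumptions made for illustration; the actual evaluation may normalize or compare parses differently.

```python
def exact_match_accuracy(predictions, references):
    """Return the fraction of predicted parses identical to their reference parse."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def normalize(parse):
        # Assumption: collapse runs of whitespace so spacing differences
        # alone are not counted as errors.
        return " ".join(parse.split())

    matches = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Hypothetical bracketed parses, made up for this example.
references = [
    "[IN:GET_DIRECTIONS directions to [SL:DESTINATION the airport ] ]",
    "[IN:GET_EVENT concerts [SL:DATE_TIME tonight ] ]",
]
predictions = [
    "[IN:GET_DIRECTIONS  directions to [SL:DESTINATION the airport ] ]",
    "[IN:GET_EVENT concerts tonight ]",
]
score = exact_match_accuracy(predictions, references)
```

Under this definition a prediction receives credit only if the whole parse matches, so a single wrong slot or intent label makes the example count as an error.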
