Multilingual AMR Parsing with Noisy Knowledge Distillation

We study multilingual AMR parsing from the perspective of knowledge distillation, where the aim is to learn and improve a multilingual AMR parser by using an existing English parser as its teacher. We constrain our exploration to a strict multilingual setting: a single model parses all languages, including English. We identify that noisy input and precise output are key to successful distillation. Together with extensive pre-training, we obtain an AMR parser whose performance surpasses all previously published results on four foreign languages (German, Spanish, Italian, and Chinese) by large margins (up to 18.8 SMATCH points on Chinese and 11.3 SMATCH points on average). Our parser also achieves performance on English comparable to that of the latest state-of-the-art English-only parser.
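To make the recipe concrete, below is a minimal sketch of sequence-level knowledge distillation under one plausible reading of the abstract: "precise output" means AMR graphs produced by the English teacher from clean English sentences, and "noisy input" means machine translations of those sentences into the target languages, so that one student model is trained on all languages at once. The names used here (teacher.parse, translate, student.loss) are hypothetical placeholders, not the authors' implementation or any specific library API.

```python
# Hypothetical sketch: distilling an English AMR teacher into one
# multilingual student via (noisy input, precise output) pairs.

def build_distillation_pairs(english_sentences, teacher, translate, target_langs):
    """Return (input sentence, target AMR) training pairs for the student."""
    pairs = []
    for sent in english_sentences:
        amr = teacher.parse(sent)      # "precise output": parsed from clean English
        pairs.append((sent, amr))      # keep English so the single model covers it too
        for lang in target_langs:
            noisy_sent = translate(sent, lang)  # "noisy input": machine translation
            pairs.append((noisy_sent, amr))     # same target graph, noisy source text
    return pairs


def train_student(student, pairs, optimizer, epochs=1):
    """Standard sequence-to-sequence training on the distilled pairs."""
    for _ in range(epochs):
        for sentence, amr in pairs:
            loss = student.loss(sentence, amr)  # cross-entropy on the linearized AMR
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    return student
```

In this reading, the teacher is only ever run on English, so the quality of the target graphs does not degrade with translation quality; the noise is deliberately confined to the input side, which is what the abstract identifies as important for distillation to work.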
