mPMR: A Multilingual Pre-trained Machine Reader at Scale

We present the multilingual Pre-trained Machine Reader (mPMR), a novel method for multilingual machine reading comprehension (MRC)-style pre-training. mPMR guides multilingual pre-trained language models (mPLMs) to perform natural language understanding (NLU), covering both sequence classification and span extraction, in multiple languages. When only source-language fine-tuning data is available, existing mPLMs achieve cross-lingual generalization solely by transferring NLU capability from the source language to target languages. In contrast, mPMR lets downstream tasks directly inherit multilingual NLU capability from MRC-style pre-training, and therefore acquires stronger NLU capability in the target languages. mPMR also provides a unified solver for cross-lingual span extraction and sequence classification, which makes it possible to extract rationales that explain its sentence-pair classification decisions.
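To make the unified MRC formulation concrete, the sketch below shows how a single span-scoring head can serve both task types: extraction tasks read out the best span in the context, while classification tasks verbalize each label as a query and pick the label whose query yields the strongest span, with that span doubling as a rationale. This is a minimal illustration under stated assumptions, not the released mPMR implementation: `xlm-roberta-base` stands in for the mPLM, the `MRCSpanHead` module and the label verbalizations are hypothetical, and the head is untrained here (mPMR would pre-train it with MRC-style data before any fine-tuning).

```python
# Minimal sketch of an MRC-style unified solver; names and conventions
# here are illustrative assumptions, not the authors' released code.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MRCSpanHead(nn.Module):
    """Scores every (start, end) token pair as a candidate answer span."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.start_proj = nn.Linear(hidden_size, hidden_size)
        self.end_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_size)
        s = self.start_proj(hidden)
        e = self.end_proj(hidden)
        # Span score S[b, i, j] for the span running from token i to token j.
        return torch.einsum("bih,bjh->bij", s, e)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
# Untrained here; mPMR would pre-train this head MRC-style.
head = MRCSpanHead(encoder.config.hidden_size)

@torch.no_grad()
def best_span(query: str, context: str) -> tuple[float, str]:
    """Encode the (query, context) pair and return the top-scoring span."""
    inputs = tokenizer(query, context, return_tensors="pt")
    scores = head(encoder(**inputs).last_hidden_state).squeeze(0)
    # Keep only valid spans (start <= end); a real implementation would
    # also restrict candidates to context tokens rather than the full input.
    scores = scores.masked_fill(torch.ones_like(scores).tril(-1).bool(), -1e9)
    i, j = divmod(scores.argmax().item(), scores.size(1))
    span = tokenizer.decode(inputs["input_ids"][0, i : j + 1])
    return scores[i, j].item(), span

# Span extraction (e.g., NER or extractive QA): the query names what to
# extract, and the best-scoring span is the prediction.
print(best_span("Which organization is mentioned?",
                "Alice joined UNESCO in Paris last year."))

# Sequence classification as extraction: verbalize each label as a query,
# choose the label whose query yields the highest span score, and surface
# the winning span as the rationale for the decision.
labels = {
    "paraphrase": "The two sentences express the same meaning.",
    "not_paraphrase": "The two sentences express different meanings.",
}
pair = "It is raining heavily. Heavy rain is falling."
pred = max(labels, key=lambda l: best_span(labels[l], pair)[0])
print(pred, best_span(labels[pred], pair)[1])
```

Because both task types share one set of span-scoring parameters under this formulation, the spans scored for the winning label can double as rationales for a sentence-pair classification decision, which is the behavior the abstract describes.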
