Cross-lingual Continual Learning

The longstanding goal of multilingual learning has been to develop a universal cross-lingual model that can withstand changes in multilingual data distributions. Considerable work has gone into adapting such multilingual models to unseen target languages. However, most work in this direction focuses on the standard one-hop transfer-learning pipeline from source to target languages, whereas in realistic scenarios new languages can be incorporated at any time, in a sequential manner. In this paper, we present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm, in which we analyze different categories of approaches used to continually adapt to emerging data from different languages. We provide insights into what makes multilingual sequential learning particularly challenging. To surmount these challenges, we benchmark a representative set of cross-lingual continual learning algorithms and analyze their knowledge preservation, accumulation, and generalization capabilities against baselines on carefully curated data streams; a sketch of this evaluation loop is given below. The implications of this analysis include a recipe for how to measure and balance different cross-lingual continual learning desiderata, which go beyond conventional transfer learning.
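The following is a minimal sketch of the kind of sequential evaluation loop the paradigm describes, not the paper's exact protocol. The function and argument names (evaluate_continual_stream, train_on, evaluate) and the specific metric definitions are illustrative assumptions: the model is fine-tuned on one language at a time, and after each step we measure how well it preserves earlier languages (forgetting), accumulates knowledge over the stream, and generalizes zero-shot to languages not yet seen.

```python
# Hypothetical sketch of a cross-lingual continual learning evaluation loop.
# The metric definitions follow common continual learning conventions and are
# assumptions, not the paper's exact formulas.

from typing import Callable, Dict, List


def evaluate_continual_stream(
    languages: List[str],                    # e.g. ["en", "de", "hi", "th"]
    train_on: Callable[[str], None],         # fine-tunes the model on one language
    evaluate: Callable[[str], float],        # returns task accuracy for one language
) -> Dict[str, float]:
    """Sequentially adapt to each language and track the CCL desiderata."""
    n = len(languages)
    # acc[i][j]: accuracy on language j right after training on the i-th language.
    acc = [[0.0] * n for _ in range(n)]

    for i, lang in enumerate(languages):
        train_on(lang)
        for j, eval_lang in enumerate(languages):
            acc[i][j] = evaluate(eval_lang)

    # Knowledge preservation: average drop on a language between the step at
    # which it was learned and the end of the stream (i.e. forgetting).
    forgetting = (
        sum(acc[i][i] - acc[n - 1][i] for i in range(n - 1)) / (n - 1)
        if n > 1 else 0.0
    )

    # Knowledge accumulation: average accuracy over all languages at the end.
    accumulation = sum(acc[n - 1]) / n

    # Generalization: average accuracy, at each step, on languages that have
    # not yet been seen (zero-shot transfer from the languages trained so far).
    unseen = [acc[i][j] for i in range(n) for j in range(i + 1, n)]
    generalization = sum(unseen) / len(unseen) if unseen else 0.0

    return {
        "forgetting": forgetting,
        "final_average_accuracy": accumulation,
        "zero_shot_generalization": generalization,
    }
```

Under this framing, a stronger continual learning method keeps forgetting low without sacrificing final average accuracy or zero-shot generalization; balancing these three quantities is what the desiderata mentioned above refer to.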
