BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer

Adapter modules enable modular and efficient zero-shot cross-lingual transfer, where current state-of-the-art adapter-based approaches learn specialized language adapters (LAs) for individual languages. In this work, we show that it is more effective to learn bilingual language pair adapters (BAs) when the goal is to optimize performance for a particular source-target transfer direction. Our novel BAD-X adapter framework trades off some of the modularity of dedicated LAs for improved transfer performance: we demonstrate consistent gains on three standard downstream tasks and for the majority of the evaluated low-resource languages.
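
To make the adapter-stacking idea concrete, the following is a minimal sketch, assuming PyTorch and purely illustrative names (BottleneckAdapter, BadXLayer are not taken from the paper's code). It shows a standard bottleneck adapter and how a single bilingual source-target adapter, rather than separate per-language adapters, could be stacked under a task adapter on top of a frozen transformer layer.

```python
# Hypothetical sketch of a BAD-X-style adapter stack; component names and
# hyperparameters are illustrative, not the authors' implementation.
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-projection, non-linearity, up-projection, residual."""

    def __init__(self, hidden_size: int, bottleneck_size: int):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class BadXLayer(nn.Module):
    """Adapter stack for one frozen transformer layer: a single bilingual
    (source-target) language-pair adapter followed by a task adapter."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 48):
        super().__init__()
        self.bilingual_adapter = BottleneckAdapter(hidden_size, bottleneck_size)
        self.task_adapter = BottleneckAdapter(hidden_size, bottleneck_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The bilingual adapter would be trained with masked language modeling
        # on unlabeled text from both languages of the transfer pair; the task
        # adapter is then trained on labeled source-language data while the
        # rest of the model stays frozen.
        hidden_states = self.bilingual_adapter(hidden_states)
        return self.task_adapter(hidden_states)


if __name__ == "__main__":
    layer = BadXLayer()
    dummy = torch.randn(2, 16, 768)  # (batch, sequence length, hidden size)
    print(layer(dummy).shape)  # torch.Size([2, 16, 768])
```

In this reading, the trade-off described above is that one bilingual adapter must be trained per source-target pair instead of reusing a single adapter per language across all pairs, in exchange for a representation specialized to the transfer direction.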
