[1] Yike Guo, et al. Regularizing Deep Multi-Task Networks using Orthogonal Gradients, 2019, arXiv.
[2] Deniz Yuret, et al. Transfer Learning for Low-Resource Neural Machine Translation, 2016, EMNLP.
[3] Rico Sennrich, et al. Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation, 2020, ACL.
[4] Feifei Zhai, et al. A Compact and Language-Sensitive Multilingual Translation Method, 2019, ACL.
[5] Pushpak Bhattacharyya, et al. Multilingual Unsupervised NMT using Shared Encoder and Language-Specific Decoders, 2019, ACL.
[6] Graham Neubig, et al. Balancing Training for Multilingual Neural Machine Translation, 2020, ACL.
[7] Graham Neubig, et al. Choosing Transfer Languages for Cross-Lingual Learning, 2019, ACL.
[8] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[9] Graham Neubig, et al. Parameter Sharing Methods for Multilingual Self-Attentional Translation Models, 2018, WMT.
[10] Omer Levy, et al. Are Sixteen Heads Really Better than One?, 2019, NeurIPS.
[11] Kevin Knight, et al. Multi-Source Neural Translation, 2016, NAACL.
[12] Veselin Stoyanov, et al. Unsupervised Cross-lingual Representation Learning at Scale, 2020, ACL.
[13] Marjan Ghazvininejad, et al. Multilingual Denoising Pre-training for Neural Machine Translation, 2020, TACL.
[14] Orhan Firat, et al. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding, 2021, ICLR.
[15] Adithya Renduchintala, et al. Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders, 2021, EACL.
[16] André F. T. Martins, et al. Adaptively Sparse Transformers, 2019, EMNLP.
[17] Xian Li, et al. Deep Transformers with Latent Depth, 2020, NeurIPS.
[18] Juan Pino, et al. Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling, 2021, NeurIPS.
[19] Ankur Bapna, et al. Simple, Scalable Adaptation for Neural Machine Translation, 2019, EMNLP.
[20] Sarah L. Nesbeitt. Ethnologue: Languages of the World, 1999.
[21] Yulia Tsvetkov, et al. Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models, 2021, ICLR.
[22] Ankur Bapna, et al. Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation, 2021, ICLR.
[23] Tao Qin, et al. Multilingual Neural Machine Translation with Language Clustering, 2019, EMNLP.
[24] Michael Carbin, et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2019, ICLR.
[25] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2017, ICLR.
[26] Noam Shazeer, et al. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity, 2021, arXiv.
[27] Graham Neubig, et al. When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation?, 2018, NAACL.
[28] Ankur Bapna, et al. Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges, 2019, arXiv.
[29] Sergey Levine, et al. Gradient Surgery for Multi-Task Learning, 2020, NeurIPS.
[30] Matthias Gallé, et al. Language Adapters for Zero-Shot Neural Machine Translation, 2020, EMNLP.
[31] Chenhui Chu, et al. A Survey of Multilingual Neural Machine Translation, 2019, ACM Comput. Surv.