YANMTT: Yet Another Neural Machine Translation Toolkit

In this paper we present our open-source neural machine translation (NMT) toolkit, "Yet Another Neural Machine Translation Toolkit" (YANMTT), which is built on top of the Transformers library. Despite the growing importance of sequence-to-sequence pre-training, there are surprisingly few, if any, well-established toolkits that allow users to easily perform pre-training. Toolkits such as Fairseq, which do support pre-training, have very large codebases and are therefore not beginner friendly. With regard to transfer learning via fine-tuning, most toolkits do not give the user explicit control over which parts of a pre-trained model are transferred. YANMTT aims to address these issues with a minimal amount of code for pre-training large-scale NMT models, selectively transferring and fine-tuning pre-trained parameters, performing translation, and extracting representations and attentions for visualization and analysis. Apart from these core features, our toolkit also provides other advanced functionality such as, but not limited to, document-level/multi-source NMT, simultaneous NMT, and model compression via distillation, which we believe are relevant to the purpose behind our toolkit.
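To make the notion of selective transfer of pre-trained parameters concrete, the following is a minimal sketch using the HuggingFace Transformers API with an mBART-style checkpoint. It is not YANMTT's actual interface; the checkpoint name and the choice of transferring only encoder parameters are illustrative assumptions.

```python
# Minimal sketch (not YANMTT's API): selectively transfer only the encoder
# weights of a pre-trained mBART-style model into a freshly initialized model
# before fine-tuning. Checkpoint name and parameter selection are assumptions.
from transformers import MBartConfig, MBartForConditionalGeneration

# Pre-trained model to transfer from.
pretrained = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

# Freshly initialized model with the same architecture.
config = MBartConfig.from_pretrained("facebook/mbart-large-cc25")
fresh = MBartForConditionalGeneration(config)

# Copy only encoder parameters; the decoder remains randomly initialized
# and is trained from scratch during fine-tuning.
fresh_state = fresh.state_dict()
for name, tensor in pretrained.state_dict().items():
    if name.startswith("model.encoder."):
        fresh_state[name] = tensor
fresh.load_state_dict(fresh_state)
```

The same pattern extends to any subset of parameters (embeddings, specific layers, attention sub-modules) by changing the name filter, which is the kind of fine-grained control over transfer that the toolkit is intended to expose.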
