HW-TSC's Participation in the WMT 2020 News Translation Shared Task

This paper presents our work on the WMT 2020 News Translation Shared Task. We participate in three language pairs, Zh/En, Km/En, and Ps/En, in both directions under the constrained condition. We use the standard Transformer-Big model as the baseline and obtain our best performance with two variants that have larger parameter sizes. We perform detailed pre-processing and filtering on the provided large-scale bilingual and monolingual datasets. We train our models with several commonly used strategies, such as back-translation and ensemble knowledge distillation. We also experiment with similar-language augmentation, which yields positive results, although it is not used in our final submission. Our submission obtains competitive results in the final evaluation.
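As a rough illustration of the back-translation step mentioned above, the Python sketch below shows how synthetic parallel data can be generated from monolingual target-side text. The model object and its translate method are hypothetical placeholders under stated assumptions, not the authors' actual pipeline.

    # Minimal back-translation sketch (hypothetical helper names): a trained
    # target-to-source model translates monolingual target-side text, and the
    # resulting synthetic sentence pairs are added to the genuine bitext
    # before retraining the source-to-target model.

    def back_translate(mono_target_sentences, tgt2src_model):
        """Turn monolingual target-side text into synthetic (src, tgt) pairs."""
        synthetic_pairs = []
        for tgt in mono_target_sentences:
            # tgt2src_model.translate stands in for any reverse-direction NMT
            # system (e.g. an En->Zh model used to augment the Zh->En
            # direction with monolingual English data).
            src = tgt2src_model.translate(tgt)
            synthetic_pairs.append((src, tgt))
        return synthetic_pairs

    # The forward model is then trained on the concatenation of the real
    # bitext and the synthetic pairs: real_pairs + synthetic_pairs.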
