DiDi's Machine Translation System for WMT2020

This paper describes DiDi AI Labs' submission to the WMT2020 news translation shared task. We participate in the Chinese->English translation direction. We use the Transformer as our baseline model and integrate several techniques to enhance it: data filtering, data selection, back-translation, fine-tuning, model ensembling, and re-ranking. Our submission achieves a BLEU score of $36.6$ on Chinese->English.
