Sogou Neural Machine Translation Systems for WMT17

We describe the Sogou neural machine translation systems for the WMT 2017 Chinese↔English news translation tasks. Our systems are based on a multilayer encoder-decoder architecture with an attention mechanism. The best translations are obtained by combining model ensembling with reranking. We also propose an approach to improve named entity translation. Our Chinese→English system achieved the highest cased BLEU among all 20 submitted systems, and our English→Chinese system ranked third out of 16 submitted systems.
