Paper information - Memory-augmented Chinese-Uyghur neural machine translation

Neural machine translation (NMT) has achieved notable performance recently. However, this approach has not been widely applied to translation between Chinese and Uyghur, partly because parallel data are limited and partly because the agglutinative nature of Uyghur produces a large proportion of rare words. In this paper, we collect approximately 200,000 sentence pairs and show that with this medium-scale dataset, an attention-based NMT can perform very well on Chinese-Uyghur and Uyghur-Chinese translation. To tackle rare words, we propose a novel memory structure to assist NMT inference. Our experiments demonstrate that the memory-augmented NMT (M-NMT) outperforms both the vanilla NMT and phrase-based statistical machine translation (SMT). Interestingly, the memory structure also provides an elegant way to deal with out-of-vocabulary words.
