Encoding Source Language with Convolutional Neural Network for Machine Translation

The recently proposed neural network joint model (NNJM) (Devlin et al., 2014) augments the n-gram target language model with a heuristically chosen source context window, achieving state-of-the-art performance in SMT. In this paper, we give a more systematic treatment by summarizing the relevant source information through a convolutional architecture guided by the target information. With different guiding signals during decoding, our specifically designed convolution+gating architectures can pinpoint the parts of a source sentence that are relevant to predicting a target word, and fuse them with the context of the entire source sentence to form a unified representation. This representation, together with target language words, is fed to a deep neural network (DNN) to form a stronger NNJM. Experiments on two NIST Chinese-English translation tasks show that the proposed model can achieve significant improvements over the previous NNJM, by up to +1.08 BLEU points on average.
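
To make the architecture concrete, the sketch below shows one way a target-guided convolution + gating encoder and an NNJM-style scorer could fit together. It is a minimal NumPy illustration and not the authors' implementation: the layer sizes, the softmax gating over convolution windows, the mean-pooled global context, and all function and parameter names are assumptions made for clarity.

```python
# Illustrative sketch (not the paper's code): a target-guided convolution +
# gating summary of the source sentence, fed with a target n-gram into a
# small DNN that scores the next target word. All sizes/names are assumed.
import numpy as np

rng = np.random.default_rng(0)

def conv1d(X, W, b):
    """Valid 1-D convolution over a sentence matrix X of shape (len, d_in)."""
    k, d_in, d_out = W.shape
    n = X.shape[0] - k + 1
    out = np.empty((n, d_out))
    for i in range(n):
        out[i] = np.tanh(np.tensordot(X[i:i + k], W, axes=([0, 1], [0, 1])) + b)
    return out

def target_guided_encode(src_emb, tgt_ctx, params):
    """Summarize the source sentence, gated by the target-side context."""
    H = conv1d(src_emb, params["Wc"], params["bc"])      # local source features
    scores = H @ params["Wg"] @ tgt_ctx                  # relevance of each window
    gate = np.exp(scores - scores.max())
    gate /= gate.sum()                                   # softmax gate over windows
    local = gate @ H                                      # target-relevant summary
    globl = H.mean(axis=0)                                # whole-sentence context
    return np.concatenate([local, globl])                 # unified representation

def nnjm_score(src_repr, tgt_ngram_emb, params):
    """Feed the source representation plus the target n-gram into a small DNN."""
    x = np.concatenate([src_repr, tgt_ngram_emb.ravel()])
    h = np.tanh(params["W1"] @ x + params["b1"])
    logits = params["W2"] @ h + params["b2"]
    return logits - np.logaddexp.reduce(logits)           # log-probabilities

# Toy dimensions (assumed): embedding 8, conv width 3, 16 filters, vocab 50.
d, k, f, V, ctx = 8, 3, 16, 50, 3
params = {
    "Wc": rng.normal(scale=0.1, size=(k, d, f)), "bc": np.zeros(f),
    "Wg": rng.normal(scale=0.1, size=(f, d)),
    "W1": rng.normal(scale=0.1, size=(32, 2 * f + ctx * d)), "b1": np.zeros(32),
    "W2": rng.normal(scale=0.1, size=(V, 32)), "b2": np.zeros(V),
}
src_emb = rng.normal(size=(10, d))     # 10 source words
tgt_ctx = rng.normal(size=d)           # embedding of the current target-side context
tgt_ngram = rng.normal(size=(ctx, d))  # previous 3 target words
log_p = nnjm_score(target_guided_encode(src_emb, tgt_ctx, params), tgt_ngram, params)
print(log_p.shape)                      # (50,): distribution over the next target word
```

In this toy setup the gate plays the role of the guiding signal: the target context decides which convolved source windows dominate the summary, and that summary is concatenated with the target n-gram before scoring, mirroring the paper's idea of fusing relevant source parts with whole-sentence context.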

[1] Philipp Koehn et al. Clause Restructuring for Statistical Machine Translation, 2005, ACL.

[2] Philipp Koehn et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.

[3] Phil Blunsom et al. Recurrent Continuous Translation Models, 2013, EMNLP.

[4] Phil Blunsom et al. A Convolutional Neural Network for Modelling Sentences, 2014, ACL.

[5] Hermann Ney et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.

[6] Yoshua Bengio et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[7] Hang Li et al. Convolutional Neural Network Architectures for Matching Natural Language Sentences, 2014, NIPS.

[8] Andreas Stolcke et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.

[9] Qun Liu et al. A novel dependency-to-string model for statistical machine translation, 2011, EMNLP.

[10] Jinxi Xu et al. A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model, 2008, ACL.

[11] Yoshua Bengio et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.

[12] Klaus-Robert Müller et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.

[13] Qun Liu et al. Translation with Source Constituency and Dependency Trees, 2013, EMNLP.

[14] Yoshua Bengio et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.

[15] David Chiang et al. Forest Rescoring: Faster Decoding with Integrated Language Models, 2007, ACL.

[16] Richard M. Schwartz et al. Fast and Robust Neural Network Joint Models for Statistical Machine Translation, 2014, ACL.

[17] Daniel Marcu et al. What’s in a translation rule?, 2004, NAACL.

[18] Daniel Marcu et al. Statistical Phrase-Based Translation, 2003, NAACL.

[19] Quoc V. Le et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[20] Hermann Ney et al. Discriminative Training and Maximum Entropy Models for Statistical Machine Translation, 2002, ACL.

[21] Geoffrey Zweig et al. Joint Language and Translation Modeling with Recurrent Neural Networks, 2013, EMNLP.

[22] David Chiang et al. Hierarchical Phrase-Based Translation, 2007, CL.

[23] Franz Josef Och et al. Minimum Error Rate Training in Statistical Machine Translation, 2003, ACL.

[24] Dan Klein et al. Fast Exact Inference with a Factored Model for Natural Language Parsing, 2002, NIPS.