Translation Prediction with Source Dependency-Based Context Representation

Learning context representations is very promising to improve translation results, particularly through neural networks. Previous efforts process the context words sequentially and neglect their internal syntactic structure. In this paper, we propose a novel neural network based on bi-convolutional architecture to represent the source dependency-based context for translation prediction. The proposed model is able to not only encode the long-distance dependencies but also capture the functional similarities for better translation prediction (i.e., ambiguous words translation and word forms translation). Examined by a largescale Chinese-English translation task, the proposed approach achieves a significant improvement (of up to +1.9 BLEU points) over the baseline system, and meanwhile outperforms a number of context-enhanced comparison system.

[1]  Jianfeng Gao,et al.  Minimum Translation Modeling with Recurrent Neural Networks , 2014, EACL.

[2]  Marine Carpuat,et al.  Improving Statistical Machine Translation Using Word Sense Disambiguation , 2007, EMNLP.

[3]  Christopher D. Manning,et al.  A Simple and Effective Hierarchical Phrase Reordering Model , 2008, EMNLP.

[4]  Yoshua Bengio,et al.  A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..

[5]  Jan Niehues,et al.  Wider Context by Using Bilingual Language Models in Machine Translation , 2011, WMT@EMNLP.

[6]  Nadir Durrani,et al.  Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT? , 2013, ACL.

[7]  Christof Monz,et al.  Dependency-Based Bilingual Language Models for Reordering in Statistical Machine Translation , 2014, EMNLP.

[8]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[9]  Vysoké Učení,et al.  Statistical Language Models Based on Neural Networks , 2012 .

[10]  François Yvon,et al.  Improving Reordering with Linguistically Informed Bilingual n-grams , 2010, COLING.

[11]  Christopher D. Manning,et al.  Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.

[12]  Qun Liu,et al.  Encoding Source Language with Convolutional Neural Network for Machine Translation , 2015, ACL.

[13]  Omer Levy,et al.  Dependency-Based Word Embeddings , 2014, ACL.

[14]  Colin Cherry An Empirical Evaluation of Noise Contrastive Estimation for the Neural Network Joint Model of Translation , 2016, HLT-NAACL.

[15]  Geoffrey Zweig,et al.  Joint Language and Translation Modeling with Recurrent Neural Networks , 2013, EMNLP.

[16]  Yoshimasa Tsuruoka,et al.  Tree-to-Sequence Attentional Neural Machine Translation , 2016, ACL.

[17]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[18]  Hermann Ney,et al.  A Comparison between Count and Neural Network Models Based on Joint Translation and Reordering Sequences , 2015, EMNLP.

[19]  Lemao Liu,et al.  Neural Machine Translation with Supervised Attention , 2016, COLING.

[20]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[21]  Jinxi Xu,et al.  A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model , 2008, ACL.

[22]  Ashish Vaswani,et al.  Decoding with Large-Scale Neural Language Models Improves Translation , 2013, EMNLP.

[23]  Lemao Liu,et al.  Additive Neural Networks for Statistical Machine Translation , 2013, ACL.

[24]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[25]  Jie Hao,et al.  Local Translation Prediction with Global Sentence Representation , 2015, IJCAI.

[26]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[27]  Daniel Jurafsky,et al.  Discriminative Reordering with Chinese Grammatical Relations Features , 2009, SSST@HLT-NAACL.

[28]  Philipp Koehn,et al.  Statistical Significance Tests for Machine Translation Evaluation , 2004, EMNLP.

[29]  Hwee Tou Ng,et al.  Word Sense Disambiguation Improves Statistical Machine Translation , 2007, ACL.

[30]  Nadir Durrani,et al.  A Joint Sequence Translation Model with Integrated Reordering , 2011, ACL.

[31]  Alexandre Allauzen,et al.  Continuous Space Translation Models with Neural Networks , 2012, NAACL.

[32]  Richard M. Schwartz,et al.  Fast and Robust Neural Network Joint Models for Statistical Machine Translation , 2014, ACL.

[33]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[34]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[35]  Philipp Koehn,et al.  Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.