Continuous Space Translation Models for Phrase-Based Statistical Machine Translation

This paper presents a new approach to estimating the translation model probabilities of a phrase-based statistical machine translation system. We use neural networks to directly learn the translation probability of phrase pairs from continuous representations. The system can easily be trained on the same data used to build standard phrase-based systems. We provide experimental evidence that the approach can infer meaningful translation probabilities for phrase pairs not seen in the training data, and can even predict a list of the most likely translations for a given source phrase. The approach can be used to rescore n-best lists, but we also discuss an integration into the Moses decoder. A preliminary evaluation on the English/French IWSLT task achieved improvements in BLEU score, and a human analysis showed that the new model often chooses semantically better translations. Several extensions of this work are discussed.
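To make the core idea concrete, the following is a minimal sketch (not the paper's actual architecture or data) of a continuous-space phrase translation model: a source phrase is mapped to a continuous vector via word embeddings, and a small feed-forward network with a softmax output assigns a probability to each candidate target phrase. All vocabularies, dimensions, and weights here are illustrative and randomly initialized; a real system would train them on bilingual phrase pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabularies (illustrative only, not from the paper).
src_vocab = {"the": 0, "house": 1, "blue": 2}
tgt_phrases = {"la maison": 0, "maison bleue": 1, "le": 2}

D, H = 8, 16  # embedding and hidden-layer sizes (illustrative)
E = rng.normal(0, 0.1, (len(src_vocab), D))   # continuous word representations
W1 = rng.normal(0, 0.1, (D, H))               # input-to-hidden weights
W2 = rng.normal(0, 0.1, (H, len(tgt_phrases)))  # hidden-to-output weights

def translation_probs(src_phrase):
    """Return a probability distribution over candidate target phrases.

    The source phrase is mapped to a continuous vector (here simply the
    mean of its word embeddings), passed through one tanh hidden layer,
    and a softmax over the output layer yields translation probabilities.
    """
    ids = [src_vocab[w] for w in src_phrase.split()]
    x = E[ids].mean(axis=0)          # continuous phrase representation
    h = np.tanh(x @ W1)
    logits = h @ W2
    p = np.exp(logits - logits.max())  # numerically stable softmax
    return p / p.sum()

probs = translation_probs("blue house")
best = max(tgt_phrases, key=lambda t: probs[tgt_phrases[t]])
```

Because the scores come from continuous representations rather than counts, the model can assign a non-zero, graded probability to phrase pairs never observed during training, which is what allows it to generalize beyond the extracted phrase table.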
