Improving Machine Translation Quality Estimation with Neural Network Features

Machine translation quality estimation is a challenging task in the WMT evaluation campaign. Feature extraction plays an important role in automatic quality estimation. In this paper, we propose neural network features, including embedding features and cross-entropy features of source sentences and machine translations, to improve machine translation quality estimation. The sentence embedding features are obtained by global average pooling over word embeddings trained with the word2vec toolkit, while the sentence cross-entropy features are computed by a recurrent neural network language model. Experimental results on the development set of the WMT17 machine translation quality estimation tasks show that the neural network features yield significant improvements over the baseline. Furthermore, combining the neural network features with the baseline features improves system performance further.
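
The two feature types described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the handling of out-of-vocabulary words (skipped here), the use of natural logarithms, and the normalization by sentence length are all assumptions made for illustration. In the paper the word vectors come from word2vec and the token probabilities from a recurrent neural network language model; here both are passed in as plain Python values.

```python
import math
from typing import Dict, List, Sequence


def sentence_embedding(tokens: Sequence[str],
                       word_vectors: Dict[str, List[float]],
                       dim: int) -> List[float]:
    """Global average pooling: mean of the word vectors of the known tokens.

    Assumption: out-of-vocabulary tokens are skipped, and a sentence with no
    known tokens maps to the zero vector.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def sentence_cross_entropy(token_probs: Sequence[float]) -> float:
    """Average negative log-probability per token, in nats.

    Assumption: token_probs holds the language model probability of each
    token given its history; a real system would obtain these from the
    trained language model.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


# Toy values, made up for illustration only.
toy_vectors = {"the": [1.0, 0.0], "cat": [0.0, 1.0]}
embedding = sentence_embedding(["the", "cat", "sat"], toy_vectors, dim=2)
entropy = sentence_cross_entropy([0.5, 0.5])
```

Under these assumptions, the embedding and cross-entropy of the source sentence and of its machine translation would be concatenated into one feature vector and passed to the quality estimation regressor, optionally together with the baseline features.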
