SimpleNets: Quality Estimation with Resource-Light Neural Networks

We introduce SimpleNets: a resource-light solution to the sentence-level Quality Estimation task of WMT16 that combines Recurrent Neural Networks, word embedding models, and the principle of compositionality. The SimpleNets systems explore the idea that the quality of a translation can be derived from the quality of its n-grams, an approach that has previously been employed successfully in Text Simplification quality assessment. Surprisingly, our experiments show that our models can learn more about a translation's quality by focusing on the original sentence than on the translation itself.
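The compositional idea in the abstract, that a sentence-level score can be derived from scores of the sentence's n-grams, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says the systems use Recurrent Neural Networks over word embeddings, whereas here the per-n-gram scorer is a hypothetical placeholder (`toy_scorer`) and the aggregation by simple averaging is an assumption made only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

NGram = Tuple[str, ...]


def extract_ngrams(tokens: Sequence[str], n: int) -> List[NGram]:
    """Return all contiguous n-grams of a token sequence.

    A non-empty sequence shorter than n yields a single shorter n-gram,
    so that every non-empty sentence contributes at least one unit.
    """
    if not tokens:
        return []
    if len(tokens) < n:
        return [tuple(tokens)]
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_quality(
    tokens: Sequence[str],
    n: int,
    score_ngram: Callable[[NGram], float],
) -> float:
    """Compose a sentence-level score as the mean of its n-gram scores.

    Averaging is an assumption of this sketch, not a detail taken
    from the paper.
    """
    ngrams = extract_ngrams(tokens, n)
    if not ngrams:
        return 0.0
    return sum(score_ngram(g) for g in ngrams) / len(ngrams)


def toy_scorer(ngram: NGram) -> float:
    """Hypothetical stand-in for a learned n-gram quality model.

    It returns the mean token length, purely so the example runs.
    """
    return sum(len(tok) for tok in ngram) / len(ngram)


if __name__ == "__main__":
    sentence = "the quality of a translation".split()
    print(extract_ngrams(sentence, 2))
    print(sentence_quality(sentence, 2, toy_scorer))
```

In a real system, `score_ngram` would be replaced by a trained model, and the same function could be applied to either the original sentence or its translation, which is the comparison the abstract reports on.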
