Confidence-based Rewriting of Machine Translation Output

Numerous works in Statistical Machine Translation (SMT) have attempted to identify better translation hypotheses from those obtained by an initial decoding, using an improved but more costly scoring function. In this work, we introduce an approach that takes the hypotheses produced by a state-of-the-art, reranked phrase-based SMT system and explores new parts of the search space by applying rewriting rules selected on the basis of posterior phrase-level confidence. In the medical domain, we obtain a 1.9 BLEU improvement over a reranked baseline exploiting the same scoring function, corresponding to a 5.4 BLEU improvement over the original Moses baseline. We show that if an indication of which phrases require rewriting is provided, our automatic rewriting procedure yields an additional improvement of 1.5 BLEU. Various analyses, including a manual error analysis, further illustrate the good performance of our approach and its potential for improvement, in spite of its simplicity.
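To make the idea of confidence-driven rewriting concrete, the following is a minimal sketch and not the system described above. It assumes that phrase-level posterior confidence is estimated from an n-best list (the normalized weight of the hypotheses containing the phrase) and that rewriting is a greedy search that keeps a replacement only when a scoring function improves. The names `phrase_posterior`, `greedy_rewrite`, `rules`, `score`, and the threshold value are all hypothetical and chosen for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Phrase = Tuple[str, ...]
Hyp = Tuple[Sequence[str], float]  # (tokens, model log-score); assumed format


def contains(tokens: Sequence[str], phrase: Phrase) -> bool:
    """Return True if `phrase` occurs contiguously in `tokens`."""
    n = len(phrase)
    return any(tuple(tokens[i:i + n]) == phrase
               for i in range(len(tokens) - n + 1))


def phrase_posterior(phrase: Phrase, nbest: List[Hyp]) -> float:
    """Assumed confidence: share of n-best probability mass containing `phrase`."""
    top = max(s for _, s in nbest)
    weights = [math.exp(s - top) for _, s in nbest]  # softmax over log-scores
    total = sum(weights)
    mass = sum(w for (toks, _), w in zip(nbest, weights) if contains(toks, phrase))
    return mass / total


def greedy_rewrite(hyp: Sequence[str],
                   nbest: List[Hyp],
                   rules: Dict[Phrase, List[Phrase]],
                   score: Callable[[List[str]], float],
                   threshold: float = 0.5,
                   max_len: int = 3,
                   max_iters: int = 100) -> List[str]:
    """Rewrite low-confidence phrases while the (hypothetical) score improves."""
    best = list(hyp)
    best_score = score(best)
    for _ in range(max_iters):
        improved = False
        for n in range(1, max_len + 1):
            for i in range(len(best) - n + 1):
                phrase = tuple(best[i:i + n])
                # Only phrases with a rule and low confidence are candidates.
                if phrase not in rules:
                    continue
                if phrase_posterior(phrase, nbest) >= threshold:
                    continue
                for repl in rules[phrase]:
                    cand = best[:i] + list(repl) + best[i + n:]
                    cand_score = score(cand)
                    if cand_score > best_score:
                        best, best_score, improved = cand, cand_score, True
                        break
                if improved:
                    break
            if improved:
                break
        if not improved:
            break
    return best
```

In this sketch the search restarts after each accepted rewrite, so confidence is always checked against the current hypothesis. A real system would draw `rules` from its phrase table and use its full reranking function as `score`; both are placeholders here.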
