Measuring Machine Translation Errors in New Domains

We develop two techniques for analyzing the effect of porting a machine translation system to a new domain. The first is a macro-level analysis that measures how domain shift affects corpus-level evaluation; the second is a micro-level analysis of word-level errors. We apply these methods to understand what happens when a phrase-based machine translation system trained on parliamentary proceedings is applied in four very different domains: news, medical texts, scientific articles, and movie subtitles. We present quantitative and qualitative experiments that highlight opportunities for future research in domain adaptation for machine translation.
