Positive Diversity Tuning for Machine Translation System Combination

We present Positive Diversity Tuning, a new method for tuning machine translation models specifically for improved performance during system combination. System combination gains are often limited because the translations produced by the different component systems are too similar to each other. We propose a method for reducing excess cross-system similarity by optimizing a joint objective that rewards models for producing translations that are similar to reference translations, while penalizing them for producing translations that are too similar to those of other systems. The Positive Diversity objective is easy to implement and can be integrated quickly into most machine translation tuning pipelines. We find that individual systems tuned to Positive Diversity on the same data can be even more diverse than systems built using different data sets, while still obtaining good BLEU scores. When these individual systems are used together for system combination, our approach yields significant gains of 0.8 BLEU, even when the combination is performed using a small number of otherwise identical individual systems.
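As a rough illustration of the joint objective described above, the following sketch scores a candidate translation by its similarity to the references minus its similarity to other systems' outputs. It is a minimal sketch under stated assumptions, not the implementation from the paper: the interpolation weight `alpha`, the function names, and the use of averaged clipped n-gram precision as a simple stand-in for BLEU are all assumptions introduced here for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(hyp, comparisons, max_n=2):
    """Average clipped n-gram precision of `hyp` against a list of strings.

    Assumption: this is a simplified stand-in for BLEU (no brevity penalty,
    arithmetic rather than geometric mean), used only to keep the sketch short.
    """
    hyp_tokens = hyp.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp_tokens, n)
        total = sum(hyp_counts.values())
        if total == 0:
            continue
        # Clip each n-gram count by its maximum count in any comparison string.
        max_counts = Counter()
        for text in comparisons:
            for gram, count in ngrams(text.split(), n).items():
                max_counts[gram] = max(max_counts[gram], count)
        matched = sum(min(c, max_counts[g]) for g, c in hyp_counts.items())
        precisions.append(matched / total)
    return sum(precisions) / len(precisions) if precisions else 0.0


def positive_diversity(hyp, references, other_outputs, alpha=0.5):
    """Hypothetical joint score: reward reference similarity, penalize
    similarity to other systems. `alpha` is an assumed trade-off weight."""
    reward = ngram_precision(hyp, references)
    penalty = ngram_precision(hyp, other_outputs)
    return alpha * reward - (1.0 - alpha) * penalty


references = ["a cat sat on the mat"]
other_outputs = ["the cat sat on the mat"]

# A candidate that matches the reference but differs from the other system
# should score higher than one that simply copies the other system.
diverse_score = positive_diversity("a cat sat on the mat", references, other_outputs)
copy_score = positive_diversity("the cat sat on the mat", references, other_outputs)
print(diverse_score > copy_score)
```

With `alpha=1.0` the penalty term vanishes and the score reduces to plain reference similarity, which corresponds to tuning without any diversity pressure. In a real tuning pipeline, the similarity function would presumably be the same metric the tuner already optimizes.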
