Model Combination for Machine Translation

Machine translation benefits from two types of decoding techniques: consensus decoding over multiple hypotheses under a single model, and system combination over hypotheses from different models. We present model combination, a method that integrates consensus decoding and system combination into a unified, forest-based technique. Our approach makes few assumptions about the underlying component models, enabling us to combine systems with heterogeneous structure. Unlike most system combination techniques, we reuse the search spaces of the component models, which entirely avoids the need to align translation hypotheses. Despite its relative simplicity, model combination improves translation quality over a pipelined approach that first applies consensus decoding to each individual system and then applies system combination to their outputs. We demonstrate BLEU improvements across data sets and language pairs in large-scale experiments.
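To make the combined consensus idea concrete, the sketch below illustrates the general principle in a heavily simplified setting: hypotheses from several models are pooled into one mixture distribution, and the translation with the highest expected similarity to the rest of the pool is selected (Minimum Bayes-Risk style consensus). This is an illustrative assumption-laden toy, not the paper's method: it operates over n-best lists rather than translation forests, and it uses a crude unweighted n-gram match count as a stand-in for a BLEU-based gain function. All function names here are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams of a token sequence.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_overlap(hyp, ref, max_n=4):
    # Crude similarity: total clipped n-gram matches for n = 1..max_n.
    # A stand-in for the BLEU-derived gain used in real MBR decoding.
    tokens_h, tokens_r = hyp.split(), ref.split()
    score = 0
    for n in range(1, max_n + 1):
        score += sum((ngrams(tokens_h, n) & ngrams(tokens_r, n)).values())
    return score

def model_combination(nbest_lists):
    """nbest_lists: one list per model of (hypothesis, score) pairs.

    Pools all hypotheses into a single mixture distribution (uniform
    weight per model, scores normalized within each list), then picks
    the hypothesis maximizing expected gain under that mixture.
    """
    mixture = Counter()
    for nbest in nbest_lists:
        z = sum(p for _, p in nbest) or 1.0
        for hyp, p in nbest:
            mixture[hyp] += p / z / len(nbest_lists)
    # MBR selection: argmax_h  sum_h' P(h') * gain(h, h')
    best, best_score = None, float("-inf")
    for hyp in mixture:
        expected_gain = sum(q * ngram_overlap(hyp, ref)
                            for ref, q in mixture.items())
        if expected_gain > best_score:
            best, best_score = hyp, expected_gain
    return best
```

The design point this toy shares with the paper is that no hypothesis alignment is needed: hypotheses from different models are compared only through the gain function against the pooled distribution, regardless of how each model structures its search space.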
