Hypothesis Mixture Decoding for Statistical Machine Translation

This paper presents hypothesis mixture decoding (HM decoding), a new decoding scheme that performs translation reconstruction using hypotheses generated by multiple translation systems. HM decoding involves two decoding stages: first, each component system decodes independently, with the explored search space kept for use in the next stage; second, a new search space is constructed by composing existing hypotheses produced by all component systems using a set of rules provided by the HM decoder itself, and a new set of model-independent features is used to seek the final best translation from this new search space. Our approach makes few assumptions about the underlying component systems, enabling us to leverage SMT models based on arbitrary paradigms. We compare our approach with several related techniques, and demonstrate significant BLEU improvements in large-scale Chinese-to-English translation tasks.
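The two-stage scheme described above can be sketched in simplified form. This is an illustrative sketch only, not the paper's actual decoder: real HM decoding composes partial hypotheses under the decoder's own rule set over full search spaces, whereas here each "search space" is reduced to an n-best list, and the component systems and model-independent features are hypothetical toy examples.

```python
# Sketch of two-stage hypothesis mixture (HM) decoding, heavily simplified.
# Assumption: each component system yields an n-best list of
# (translation, model score) pairs standing in for its search space.

def stage_one(systems, sentence):
    """Stage 1: each component system decodes the input independently,
    keeping its explored search space (here, an n-best list)."""
    return [system(sentence) for system in systems]

def stage_two(search_spaces, features, weights):
    """Stage 2: pool hypotheses from all systems into one search space and
    rescore with model-independent features to pick the final translation."""
    pool = {hyp for space in search_spaces for (hyp, _) in space}
    def score(hyp):
        return sum(w * f(hyp) for f, w in zip(features, weights))
    return max(pool, key=score)

# Toy component "systems" (hypothetical), each returning an n-best list.
sys_a = lambda s: [("the cat sat", -1.0), ("cat the sat", -2.5)]
sys_b = lambda s: [("the cat sat down", -1.2)]

# Hypothetical model-independent features: a length penalty, word-level
# consensus, and bigram consensus against a fixed toy vocabulary.
feats = [
    lambda h: -abs(len(h.split()) - 3),
    lambda h: sum(h.count(w) for w in ("the", "cat", "sat")),
    lambda h: h.count("the cat"),
]
best = stage_two(stage_one([sys_a, sys_b], "src"), feats, [1.0, 1.0, 1.0])
print(best)  # -> "the cat sat"
```

Because the second stage scores hypotheses only with features that do not depend on any component model's internals, the scheme works regardless of the paradigms (phrase-based, hierarchical, syntax-based) of the underlying systems.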
