Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?

The phrase-based and N-gram-based SMT frameworks complement each other. While the former is better able to memorize, the latter provides a more principled model that captures dependencies across phrasal boundaries. Some work has been done to combine insights from these two frameworks. A recent successful attempt showed the advantage of using phrase-based search on top of an N-gram-based model. We probe this question in the reverse direction by investigating whether integrating N-gram-based translation and reordering models into a phrase-based decoder helps overcome the problematic phrasal independence assumption. A large-scale evaluation over 8 language pairs shows that performance does significantly improve.
