Segment Choice Models: Feature-Rich Models for Global Distortion in Statistical Machine Translation

This paper presents a new approach to distortion (phrase reordering) in phrase-based machine translation (MT). Distortion is modeled as a sequence of choices made during translation. The approach yields trainable, probabilistic distortion models that are global: they assign a probability to each possible phrase reordering. These "segment choice" models (SCMs) can be trained on "segment-aligned" sentence pairs and applied during decoding or rescoring. The approach also yields a metric called "distortion perplexity" ("disperp") for comparing SCMs offline on test data, analogous to perplexity for language models. A decision-tree-based SCM is tested on Chinese-to-English translation and outperforms a baseline distortion-penalty approach at the 99% confidence level.
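
The analogy between disperp and language-model perplexity can be made concrete with a small sketch. The abstract does not give the formula, so the code below assumes that disperp is the inverse geometric mean of the probabilities a model assigns to the segment choices actually observed in the test data, and that the probability of a full reordering is the product of its per-choice probabilities. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def reordering_probability(choice_probs):
    """Probability of one complete reordering, assuming it factors into
    the product of the probabilities of its successive segment choices."""
    return math.prod(choice_probs)


def distortion_perplexity(choice_probs):
    """Assumed definition of disperp: exp of the negative mean log
    probability of the observed segment choices (lower is better)."""
    if not choice_probs:
        raise ValueError("need at least one segment choice")
    if any(p <= 0.0 or p > 1.0 for p in choice_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    mean_log_prob = sum(math.log(p) for p in choice_probs) / len(choice_probs)
    return math.exp(-mean_log_prob)


# Hypothetical probabilities that two models assign to the same
# sequence of reference segment choices.
model_a = [0.5, 0.5, 0.5, 0.5]
model_b = [0.25, 0.25, 0.25, 0.25]

print(distortion_perplexity(model_a))
print(distortion_perplexity(model_b))
```

Under this assumed definition, a model that always picks uniformly among k remaining segments has a disperp of k, so the metric can be read as the effective number of equally likely segment choices the model faces at each step.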
