Maximum Entropy Based Lexical Reordering Model for Hierarchical Phrase-based Machine Translation

The hierarchical phrase-based (HPB) model, which is built on a synchronous context-free grammar (SCFG), is effective at handling global reorderings. However, the HPB model does not adequately constrain the reordering process, so incorrect SCFG rules sometimes switch the positions of words. In this paper, we treat the relative order of two words as a classification problem and propose a novel lexical reordering model based on a maximum entropy classifier. Our model uses word alignment and translation information during decoding. Experimental results on a Chinese-to-English task show that our method significantly outperforms the baseline system in BLEU score. Moreover, the translation results further demonstrate the effectiveness of our approach.
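
To make the classification view concrete, the following is a minimal sketch of a binary maximum entropy classifier over source word pairs. It is not the paper's implementation: the feature templates in `extract_features`, the rule in `orientation` for deriving labels from alignment positions, the placeholder tokens (`w1` to `w5`), and the plain gradient-ascent trainer are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def orientation(tgt_pos_left, tgt_pos_right):
    """Assumed labelling rule: for two source words in left-to-right order,
    return 1 (swapped) if their aligned target positions are inverted,
    otherwise 0 (monotone)."""
    return 1 if tgt_pos_left > tgt_pos_right else 0


def extract_features(src_left, src_right):
    """Hypothetical indicator features for a source word pair."""
    return [
        f"left={src_left}",
        f"right={src_right}",
        f"pair={src_left}|{src_right}",
    ]


def prob_swap(weights, feats):
    """P(swapped | features) under a binary maximum entropy model."""
    score = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-score))


def train_maxent(examples, epochs=200, lr=0.5, l2=0.01):
    """Fit weights by batch gradient ascent on the L2-regularised
    log-likelihood. `examples` is a list of (features, label) pairs."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, label in examples:
            p = prob_swap(weights, feats)
            for f in feats:
                grad[f] += label - p  # observed minus expected count
        for f, g in grad.items():
            weights[f] += lr * (g / len(examples) - l2 * weights[f])
    return weights


# Toy, invented training data: (left word, right word, aligned target positions).
TOY_PAIRS = [
    ("w1", "w2", 3, 1),
    ("w1", "w3", 2, 0),
    ("w4", "w2", 0, 1),
    ("w4", "w5", 1, 2),
]
TOY_EXAMPLES = [
    (extract_features(left, right), orientation(a_left, a_right))
    for left, right, a_left, a_right in TOY_PAIRS
]

if __name__ == "__main__":
    model = train_maxent(TOY_EXAMPLES)
    for left, right, _, _ in TOY_PAIRS:
        p = prob_swap(model, extract_features(left, right))
        print(f"{left} {right}: P(swapped) = {p:.3f}")
```

In a decoder, a probability of this kind could be added as one more feature in the log-linear model, scored whenever a hypothesis fixes the target-side order of two aligned source words. That integration is likewise an assumption here, not a detail stated in the abstract.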
