Maximum Entropy Translation Model in Dependency-Based MT Framework

The maximum entropy principle has been used successfully in various NLP tasks. In this paper we propose a forward translation model consisting of a set of maximum entropy classifiers: a separate classifier is trained for each sufficiently frequent source-side lemma. In this way, the estimates of translation probabilities can be sensitive to a large number of features derived from the source sentence, including non-local features and features that make use of the syntactic structure of the sentence. When integrated into the English-to-Czech dependency-based translation scenario implemented in the TectoMT framework, the new translation model significantly outperforms the baseline model, which is based on maximum likelihood estimates (MLE), in terms of BLEU. The performance is further boosted in a configuration inspired by Hidden Tree Markov Models, which combines the maximum entropy translation model with the target-language dependency tree model.
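The idea of one classifier per source-side lemma can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from TectoMT: the class `MaxEntClassifier`, the function `train_translation_model`, the `min_count` threshold, the feature strings, the toy data, and the plain gradient-ascent training loop.

```python
import math
from collections import defaultdict


class MaxEntClassifier:
    """Tiny maximum entropy (multinomial logistic) model over binary string features.

    Hypothetical illustration only; not the implementation used in the paper.
    """

    def __init__(self, labels):
        self.labels = sorted(labels)
        # One weight per (target lemma, feature) pair, starting at 0.0.
        self.weights = {y: defaultdict(float) for y in self.labels}

    def predict_proba(self, features):
        # Score each target lemma by summing the weights of the active features.
        scores = {y: sum(self.weights[y][f] for f in features) for y in self.labels}
        top = max(scores.values())  # subtract the maximum for numerical stability
        exps = {y: math.exp(s - top) for y, s in scores.items()}
        z = sum(exps.values())
        return {y: e / z for y, e in exps.items()}

    def fit(self, examples, epochs=200, lr=0.5, l2=0.01):
        # Stochastic gradient ascent on the L2-regularized log-likelihood.
        for _ in range(epochs):
            for features, gold in examples:
                probs = self.predict_proba(features)
                for y in self.labels:
                    error = (1.0 if y == gold else 0.0) - probs[y]
                    for f in features:
                        w = self.weights[y][f]
                        self.weights[y][f] = w + lr * (error - l2 * w)
        return self


def train_translation_model(data, min_count=3):
    """Train a separate classifier for each sufficiently frequent source lemma."""
    by_lemma = defaultdict(list)
    for source_lemma, features, target_lemma in data:
        by_lemma[source_lemma].append((features, target_lemma))
    models = {}
    for source_lemma, examples in by_lemma.items():
        if len(examples) >= min_count:  # rarer lemmas get no classifier here
            labels = {target for _, target in examples}
            models[source_lemma] = MaxEntClassifier(labels).fit(examples)
    return models


# Toy training data: (source lemma, features from the source tree, target lemma).
toy_data = [
    ("bank", ["parent_lemma=deposit", "child_lemma=money"], "banka"),
    ("bank", ["parent_lemma=rob"], "banka"),
    ("bank", ["parent_lemma=sit", "child_lemma=river"], "břeh"),
    ("bank", ["parent_lemma=walk", "child_lemma=river"], "břeh"),
    ("rare", ["parent_lemma=bird"], "vzácný"),
]

models = train_translation_model(toy_data, min_count=3)
probs = models["bank"].predict_proba(["child_lemma=river"])
print(probs)
```

In this sketch the lemma "rare" occurs only once, so no classifier is trained for it; a full system would need some fallback for such lemmas, which is left out here. For "bank", the feature `child_lemma=river` shifts probability mass toward "břeh", which shows how source-side context can influence the estimated translation probabilities.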
