Paper information - Source reordering using MaxEnt classifiers and supertags

Source language reordering can be seen as the preprocessing task of permuting the order of the source words in such a way that the resulting permutation allows as monotone a translation process as possible. We explore a simple but effective source reordering algorithm that works as a cascade of source string transforms, each consisting of swapping the positions of a single pair of adjacent words in order to unfold a candidate pair of crossing alignments. The decision to swap a pair of words is modelled as a binary classification task formulated as a log-linear model and trained under maximum entropy (MaxEnt). We experiment with features that consist of the local neighborhood of both words as well as lexico-syntactic representations known as supertags. Our experiments on the English-to-Dutch EuroParl translation task show that the cascaded alignment unfolding slightly improves the performance of a state-of-the-art phrase translation system that uses distance-based and lexicalized block-oriented reordering.
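The cascade described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the feature templates, the hand-set weights (the paper trains them under MaxEnt), the plain POS tags standing in for supertags, the rule of swapping the most probable pair at each step, and the names `swap_probability`, `pair_features`, and `reorder`.

```python
import math


def swap_probability(weights, features):
    """Binary log-linear model: P(swap | features) = sigmoid(sum of active weights)."""
    score = sum(weights.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-score))


def pair_features(tokens, i):
    """Hypothetical features for the adjacent pair (i, i+1).

    tokens is a list of (word, tag) tuples; the features are the tag pair
    plus the tags of the immediate left and right neighbours.
    """
    left = tokens[i - 1][1] if i > 0 else "<s>"
    right = tokens[i + 2][1] if i + 2 < len(tokens) else "</s>"
    a, b = tokens[i][1], tokens[i + 1][1]
    return [f"pair={a}_{b}", f"left={left}", f"right={right}"]


def reorder(tokens, weights, threshold=0.5, max_steps=100):
    """Apply a cascade of transforms, each swapping one adjacent pair.

    At every step the pair with the highest swap probability above
    `threshold` is swapped; the cascade stops when no pair qualifies
    or after `max_steps` swaps (a guard against oscillation).
    """
    tokens = list(tokens)
    for _ in range(max_steps):
        best_i, best_p = None, threshold
        for i in range(len(tokens) - 1):
            p = swap_probability(weights, pair_features(tokens, i))
            if p > best_p:
                best_i, best_p = i, p
        if best_i is None:
            break
        tokens[best_i], tokens[best_i + 1] = tokens[best_i + 1], tokens[best_i]
    return tokens


# Toy weights (assumed, not learned): favour moving a verb behind a following noun.
toy_weights = {"pair=VERB_NOUN": 2.0, "pair=NOUN_VERB": -2.0}
sentence = [("he", "PRON"), ("reads", "VERB"), ("books", "NOUN")]
reordered = reorder(sentence, toy_weights)
```

In this toy setup the only pair scoring above the threshold is the verb followed by the noun, so the cascade performs a single swap and then stops. A trained model would replace `toy_weights` with weights estimated from word-aligned data, where a crossing alignment between two adjacent source words provides the positive label.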

Khalil Sima'an | Maxim Khalilov
