Structural Feature Selection For English-Korean Statistical Machine Translation

When aligning texts in very different languages such as Korean and English, structural features beyond the word or phrase level provide useful information. In this paper, we present a method for selecting structural features of the two languages, from which we construct a model that assigns conditional probabilities to corresponding tag sequences in bilingual English-Korean corpora. For tag sequence mapping between the two languages, we first define a structural feature function that represents statistical properties of the empirical distribution of a set of training samples. The system, based on the maximum entropy principle, selects only those features that produce large increases in the log-likelihood of the training samples. These structurally mapped features provide more informative knowledge for statistical machine translation between English and Korean. The information can also help reduce the parameter space of statistical alignment by eliminating syntactically unlikely alignments.
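
The selection criterion described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy tag sequences, the function names (`make_feature`, `gain`, `select_features`), the uniform base model, and the threshold value are all assumptions made for illustration. Each candidate feature is an indicator that an English tag pattern maps to a Korean tag pattern, and its gain is approximated by fitting a single weight for that feature on top of the base model.

```python
import math

# Toy aligned tag sequences (English tags, Korean tags); invented for illustration.
SAMPLES = [
    (("DT", "NN"), ("NN",)),
    (("DT", "NN"), ("NN",)),
    (("VB", "NN"), ("NN", "VB")),
    (("VB", "NN"), ("NN", "VB")),
    (("DT", "NN"), ("NN", "VB")),
]
# Candidate Korean tag sequences the model chooses among (hypothetical closed set).
TARGETS = sorted({ko for _, ko in SAMPLES})


def make_feature(en, ko):
    """Indicator feature: 1.0 when English pattern `en` is paired with Korean pattern `ko`."""
    return lambda x, y: 1.0 if (x == en and y == ko) else 0.0


def _scores(feature, weight, x):
    """Unnormalized scores of every target for source x under a uniform base model."""
    base = 1.0 / len(TARGETS)
    return [base * math.exp(weight * feature(x, t)) for t in TARGETS]


def gain(feature, samples, steps=200, lr=1.0):
    """Approximate average log-likelihood gain from adding a single feature.

    The feature weight is fit by gradient ascent on the (concave) gain;
    the gradient is the empirical minus the model expectation of the feature.
    """
    weight = 0.0
    for _ in range(steps):
        grad = 0.0
        for x, y in samples:
            scores = _scores(feature, weight, x)
            z = sum(scores)
            expected = sum(s * feature(x, t) for s, t in zip(scores, TARGETS)) / z
            grad += feature(x, y) - expected
        weight += lr * grad / len(samples)
    total = 0.0
    for x, y in samples:
        total += weight * feature(x, y) - math.log(sum(_scores(feature, weight, x)))
    return total / len(samples)


def select_features(candidates, samples, threshold):
    """Keep only candidate (en, ko) pairs whose gain exceeds the threshold, best first."""
    scored = [(gain(make_feature(en, ko), samples), (en, ko)) for en, ko in candidates]
    return [(pair, g) for g, pair in sorted(scored, reverse=True) if g > threshold]


# Candidates here are the tag-sequence pairs observed in the toy data.
selected = select_features(sorted(set(SAMPLES)), SAMPLES, threshold=0.1)
for pair, g in selected:
    print(pair, round(g, 3))
```

In this toy setting only the consistently observed mapping passes the threshold, while the two ambiguous mappings for the same English pattern yield small gains and are discarded. A fuller system would re-estimate all weights after each selection and recompute gains against the updated model rather than a fixed uniform base.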
