Using POS information for statistical machine translation into morphologically rich languages

When translating from languages with hardly any inflectional morphology, such as English, into morphologically rich languages, the English word forms often do not contain enough information to produce the correct full form in the target language. We investigate methods for improving the quality of such translations by making use of part-of-speech information and maximum entropy modeling. Results for translations from English into Spanish and Catalan are presented on the LC-STAR corpus, which consists of spontaneously spoken dialogues in the domain of appointment scheduling and travel planning.
