Refined Lexicon Models for Statistical Machine Translation using a Maximum Entropy Approach

Typically, the lexicon models used in statistical machine translation systems do not include any kind of linguistic or contextual information, which often leads to problems with word sense disambiguation. One way to deal with this problem within the statistical framework is to use maximum entropy methods. In this paper, we show how to use such linguistic and contextual information within a statistical machine translation system. We show that it is possible to significantly decrease the training and test corpus perplexity of the translation models. In addition, we rescore N-best lists using our maximum entropy model and thereby obtain an improvement in translation quality. Experimental results are presented on the so-called "Verbmobil Task".
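As a rough illustration of the two evaluation ideas named in the abstract (corpus perplexity and N-best rescoring), the following minimal sketch may help. It is not the method of the paper: the function names, the additive combination of a baseline log-score with a weighted maximum entropy log-probability, and the `weight` parameter are all assumptions made for the example.

```python
import math
from typing import Callable, List, Tuple

# A hypothesis is (translation text, baseline log-score). Hypothetical layout.
Hypothesis = Tuple[str, float]


def perplexity(log_probs: List[float]) -> float:
    """Perplexity from per-token natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))


def rescore_nbest(
    nbest: List[Hypothesis],
    me_log_prob: Callable[[str], float],
    weight: float = 1.0,
) -> List[Hypothesis]:
    """Re-rank an N-best list, best hypothesis first.

    Assumption: the new score is the baseline log-score plus a weighted
    log-probability from some maximum entropy model (me_log_prob).
    """
    rescored = [(text, base + weight * me_log_prob(text)) for text, base in nbest]
    return sorted(rescored, key=lambda hyp: hyp[1], reverse=True)


if __name__ == "__main__":
    # Toy scores standing in for a real maximum entropy lexicon model.
    toy_me_scores = {"hypothesis a": -2.0, "hypothesis b": -0.5}
    nbest = [("hypothesis a", -1.0), ("hypothesis b", -1.2)]
    for text, score in rescore_nbest(nbest, toy_me_scores.__getitem__):
        print(text, score)
    print(perplexity([math.log(0.5), math.log(0.5)]))
```

In this toy setup the second hypothesis starts with a worse baseline score and can overtake the first once the additional model score is added, which is the effect rescoring is meant to achieve.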
