Statistical Phrase-Based Translation

We propose a new phrase-based translation model and decoding algorithm that enable us to evaluate and compare several previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to better understand and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models do not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.
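To make "heuristic learning of phrase translations from word-based alignments" concrete, the following is a minimal sketch of one common formulation: collect every source/target span pair that is consistent with a given word alignment, meaning no word inside the pair is aligned to a word outside it. It is an illustration under stated assumptions, not the exact procedure used in the paper. The function name `extract_phrase_pairs`, the toy sentence pair, and the alignment are hypothetical, and the sketch omits alignment symmetrization and the extension of phrases over unaligned boundary words. The default `max_len=3` mirrors the observation above that phrases longer than three words add little.

```python
from typing import List, Set, Tuple


def extract_phrase_pairs(
    src: List[str],
    tgt: List[str],
    alignment: Set[Tuple[int, int]],
    max_len: int = 3,
) -> Set[Tuple[str, str]]:
    """Return phrase pairs consistent with a word alignment.

    `alignment` holds (source_index, target_index) links.
    """
    pairs: Set[Tuple[str, str]] = set()
    for i1 in range(len(src)):
        # Source spans [i1, i2] of at most max_len words.
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to any word in the source span.
            tgt_pos = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt_pos:
                continue  # skip spans with no aligned words
            j1, j2 = min(tgt_pos), max(tgt_pos)
            if j2 - j1 + 1 > max_len:
                continue  # target side too long
            # Consistency check: no target word in [j1, j2] may be
            # linked to a source word outside [i1, i2].
            violated = any(
                j1 <= j <= j2 and not (i1 <= i <= i2)
                for (i, j) in alignment
            )
            if violated:
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


# Toy example (hypothetical data): a monotone one-to-one alignment.
src = ["das", "Haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
pairs = extract_phrase_pairs(src, tgt, alignment)
```

In a full system, the extracted pairs would then be counted over a parallel corpus to estimate phrase translation probabilities by relative frequency; that step is not shown here.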
