Phrasetable Smoothing for Statistical Machine Translation

We discuss different strategies for smoothing the phrasetable in statistical machine translation (SMT), and give results over a range of translation settings. We show that any type of smoothing improves on the relative-frequency estimates that are often used. The best smoothing techniques yield consistent gains of approximately 1% (absolute) according to the BLEU metric.
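To make the contrast between relative-frequency estimates and smoothed estimates concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not name its techniques, so the sketch uses absolute discounting with a fixed discount as one common example of smoothing. The function names, the toy counts, the discount value of 0.5, and the choice of the source-phrase marginal as the lower-order distribution are all hypothetical.

```python
from collections import Counter


def relative_frequency(pair_counts):
    """Unsmoothed estimate: p(src | tgt) = c(src, tgt) / c(tgt)."""
    tgt_totals = Counter()
    for (src, tgt), c in pair_counts.items():
        tgt_totals[tgt] += c
    return {(src, tgt): c / tgt_totals[tgt]
            for (src, tgt), c in pair_counts.items()}


def absolute_discount(pair_counts, discount=0.5):
    """Return a function prob(src, tgt) using fixed-discount smoothing.

    Assumes integer counts >= 1 and 0 <= discount <= 1. The mass removed
    from seen pairs is redistributed according to the marginal frequency
    of each source phrase (an illustrative choice of lower-order model).
    """
    tgt_totals = Counter()     # c(tgt)
    src_totals = Counter()     # c(src)
    distinct_srcs = Counter()  # number of distinct src phrases seen with tgt
    for (src, tgt), c in pair_counts.items():
        tgt_totals[tgt] += c
        src_totals[src] += c
        distinct_srcs[tgt] += 1
    total = sum(src_totals.values())

    def prob(src, tgt):
        lower_order = src_totals[src] / total
        if tgt_totals[tgt] == 0:
            # Unseen target phrase: fall back entirely to the lower-order model.
            return lower_order
        discounted = max(pair_counts.get((src, tgt), 0) - discount, 0.0)
        # Probability mass freed by discounting every seen pair for this tgt.
        alpha = discount * distinct_srcs[tgt] / tgt_totals[tgt]
        return discounted / tgt_totals[tgt] + alpha * lower_order

    return prob


# Toy phrase-pair counts (hypothetical data).
counts = {
    ("la maison", "the house"): 3,
    ("maison", "the house"): 1,
    ("la maison", "house"): 1,
}

rel = relative_frequency(counts)
smooth = absolute_discount(counts, discount=0.5)

# A pair seen once with a target phrase seen once gets probability 1.0
# under relative frequency, which is the overconfidence smoothing addresses.
print(rel[("la maison", "house")])
# The smoothed estimate is lower (about 0.9), and the unseen pair
# ("maison", "house") receives the remaining mass (about 0.1).
print(smooth("la maison", "house"), smooth("maison", "house"))
```

The point of the sketch is the singleton case: relative frequency assigns full confidence to a phrase pair observed only once, while the discounted estimate reserves some probability for source phrases not yet seen with that target phrase.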
