Better statistical estimation can benefit all phrases in phrase-based statistical machine translation

The heuristic estimates of conditional phrase translation probabilities are based on frequency counts in a word-aligned parallel corpus. Earlier attempts at more principled estimation using Expectation-Maximization (EM) underperform this heuristic. This paper shows that a recently introduced estimator based on smoothing might provide a good alternative. When all phrase pairs are estimated (no length cut-off), this estimator slightly outperforms the heuristic estimator.
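To make the frequency-count heuristic concrete, the following is a minimal sketch that assumes it takes the common relative-frequency form: the count of a phrase pair divided by the total count of its source phrase. The function name, the direction of conditioning, and the toy phrase pairs are illustrative assumptions, not taken from the paper, and the extraction of phrase pairs from the word-aligned corpus is assumed to have happened already.

```python
from collections import Counter


def relative_frequency_estimates(phrase_pairs):
    """Return p(target | source) as count(source, target) / count(source).

    `phrase_pairs` is an iterable of (source_phrase, target_phrase) tuples,
    one per extracted occurrence (hypothetical input format).
    """
    pair_counts = Counter(phrase_pairs)

    # Marginal count of each source phrase, summed over all its targets.
    source_counts = Counter()
    for (src, _tgt), count in pair_counts.items():
        source_counts[src] += count

    return {
        (src, tgt): count / source_counts[src]
        for (src, tgt), count in pair_counts.items()
    }


# Toy data, invented for illustration only.
extracted = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("das haus", "the building"),
    ("haus", "house"),
]
probs = relative_frequency_estimates(extracted)
```

In this toy example, "das haus" occurs three times, twice paired with "the house", so that pair receives probability 2/3. A pair seen only once with a rare source phrase receives probability 1, which illustrates the kind of sparse-data overconfidence that smoothing-based estimators aim to reduce.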
