Paper Information - A Systematic Comparison of Training Criteria for Statistical Machine Translation

We address the problem of training the free parameters of a statistical machine translation system. We show significant improvements over a state-of-the-art minimum error rate training baseline on a large Chinese-English translation task. We present novel training criteria based on maximum likelihood estimation and expected loss computation. Additionally, we compare the maximum a-posteriori decision rule and the minimum Bayes risk decision rule. We show that, not only from a theoretical point of view but also in terms of translation quality, the minimum Bayes risk decision rule is preferable.

Hermann Ney | Richard Zens | Sasa Hasan

[1] William H. Press, et al. Numerical recipes in C, 2002.

[2] Hermann Ney, et al. Discriminative Training and Maximum Entropy Models for Statistical Machine Translation, 2002, ACL.

[3] Hermann Ney. Stochastic Modelling: From Pattern Classification to Language Translation, 2001, DDMMT@ACL.

[4] John Cocke, et al. A Statistical Approach to Machine Translation, 1990, CL.

[5] Hermann Ney, et al. N-Gram Posterior Probabilities for Statistical Machine Translation, 2006, WMT@HLT-NAACL.

[6] Ashish Venugopal. Training and Evaluating Error Minimization Rules for Statistical Machine Translation, 2005.

[7] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.

[8] Hermann Ney, et al. The RWTH statistical machine translation system for the IWSLT 2006 evaluation, 2006, IWSLT.

[9] Philipp Koehn, et al. Re-evaluating the Role of Bleu in Machine Translation Research, 2006, EACL.

[10] Alon Lavie, et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, 2005, IEEvaluation@ACL.

[11] Chin-Yew Lin, et al. ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation, 2004, COLING.