Stabilizing Minimum Error Rate Training

The most commonly used method for training feature weights in statistical machine translation (SMT) systems is Och's minimum error rate training (MERT) procedure. A well-known problem with Och's procedure is that it tends to be sensitive to small changes in the system, particularly when the number of features is large. In this paper, we quantify the stability of Och's procedure by supplying different random seeds to a core component of the procedure (Powell's algorithm). We show that for systems with many features, there is extensive variation in outcomes, both on the development data and on the test data. We analyze the causes of this variation and propose modifications to the MERT procedure that increase stability while also improving performance on test data.
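
As a rough illustration of the kind of stability measurement described above (this is a sketch, not the paper's actual experimental setup: the toy objective, the dimensionality, and the number of seeds below are all assumptions for illustration), one can run Powell's method from several randomly seeded starting points and record the spread of final objective values, analogous to supplying different random seeds to the Powell's-algorithm component of MERT:

    # Illustrative sketch: measure how much Powell's method varies across
    # random starting points on a toy non-convex objective. In the actual
    # MERT setting, the objective would be the (negated) dev-set error metric
    # as a function of the feature weights.
    import numpy as np
    from scipy.optimize import minimize

    def toy_objective(w):
        # Hypothetical multi-modal stand-in for the dev-set error surface;
        # different starting points can converge to different local optima.
        return np.sum(np.sin(3.0 * w) * w ** 2) + 0.1 * np.sum(w ** 2)

    n_features = 20   # illustrative "many features" setting
    n_seeds = 10      # number of random seeds to compare

    final_values = []
    for seed in range(n_seeds):
        rng = np.random.default_rng(seed)
        w0 = rng.uniform(-1.0, 1.0, size=n_features)  # random starting weights
        result = minimize(toy_objective, w0, method="Powell")
        final_values.append(result.fun)

    final_values = np.array(final_values)
    print(f"best={final_values.min():.4f}  worst={final_values.max():.4f}  "
          f"std={final_values.std():.4f}")

The spread (standard deviation or best-worst gap) across seeds is the simple stability statistic this kind of experiment reports; a large spread in high dimensions mirrors the variation in dev- and test-set outcomes discussed above.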