Feature-Rich Discriminative Phrase Rescoring for SMT

This paper proposes a new approach to phrase rescoring for statistical machine translation (SMT). A set of novel features capturing the translingual equivalence between the source and target sides of a phrase pair is introduced. These features are combined using a linear regression model and a neural network to predict a quality score for each phrase translation pair. The predicted scores are then used to discriminatively rescore the baseline MT system's phrase library, boosting good phrase translations while pruning bad ones. This approach not only significantly improves machine translation quality, but also reduces the model size by a considerable margin.
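The rescoring step described above can be illustrated with a minimal sketch. It covers only the linear-model case. Everything named here is an assumption for illustration, not the paper's implementation: the `PhrasePair` layout, the feature values, the weights, the `prune_below` threshold, and the additive `boost` rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PhrasePair:
    source: str
    target: str
    features: List[float]   # hypothetical equivalence features
    baseline_score: float   # hypothetical score from the baseline phrase table


def quality_score(features: List[float], weights: List[float], bias: float) -> float:
    """Linear model: bias plus the weighted sum of the feature values."""
    return bias + sum(w * f for w, f in zip(weights, features))


def rescore_phrase_table(
    table: List[PhrasePair],
    weights: List[float],
    bias: float,
    prune_below: float,
    boost: float,
) -> List[Tuple[PhrasePair, float, float]]:
    """Drop pairs whose predicted quality is below `prune_below`;
    for the rest, add `boost * quality` to the baseline score (assumed rule)."""
    rescored = []
    for pair in table:
        q = quality_score(pair.features, weights, bias)
        if q < prune_below:
            continue  # prune a pair predicted to be a bad translation
        rescored.append((pair, q, pair.baseline_score + boost * q))
    return rescored
```

Pruning is what shrinks the phrase table in this sketch, while the additive boost is only one of several plausible ways to fold the predicted score back into the baseline model.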
