Meteor, M-BLEU and M-TER: Evaluation Metrics for High-Correlation with Human Rankings of Machine Translation Output
[1] Kishore Papineni,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[2] Franz Josef Och. Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.
[3] I. Dan Melamed,et al. Precision and Recall of Machine Translation , 2003, NAACL.
[4] Chin-Yew Lin,et al. ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation , 2004, COLING.
[5] Alon Lavie,et al. The significance of recall in automatic metrics for MT evaluation , 2004, AMTA.
[6] Satanjeev Banerjee,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[7] Matthew G. Snover,et al. A Study of Translation Error Rate with Targeted Human Annotation , 2005 .
[8] Matthew G. Snover,et al. A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.
[9] Gregor Leusch,et al. CDER: Efficient MT Evaluation Using Block Movements , 2006, EACL.
[10] Ding Liu,et al. Stochastic Iterative Alignment for Machine Translation Evaluation , 2006, ACL.
[11] Chris Callison-Burch,et al. (Meta-) Evaluation of Machine Translation , 2007, WMT@ACL.
[12] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.
[13] Yang Ye,et al. Sentence Level Machine Translation Evaluation as a Ranking , 2007, WMT@ACL.
[14] Wolfgang Macherey,et al. Lattice-based Minimum Error Rate Training for Statistical Machine Translation , 2008, EMNLP.