Measuring Confidence Intervals for MT Evaluation Metrics
[1] Franz Josef Och, et al. Minimum Error Rate Training in Statistical Machine Translation, 2003, ACL.
[2] Ying Zhang, et al. Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?, 2004, LREC.
[3] George R. Doddington, et al. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics, 2002.
[4] Robert Tibshirani, et al. Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy, 1986.
[5] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[6] I. Dan Melamed, et al. Precision and Recall of Machine Translation, 2003, NAACL.
[7] H. Ney, et al. A novel string-to-string distance measure with applications to machine translation evaluation, 2003, MTSUMMIT.
[8] Hermann Ney, et al. An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research, 2000, LREC.
[9] Hermann Ney, et al. Bootstrap estimates for confidence intervals in ASR performance evaluation, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[10] M. King, et al. FEMTI: creating and using a framework for MT evaluation, 2003, MTSUMMIT.