Paper information - Measuring Confidence Intervals for MT Evaluation Metrics - 字舞流文

Measuring Confidence Intervals for MT Evaluation Metrics

[1] Franz Josef Och, et al. Minimum Error Rate Training in Statistical Machine Translation, 2003, ACL.

[2] Ying Zhang, et al. Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?, 2004, LREC.

[3] George R. Doddington, et al. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics, 2002.

[4] Robert Tibshirani, et al. Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy, 1986.

[5] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[6] I. Dan Melamed, et al. Precision and Recall of Machine Translation, 2003, NAACL.

[7] H. Ney, et al. A novel string-to-string distance measure with applications to machine translation evaluation, 2003, MTSUMMIT.

[8] Hermann Ney, et al. An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research, 2000, LREC.

[9] Hermann Ney, et al. Bootstrap estimates for confidence intervals in ASR performance evaluation, 2004, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10] M. King, et al. FEMTI: creating and using a framework for MT evaluation, 2003, MTSUMMIT.