Corpus-based comprehensive and diagnostic MT evaluation: initial Arabic, Chinese, French, and Spanish results

We describe two metrics for the automatic evaluation of machine translation quality. These metrics, BLEU and NEE, are compared with human judgments of translation quality for Arabic, Chinese, French, and Spanish documents translated into English.
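To make the first of these metrics concrete, the sketch below shows the core BLEU computation (clipped n-gram precision combined with a brevity penalty). This is an illustrative re-implementation under simplifying assumptions (single reference per segment, no smoothing), not the evaluation code used in the experiments reported here; the function and variable names are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, with counts."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) times a brevity penalty.  `hypotheses` and `references`
    are parallel lists of token lists (one reference per segment here)."""
    matched = [0] * max_n   # clipped n-gram matches, per order
    total = [0] * max_n     # candidate n-grams, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each candidate n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())

    if min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty punishes candidates shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_precision)

# Toy usage with a single hypothesis/reference pair.
hyp = "the quick brown fox jumps over the lazy dog".split()
ref = "the quick brown fox jumped over the lazy dog".split()
print(corpus_bleu([hyp], [ref]))
```

In practice BLEU is computed against multiple references (clipping by the maximum count across references) and aggregated over a whole test corpus, which is the setting in which it is compared against human judgments in this study.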