Paper information - The taraXÜ corpus of human-annotated machine translations


Human translators are the key to evaluating machine translation (MT) quality and to addressing the so-far unanswered question of when and how to use MT in professional translation workflows. This paper describes the corpus developed as a result of a detailed, large-scale human evaluation consisting of three tightly connected tasks: ranking, error classification, and post-editing.
