TerrorCat: a Translation Error Categorization-based MT Quality Metric

We present TerrorCat, a submission to the WMT'12 metrics shared task. TerrorCat uses the frequencies of automatically obtained translation error categories as the basis for pairwise comparison of translation hypotheses; the pairwise comparisons are in turn used to generate a score for every translation. The metric shows high overall correlation with human judgements at the system level and more modest results at the level of individual sentences.
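The pipeline described above (error-category frequencies, then pairwise comparison, then a per-translation score) can be illustrated with a minimal sketch. This is not the authors' implementation: the category names in `CATEGORIES`, the hand-set `WEIGHTS`, the weighted-sum comparator `prefers_first`, and the win-fraction scoring in `score_hypotheses` are all assumptions made for illustration, since the abstract does not specify the category set, the comparison model, or the scoring formula.

```python
from collections import Counter
from itertools import combinations

# Hypothetical error categories; the abstract does not list the actual set.
CATEGORIES = ("missing", "extra", "lexical", "order")

# Hypothetical penalty weights standing in for a trained pairwise comparison model.
WEIGHTS = {"missing": 1.0, "extra": 0.5, "lexical": 1.0, "order": 0.75}


def error_frequencies(error_labels, length):
    """Convert a list of error labels into per-category relative frequencies."""
    counts = Counter(error_labels)
    return {cat: counts[cat] / max(length, 1) for cat in CATEGORIES}


def prefers_first(freq_a, freq_b):
    """Assumed comparator: True if A has a lower weighted error mass than B."""
    diff = sum(WEIGHTS[c] * (freq_a[c] - freq_b[c]) for c in CATEGORIES)
    return diff < 0


def score_hypotheses(hypotheses):
    """Score each hypothesis by the fraction of pairwise comparisons it wins.

    hypotheses maps a system name to (error labels, hypothesis length in tokens).
    """
    freqs = {
        name: error_frequencies(labels, length)
        for name, (labels, length) in hypotheses.items()
    }
    wins = {name: 0 for name in hypotheses}
    for a, b in combinations(hypotheses, 2):
        if prefers_first(freqs[a], freqs[b]):
            wins[a] += 1
        elif prefers_first(freqs[b], freqs[a]):
            wins[b] += 1
        # A tie awards no win to either side.
    n_opponents = max(len(hypotheses) - 1, 1)
    return {name: wins[name] / n_opponents for name in hypotheses}
```

For example, calling `score_hypotheses({"sysA": (["lexical"], 10), "sysB": (["missing", "order", "lexical"], 10), "sysC": ([], 10)})` ranks `sysC` (no detected errors) highest and `sysB` lowest. A learned classifier over the frequency vectors could replace `prefers_first` without changing the scoring step.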
