CharacTer: Translation Edit Rate on Character Level

Recently, the capability of character-level evaluation measures for machine translation output has been confirmed by several metrics. This work proposes translation edit rate on character level (CharacTER), which calculates the character level edit distance while performing the shift edit on word level. The novel metric shows high system-level correlation with human rankings, especially for morphologically rich languages. It outperforms the strong CHRF by up to 7% correlation on different metric tasks. In addition, we apply the hypothesis sentence length for normalizing the edit distance in CharacTER, which also provides significant improvements compared to using the reference sentence length.

[1]  K. Pearson VII. Note on regression and inheritance in the case of two parents , 1895, Proceedings of the Royal Society of London.

[2]  George R. Doddington,et al.  Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics , 2002 .

[3]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[4]  Ralph Weischedel,et al.  A STUDY OF TRANSLATION ERROR RATE WITH TARGETED HUMAN ANNOTATION , 2005 .

[5]  Matthew G. Snover,et al.  A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.

[6]  Alon Lavie,et al.  METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.

[7]  C. Spearman The proof and measurement of association between two things. , 2015, International journal of epidemiology.

[8]  Ondrej Bojar,et al.  Results of the WMT14 Metrics Shared Task , 2013 .

[9]  Lucia Specia,et al.  BLEU Deconstructed: Designing a Better MT Evaluation Metric , 2013, Int. J. Comput. Linguistics Appl..

[10]  Khalil Sima'an,et al.  BEER: BEtter Evaluation as Ranking , 2014, WMT@ACL.

[11]  Maja Popovic,et al.  chrF: character n-gram F-score for automatic MT evaluation , 2015, WMT@EMNLP.

[12]  Philipp Koehn,et al.  Results of the WMT15 Metrics Shared Task , 2015, WMT@EMNLP.

[13]  Ondrej Bojar,et al.  Results of the WMT13 Metrics Shared Task , 2015, WMT@EMNLP.

[14]  Ondrej Bojar,et al.  Results of the WMT16 Metrics Shared Task , 2016 .