Fitting Sentence Level Translation Evaluation with Many Dense Features

Sentence level evaluation in MT has turned out to be far more difficult than corpus level evaluation. Existing sentence level metrics employ a limited set of features, most of which are rather sparse at the sentence level, and their intricate models are rarely trained for ranking. This paper presents a simple linear model exploiting 33 relatively dense features, some of which are novel while others are known but seldom used, and trains it under the learning-to-rank framework. We evaluate our metric on the standard WMT12 data, showing that it outperforms the strong baseline METEOR. We also analyze the contribution of individual features and the choice of training data, language-pair-specific vs. target-language-specific, providing new insights into this task.
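
To make the training setup concrete, the following is a minimal sketch, not the paper's actual implementation, of fitting a linear sentence-level metric under pairwise learning-to-rank: human preference judgments between two hypotheses are reduced to binary classification on feature-vector differences, in the style of PRO-like pairwise training. The 33-dimensional features, the synthetic data, and the logistic-regression learner are all illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical dense feature vectors phi(hypothesis, reference);
# 33 dimensions, matching the number of features in the paper.
rng = np.random.default_rng(0)
n_pairs = 1000
better = rng.normal(0.5, 1.0, size=(n_pairs, 33))  # features of preferred hypotheses
worse = rng.normal(0.0, 1.0, size=(n_pairs, 33))   # features of dispreferred hypotheses

# Pairwise ranking as binary classification on feature differences:
# the learned linear model should score `better` above `worse`.
X = np.vstack([better - worse, worse - better])
y = np.concatenate([np.ones(n_pairs), np.zeros(n_pairs)])

clf = LogisticRegression(fit_intercept=False).fit(X, y)
w = clf.coef_.ravel()  # weight vector of the linear metric

# At test time the metric scores a single hypothesis-reference pair
# as a dot product of its feature vector with the learned weights.
phi = rng.normal(size=33)
print("metric score:", float(phi @ w))

Because the model is linear, the classifier's weight vector can be reused directly as the metric's feature weights at test time; this is one reason pairwise reductions of this kind are a common fit for linear evaluation metrics.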