论文信息 - Meteor++: Incorporating Copy Knowledge into Machine Translation Evaluation

Meteor++: Incorporating Copy Knowledge into Machine Translation Evaluation

In machine translation evaluation, a good candidate translation can be regarded as a paraphrase of the reference. We notice that some words are always copied during paraphrasing, which we call copy knowledge. Considering the stability of such knowledge, a good candidate translation should contain all these words appeared in the reference sentence. Therefore, in this participation of the WMT’2018 metrics shared task we introduce a simple statistical method for copy knowledge extraction, and incorporate it into Meteor metric, resulting in a new machine translation metric Meteor++. Our experiments show that Meteor++ can nicely integrate copy knowledge and improve the performance significantly on WMT17 and WMT15 evaluation sets.

[1] Timothy Baldwin,et al. Continuous Measurement Scales in Human Evaluation of Machine Translation , 2013, LAW@ACL.

[2] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[3] Alon Lavie,et al. Meteor 1.3: Automatic Metric for Reliable Optimization and Evaluation of Machine Translation Systems , 2011, WMT@EMNLP.

[4] Hang Li,et al. Paraphrase Generation with Deep Reinforcement Learning , 2017, EMNLP.

[5] Philipp Koehn,et al. Results of the WMT15 Metrics Shared Task , 2015, WMT@EMNLP.

[6] Alon Lavie,et al. Meteor Universal: Language Specific Translation Evaluation for Any Target Language , 2014, WMT@ACL.

[7] Hang Li,et al. “ Tony ” DNN Embedding for “ Tony ” Selective Read for “ Tony ” ( a ) Attention-based Encoder-Decoder ( RNNSearch ) ( c ) State Update s 4 SourceVocabulary Softmax Prob , 2016 .

[8] Navdeep Jaitly,et al. Pointer Networks , 2015, NIPS.

[9] Christopher D. Manning,et al. Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.

[10] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.

[11] Ondrej Bojar,et al. Results of the WMT17 Metrics Shared Task , 2017, WMT.

[12] Philipp Koehn,et al. Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation , 2010, WMT@ACL.

[13] Steven Bird,et al. NLTK: The Natural Language Toolkit , 2002, ACL.