Paper information - “This sentence is wrong.” Detecting errors in machine-translated sentences

“This sentence is wrong.” Detecting errors in machine-translated sentences

Machine translation systems are not reliable enough to be used “as is”: except for the simplest tasks, they can only be used to grasp the general meaning of a text or to assist human translators. The purpose of confidence measures is to detect erroneous words or sentences produced by a machine translation system. In this article, after reviewing the mathematical foundations of confidence estimation, we compare several state-of-the-art confidence measures, predictive parameters and classifiers. We also propose two original confidence measures based on Mutual Information and a method for automatically generating data for training and testing classifiers. We applied these techniques to data from the 2008 WMT campaign and found that the best confidence measures yielded an Equal Error Rate of 36.3% at word level and 34.2% at sentence level, and that combining different measures reduced these rates to 35.0% and 29.0%, respectively. We also present the results of an experiment aimed at determining how helpful confidence measures are in a post-editing task. Preliminary results suggest that our system is not yet ready to help post-editors efficiently, but we now have both software and a protocol that we can apply to further experiments, and user feedback has indicated aspects that must be improved to make confidence measures more helpful.
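The abstract reports its results as Equal Error Rates. As a rough illustration of that metric only, the sketch below sweeps a decision threshold over confidence scores and returns the point where the false acceptance rate (erroneous items accepted as correct) is closest to the false rejection rate (correct items flagged as errors). The function name `equal_error_rate`, the "accept if score >= threshold" convention, and the toy data are assumptions made for this example; the paper's exact definition and evaluation procedure may differ.

```python
import math


def equal_error_rate(scores, labels):
    """Approximate the Equal Error Rate of a confidence measure.

    scores: confidence values, higher meaning "more likely correct" (assumed).
    labels: True where the word or sentence is actually correct.
    Returns (eer, threshold) at the threshold where the false acceptance
    and false rejection rates are closest to each other.
    """
    n_correct = sum(1 for y in labels if y)
    n_wrong = len(labels) - n_correct
    if n_correct == 0 or n_wrong == 0:
        raise ValueError("need at least one correct and one erroneous item")

    best = None
    # Candidate thresholds: every observed score, plus "reject everything".
    for t in sorted(set(scores)) + [math.inf]:
        # An item is accepted as correct when its score is >= t.
        far = sum(1 for s, y in zip(scores, labels) if s >= t and not y) / n_wrong
        frr = sum(1 for s, y in zip(scores, labels) if s < t and y) / n_correct
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy data, invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [True, True, False, True, False, False]
eer, threshold = equal_error_rate(scores, labels)
print(f"EER = {eer:.3f} at threshold {threshold}")
```

With finitely many scores the two error rates may never be exactly equal, so this sketch reports the average of the two rates at the closest threshold; interpolating between neighbouring thresholds is another common choice.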

Kamel Smaïli | Sylvain Raybaud | David Langlois
