ATEC: automatic evaluation of machine translation via word choice and word order

We propose ATEC, a novel metric for automatic MT evaluation based on explicit assessment of word choice and word order in an MT output against its reference translation(s), the two most fundamental factors in constructing the meaning of a sentence. Word choice is assessed by matching word forms at various linguistic levels, including surface form, stem, sound and sense, and further by weighting the informativeness of each word. Word order is quantified in terms of the discordance of word position and word sequence between a translation candidate and its reference. In evaluations on the MetricsMATR08 data set and the LDC MTC2 and MTC4 corpora, ATEC demonstrates a strong positive correlation with human judgments at the segment level, comparable to state-of-the-art evaluation metrics.
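The abstract does not give ATEC's actual formulas, matching levels, or weights, so the following is only a minimal toy sketch of the two ideas it names: multi-level word matching for word choice and a position-discordance penalty for word order. The naive stemmer, the 1.0/0.8 level weights, and the way the two components are combined are all illustrative assumptions, not the published metric.

```python
# Illustrative sketch only: all weights, the stemmer stand-in, and the score
# combination below are assumptions; they are not the published ATEC formulas.

def naive_stem(word: str) -> str:
    """Crude stand-in for a real stemmer (ATEC matches at stem level, among others)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def match_words(candidate: list[str], reference: list[str]) -> list[tuple[int, int, float]]:
    """Greedy one-to-one matching: exact surface matches first (weight 1.0),
    then stem-level matches (assumed partial weight 0.8)."""
    matches: list[tuple[int, int, float]] = []
    used_ref: set[int] = set()
    for level, weight in ((lambda w: w, 1.0), (naive_stem, 0.8)):
        for i, c in enumerate(candidate):
            if any(i == mi for mi, _, _ in matches):
                continue  # candidate word already matched at a higher level
            for j, r in enumerate(reference):
                if j not in used_ref and level(c) == level(r):
                    matches.append((i, j, weight))
                    used_ref.add(j)
                    break
    return matches


def atec_like_score(candidate: list[str], reference: list[str]) -> float:
    """Combine word-choice credit with a word-position discordance penalty."""
    matches = match_words(candidate, reference)
    if not matches:
        return 0.0
    # Word choice: weighted match count, normalised by the longer sentence length.
    choice = sum(w for _, _, w in matches) / max(len(candidate), len(reference))
    # Word order: mean normalised distance between matched positions.
    discord = sum(
        abs(i / len(candidate) - j / len(reference)) for i, j, _ in matches
    ) / len(matches)
    return choice * (1.0 - discord)


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "on the mat the cat sat".split()
    print(round(atec_like_score(hyp, ref), 3))  # perfect word choice, penalised order
```

In the toy example the candidate contains every reference word, so the word-choice term is 1.0, and the score is reduced only by the positional discordance of the reordered phrases; this mirrors, in spirit, how ATEC separates word choice from word order.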
