Linguistic-based Evaluation Criteria to identify Statistical Machine Translation Errors

Machine translation evaluation methods are necessary for analyzing the performance of translation systems. To date, the most common methods have been automatic measures such as BLEU and perceived-quality judgments made by native speakers. To complement these traditional procedures, this paper presents a new human evaluation based on expert knowledge of the errors encountered at several linguistic levels: orthographic, morphological, lexical, semantic and syntactic. The results of these experiments show that some linguistic errors may have more influence than others on a perceptual evaluation.
