Quality Estimation of English-French Machine Translation: A Detailed Study of the Role of Syntax

We investigate the usefulness of syntactic knowledge in estimating the quality of English-French translations. We find that dependency and constituency tree kernels perform well, but the error rate can be further reduced when these are combined with hand-crafted syntactic features. Both types of syntactic features provide information which is complementary to tried-and-tested non-syntactic features. We then compare source and target syntax and find that neither the use of parse trees of machine-translated sentences nor the intrinsic accuracy of the parser itself affects the performance of quality estimation. However, the relatively flat structure of the French Treebank does appear to have an adverse effect, and performance improves significantly when simple transformations are applied to the French trees. Finally, we provide further evidence of the usefulness of these transformations by applying them in a separate task: parser accuracy prediction.
