Linguistic Evaluation of Support Verb Constructions by OpenLogos and Google Translate

This paper presents a systematic human evaluation of translations of English support verb constructions (SVCs) produced by a rule-based machine translation (RBMT) system (OpenLogos) and a statistical machine translation (SMT) system (Google Translate) into five languages: French, German, Italian, Portuguese and Spanish. We classify SVCs according to their syntactic structure and semantic behavior and present a qualitative analysis of their translation errors. The study examines how machine translation (MT) systems translate fine-grained linguistic phenomena, and how well equipped they are to produce high-quality translation. Another goal of this linguistically motivated quality analysis of raw SVC output is to reinforce the need for better system hybridization, which leverages the strengths of RBMT to the benefit of SMT, especially in improving the translation of multiword units. Taking multiword units into account, we propose an effective method to achieve MT hybridization based on the integration of semantico-syntactic knowledge into SMT.
