Machine Translation at Work

Machine translation (MT) is, and not only historically, a prime application of language technology. After years of seeming stagnation, price pressure on language service providers (LSPs) and growing demand for translation have given new momentum to the inclusion of MT in industrial translation workflows. On the research side, this trend is backed by improvements in translation performance, especially in the area of hybrid MT approaches. Nevertheless, translation quality is clearly still far from perfect in many applications, so human post-editing currently seems to be the only viable option. This chapter reports on a system that is being developed as part of taraXÜ, an ongoing joint project between industry and research partners. By combining state-of-the-art language technology applications, developing informed selection mechanisms that use the outputs of different MT engines, and incorporating qualified translator feedback throughout the development process, the project aims to make MT economically feasible and technically usable.
