Findings of the 2009 Workshop on Statistical Machine Translation

This paper presents the results of the WMT09 shared tasks, which included a translation task, a system combination task, and an evaluation task. We conducted a large-scale manual evaluation of 87 machine translation systems and 22 system combination entries. We used the resulting ranking of these systems to measure how strongly more than 20 automatic metrics correlate with human judgments of translation quality. We also present a new evaluation technique in which system output is edited and then judged for correctness.
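To make the idea of correlating a metric with the human ranking concrete, the sketch below computes a rank correlation between two orderings of the same systems. It assumes Spearman's rank correlation coefficient as the statistic, which the abstract does not name, and it assumes there are no tied scores. The function names and the example scores are hypothetical and are not results from the workshop.

```python
def ranks(scores):
    """Rank systems from best (1) to worst (n).

    Assumes higher scores are better and that no two scores are tied.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(human_scores, metric_scores):
    """Spearman's rho via the no-ties formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(human_scores)
    if n != len(metric_scores) or n < 2:
        raise ValueError("need two score lists of equal length >= 2")
    human_ranks = ranks(human_scores)
    metric_ranks = ranks(metric_scores)
    d_squared = sum((h - m) ** 2 for h, m in zip(human_ranks, metric_ranks))
    return 1 - 6 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    # Hypothetical scores for four systems (illustration only).
    human = [0.71, 0.64, 0.52, 0.40]   # e.g. share of favourable human judgments
    metric = [0.29, 0.31, 0.22, 0.18]  # e.g. an automatic metric's system score
    print(spearman_rho(human, metric))  # approximately 0.8
```

A value near 1 means the metric orders the systems much as the human judges did, a value near -1 means it reverses their order, and a value near 0 means the two orderings are unrelated.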
