Exploiting Objective Annotations for Minimising Translation Post-editing Effort

With the noticeable improvement in the overall quality of Machine Translation (MT) systems in recent years, post-editing of MT output is becoming common practice among human translators. However, it is well known that the quality of a given MT system can vary significantly across translation segments, and that post-editing poor-quality translations is a tedious task that may require more effort than translating from scratch. Previous research on learning quality estimation models to flag such segments has shown that models based on human annotation achieve more promising results. However, it is not yet clear which form of human annotation is most appropriate for building such models. We experiment with models based on three annotation types (post-editing time, post-editing distance, and post-editing effort scores) and show that estimates derived from post-editing time, a simple and objective annotation, can minimise translation post-editing effort in a practical, task-based scenario. We also discuss the effectiveness, reliability and cost of each type of annotation.
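
For illustration, below is a minimal sketch (not the paper's actual system) of a sentence-level quality estimation model of this kind: a regressor is trained to predict post-editing time per segment from simple surface features of the source and its machine translation, and segments with high predicted effort can then be flagged for translation from scratch. The feature set, epsilon-SVR hyperparameters, and toy training data are illustrative assumptions only.

```python
# Sketch of a sentence-level quality estimation regressor predicting
# post-editing time (seconds) per segment. Features and data are toy
# assumptions; real QE systems use far richer feature sets.
import numpy as np
from sklearn.svm import SVR


def features(source: str, mt_output: str) -> list:
    """A handful of illustrative surface features for one segment pair."""
    src_tokens = source.split()
    mt_tokens = mt_output.split()
    src_len = len(src_tokens)
    mt_len = len(mt_tokens)
    return [
        src_len,                                           # source length
        mt_len,                                            # MT output length
        mt_len / max(src_len, 1),                          # length ratio
        sum(len(t) for t in mt_tokens) / max(mt_len, 1),   # avg MT token length
    ]


# Hypothetical annotated data: (source, MT output, observed post-editing time).
train = [
    ("the house is small", "la casa es pequena", 12.0),
    ("he did not attend the meeting yesterday", "el no asistio a la reunion ayer", 25.0),
    ("results vary significantly across segments", "los resultados varian mucho", 40.0),
]

X = np.array([features(src, mt) for src, mt, _ in train])
y = np.array([t for _, _, t in train])

# Epsilon-SVR regressor (the kind of model provided by toolkits such as LIBSVM).
model = SVR(kernel="rbf", C=10.0, epsilon=0.5)
model.fit(X, y)

# Predicted post-editing time for a new segment; a high value would flag the
# segment as likely cheaper to translate from scratch than to post-edit.
new_segment = features("this is a test", "esto es una prueba")
print(model.predict(np.array([new_segment])))
```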
