Unobtrusive methods for low-cost manual evaluation of machine translation.

Machine translation (MT) evaluation metrics based on n-gram co-occurrence statistics are cheap to run, and their value in comparative research is well documented; however, their value as a standalone measure of MT output quality is questionable. Manual methods of MT evaluation, in contrast, are expensive. This paper presents early research carried out within the CNGL (Centre for Next Generation Localisation) on a low-cost, operationalised means of acquiring MT evaluation data in a commercial post-edited MT (PEMT) context. The approach exposes translators to output from a set of candidate MT systems and reports which system requires the least post-editing. It is hoped that this approach, combined with instrumentation mechanisms for tracking the performance and behaviour of individual post-editors, will give insight into which MT system, if any, among a set of candidates is best suited to a particular large or ongoing technical translation project. In the longer term, we propose that post-editing data gathered in a commercial context may be valuable to MT researchers.
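To make the ranking idea concrete, the sketch below shows one way a post-editing effort proxy could be computed and used to order candidate systems. The effort measure chosen here (a character-level Levenshtein distance between raw MT output and its post-edited form, normalised by the length of the post-edited segment) and all function and system names are illustrative assumptions, not the instrumentation described in the paper.

```python
# Minimal sketch, assuming post-editing effort is approximated by a
# normalised character-level edit distance; the paper's actual
# instrumentation may track different signals (time, keystrokes, etc.).

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def post_edit_effort(mt_output: str, post_edited: str) -> float:
    """Edit distance from raw MT output to the post-edited translation,
    normalised by the post-edited length (0.0 = no editing needed)."""
    if not post_edited:
        return 0.0 if not mt_output else 1.0
    return levenshtein(mt_output, post_edited) / len(post_edited)

def rank_systems(segments: dict[str, list[tuple[str, str]]]) -> list[tuple[str, float]]:
    """segments maps system name -> list of (mt_output, post_edited) pairs.
    Returns systems sorted by mean post-editing effort, lowest (best) first."""
    scores = {
        system: sum(post_edit_effort(mt, pe) for mt, pe in pairs) / len(pairs)
        for system, pairs in segments.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])

if __name__ == "__main__":
    # Hypothetical data: two candidate systems, one post-edited segment each.
    data = {
        "system_A": [("the cat sat on mat", "the cat sat on the mat")],
        "system_B": [("cat on the mat sat", "the cat sat on the mat")],
    }
    for system, effort in rank_systems(data):
        print(f"{system}: mean post-edit effort {effort:.2f}")
```

In a production setting such a proxy would be gathered unobtrusively from the post-editors' normal workflow rather than from a separate evaluation exercise, which is the low-cost aspect the paper emphasises.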
