Towards a Corpus-based, Statistical Approach to Translation Quality: Measuring and Visualizing Linguistic Deviance in Student Translations

In this article we present a corpus-based statistical approach to measuring translation quality, more particularly translation acceptability, by comparing the features of translated and original texts. We discuss initial findings that aim to support formative quality assessment and make it more objective. To that end, we extract a wide range of linguistic and textual features from both student and professional translation corpora, which consist of many different translations by several translators in two genres (fiction, news) and in two translation directions (English to French and French to Dutch). The numerical information gathered from these corpora is explored with Principal Component Analysis, which enables us to identify stable, language-independent linguistic and textual indicators that set student translations apart from translations produced by professionals. The differences between these two types of translation are subsequently tested by means of ANOVA. The results indicate that the proposed methodology is capable of distinguishing between student and professional translations. We argue that this deviant behaviour indicates an overall lower translation quality in student translations: student translations tend to score lower at the acceptability level, that is, they deviate significantly from target-language norms and conventions. In addition, the proposed methodology is capable of assessing the acceptability of an individual student's translation: a smaller linguistic distance between a given student translation and the norm set by the professional translations correlates with higher quality. The methodology can also give the student objective and concrete feedback about the linguistic dimensions on which the translation diverges from that norm.
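The two quantitative ideas in the abstract, a distance between one student translation and the professional norm, and an ANOVA comparing the two groups, can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (mean sentence length, type-token ratio), the toy values, and the helper names `zscore_profile`, `distance_to_norm` and `anova_f` are all assumptions made for illustration, and the Principal Component Analysis step is left out, so the distance is taken directly on standardized features rather than on principal components.

```python
from math import sqrt
from statistics import mean, stdev


def zscore_profile(text_features, norm_corpus):
    """Standardize one text's features against the norm corpus.

    Each value becomes the number of standard deviations by which the
    text differs from the professional mean on that feature.
    """
    profile = []
    for j, value in enumerate(text_features):
        column = [row[j] for row in norm_corpus]
        profile.append((value - mean(column)) / stdev(column))
    return profile


def distance_to_norm(text_features, norm_corpus):
    """Euclidean length of the z-score profile (0 means 'exactly at the norm')."""
    return sqrt(sum(z * z for z in zscore_profile(text_features, norm_corpus)))


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of scalar values."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical toy data: [mean sentence length, type-token ratio] per text.
professional = [[20.0, 0.50], [22.0, 0.52], [18.0, 0.48], [20.0, 0.50]]
students = [[26.0, 0.44], [25.0, 0.45], [24.0, 0.47], [27.0, 0.43]]

# Per-feature feedback for one student text, plus its overall distance.
feedback = zscore_profile(students[0], professional)
overall = distance_to_norm(students[0], professional)

# Group comparison on a single feature (here: mean sentence length).
f_sentence_length = anova_f([[row[0] for row in professional],
                             [row[0] for row in students]])
```

In this sketch the z-score profile plays the role of the "concrete feedback" mentioned above: its signs and magnitudes show on which features a text departs from the professional texts, while the single distance value summarizes how far it lies from the norm overall.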
