Identifying universals of text translation

Abstract: Straightforward quantitative analyses of authentic texts have allowed linguists and translation scholars to discern patterns in individual languages as well as features which set translations apart from originals (Baker, 1993; Chesterman, 2004). A language can also be studied statistically, an approach epitomized by the application of Zipf's Law (Zipf, 1949), which states that word-frequency distributions follow an almost identical curve regardless of language. To date, no universal behaviour governing the joint probability distribution of words in two or more languages has been either proposed or observed. This study identifies new universals which characterize the mutual overlaps between a corpus of original English and three corpora of translated English. Specifically, it suggests a remarkable similarity in (a) the number of types unique to each translated corpus, and (b) the number of types common to the original-English corpus and each of the translated corpora. We argue that these universal behaviours can be used both to determine the ontological status of an unidentified language (whether it is an original or a translation) and to identify the source language of a translation.
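The two quantities named in (a) and (b) can be illustrated with simple set operations over word types. The following is a minimal sketch and not the authors' procedure: the tokenizer, the function names `word_types` and `overlap_counts`, and the reading of "unique" as "absent from the original-English corpus" are all assumptions made for illustration.

```python
import re


def word_types(text):
    """Return the set of distinct lowercased word types in a text.

    Assumption: a type is any run of ASCII letters or apostrophes;
    the study's own tokenization is not specified in the abstract.
    """
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap_counts(original_text, translated_text):
    """Count types shared with the original and types found only in the translation."""
    original = word_types(original_text)
    translated = word_types(translated_text)
    return {
        # (b) types common to the original corpus and the translated corpus
        "common": len(original & translated),
        # (a) types in the translated corpus that are absent from the original
        # (assumed meaning of "unique")
        "unique_to_translation": len(translated - original),
    }


if __name__ == "__main__":
    # Toy strings standing in for corpora; real corpora would be far larger.
    print(overlap_counts("the cat sat on the mat", "the dog sat on a mat"))
```

Running the same function on each of the three translated corpora against the single original-English corpus would yield the counts whose similarity the abstract describes.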

[1]  D. Kenny Creatures of Habit? What Translators Usually Do with Words , 1998 .

[2]  George Carayannis, et al.  Word Length, Word Frequencies and Zipf’s Law in the Greek Language , 2001, J. Quant. Linguistics.

[3]  Mona Baker, et al.  Corpus Linguistics and Translation Studies: Implications and Applications , 1993 .

[4]  J. Sutherland The Quark and the Jaguar , 1994 .

[5]  Kanter, et al.  Markov processes: Linguistics and Zipf's law. , 1995, Physical review letters.

[6]  S.-W. Choi Some Statistical Properties and Zipf’s Law in Korean Text Corpus , 2000, J. Quant. Linguistics.

[7]  M. Gell-Mann The Quark and the Jaguar: adventures in the simple and the complex , 1994 .

[8]  George Kingsley Zipf, et al.  Human behavior and the principle of least effort , 1949 .

[9]  J. House, et al.  Shifts of Cohesion and Coherence in Translation , 1996 .

[10]  Ido Kanter, et al.  Markov processes and linguistics , 1998 .

[11]  S. Blum, et al.  Universals of Lexical Simplification , 1978 .

[12]  Juliane House, et al.  Interlingual and intercultural communication: discourse and cognition in translation and second language acquisition studies , 1986 .

[13]  S. Tirkkonen-Condit Unique items — over- or under-represented in translated language? , 2004 .

[14]  R. Mantegna, et al.  Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics. , 1995, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[15]  Jarmo Jantunen Untypical patterns in translations: Issues on corpus methodology and synonymity , 2004 .

[16]  P-O Nilsson Investigating characteristic lexical distributions and grammatical patterning in Swedish texts translated from English , .

[17]  Rita Vanderauwera Dutch Novels Translated Into English. The Transformation of a Minority Literature , 1985 .

[18]  A. Chesterman Hypotheses about translation universals , 2004 .