Corpus-driven study of translation units in an English-Chinese parallel corpus

It is widely accepted that texts are not translated word by word but in larger units which are, from the perspective of the target language, more or less monosemous. This dissertation argues that translation units are the smallest such units and that they can be identified in parallel corpora. It aims to show that translation units and their target-language equivalents can be extracted from parallel corpora and re-used to facilitate new translations. The concept of translation units and their equivalents will enable translators to translate competently into languages other than their native language, a task that traditional bilingual dictionaries do not sufficiently support. The exploratory study presented here uses the Hong Kong Legal Document Parallel Corpus (HKLDC). The dissertation first defines the concept of the translation unit and its equivalent, then describes a method for extracting translation unit candidates, which are validated by further analysis. It also tests the hypothesis that each complete translation unit has only one translation equivalent. Finally, by comparing the translation equivalents extracted from the corpus with those provided by traditional dictionaries, it argues that parallel corpora, as repositories of translation units and translation equivalents, can facilitate translation by complementing traditional translation aids.
