Resolving Translation Ambiguity using Non-parallel Bilingual Corpora

This paper presents an unsupervised method for choosing the correct translation of a word in context. The method learns disambiguation information from untagged, non-parallel bilingual corpora (preferably in the same domain). It combines two existing unsupervised disambiguation algorithms: a word sense disambiguation algorithm based on distributional clustering and a translation disambiguation algorithm that uses target language corpora. Given a word in context, the former algorithm identifies its meaning as one of a number of predefined usage classes, which are derived by clustering a large number of usages in the source language corpus. The latter algorithm associates each usage class (i.e., cluster) with the target word most relevant to that usage. The paper also reports preliminary results of translation experiments.
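
The two-stage pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the clustering procedure, or the scoring function, so the cosine similarity, the co-occurrence score, and every name and data value below (`centroids`, `dictionary`, `target_cooc`, `candidates`) are assumptions invented for illustration. The sketch assumes the usage clusters have already been derived from the source language corpus and are represented by bag-of-words centroids.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_cluster(context, centroids):
    """Stage 1 (assumed form): pick the usage class closest to the context."""
    vec = Counter(context)
    return max(centroids, key=lambda name: cosine(vec, centroids[name]))


def label_cluster(centroid, candidates, dictionary, target_cooc):
    """Stage 2 (assumed form): pick the candidate target word that co-occurs
    most, in the target corpus, with translations of the cluster's words."""
    def score(target):
        return sum(
            weight * target_cooc.get((target, translated), 0)
            for word, weight in centroid.items()
            for translated in dictionary.get(word, [])
        )
    return max(candidates, key=score)


# Hypothetical toy data for the English word "bank" translated into French.
centroids = {
    "finance": Counter({"money": 3, "loan": 2, "account": 2}),
    "river": Counter({"water": 3, "river": 2, "fish": 1}),
}
# Hypothetical bilingual dictionary; "loan" is deliberately missing.
dictionary = {
    "money": ["argent"], "account": ["compte"],
    "water": ["eau"], "river": ["fleuve"], "fish": ["poisson"],
}
# Hypothetical co-occurrence counts from a target language corpus.
target_cooc = {
    ("banque", "argent"): 5, ("banque", "compte"): 4, ("banque", "eau"): 1,
    ("rive", "eau"): 6, ("rive", "fleuve"): 3,
}
candidates = ["banque", "rive"]

# Associate each usage class with its most relevant target word once, up front.
labels = {
    name: label_cluster(centroid, candidates, dictionary, target_cooc)
    for name, centroid in centroids.items()
}


def translate(context):
    """Translate the ambiguous word given the words surrounding it."""
    return labels[nearest_cluster(context, centroids)]
```

In this sketch the cluster-to-target-word mapping is computed once from the corpora, so translating a new occurrence only requires assigning its context to a usage class. No parallel text is consulted at any point; the only bilingual resource is the word-level dictionary.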