Word Sense Disambiguation Using Automatically Translated Sense Examples

We present an unsupervised approach to Word Sense Disambiguation (WSD). We automatically acquire English sense examples using an English-Chinese bilingual dictionary, Chinese monolingual corpora, and Chinese-English machine translation software. We then train machine learning classifiers on these sense examples and test them on two gold-standard English WSD datasets, one for binary and the other for fine-grained sense identification. On binary disambiguation, the performance of our unsupervised system approaches that of state-of-the-art supervised systems. On multi-way disambiguation, it achieves results competitive with other state-of-the-art unsupervised systems. Because our approach does not rely on manually annotated resources, such as sense-tagged data or parallel corpora, these results are promising.
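The acquisition-and-training pipeline described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: the sense labels, the toy dictionary `SENSE_TO_CHINESE`, the four-sentence `CHINESE_CORPUS`, and the lookup table that plays the role of the machine translation software are all invented for illustration, and the smoothed Naive Bayes classifier is simply one possible choice, not necessarily the classifier used in the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy resources (assumptions, not the paper's actual data).
# Each English sense maps to its Chinese translations from a bilingual dictionary.
SENSE_TO_CHINESE = {
    "bank/finance": ["银行"],
    "bank/river": ["河岸"],
}

# Pre-segmented Chinese monolingual corpus (toy).
CHINESE_CORPUS = [
    "他 在 银行 存 钱",
    "银行 提供 贷款",
    "他们 沿着 河岸 散步",
    "河岸 上 有 树",
]

# Stand-in for Chinese-English MT software: a fixed lookup table.
TOY_TRANSLATIONS = {
    "他 在 银行 存 钱": "he deposits money in the bank",
    "银行 提供 贷款": "the bank offers a loan",
    "他们 沿着 河岸 散步": "they walk along the river bank",
    "河岸 上 有 树": "there are trees on the river bank",
}


def translate(sentence):
    """Return the (toy) English translation of a Chinese sentence."""
    return TOY_TRANSLATIONS[sentence]


def acquire_sense_examples(sense_to_chinese, corpus, translate_fn):
    """Collect (English text, sense) pairs without any manual annotation."""
    examples = []
    for sense, zh_words in sense_to_chinese.items():
        for sentence in corpus:
            tokens = sentence.split()
            # A sentence containing a sense's Chinese translation is taken
            # as an example of that sense once translated into English.
            if any(word in tokens for word in zh_words):
                examples.append((translate_fn(sentence), sense))
    return examples


def train_naive_bayes(examples):
    """Count words per sense; returns (word_counts, sense_counts, vocab)."""
    word_counts = defaultdict(Counter)
    sense_counts = Counter()
    vocab = set()
    for text, sense in examples:
        sense_counts[sense] += 1
        for word in text.split():
            word_counts[sense][word] += 1
            vocab.add(word)
    return word_counts, sense_counts, vocab


def classify(model, text):
    """Pick the sense with the highest add-one-smoothed log probability."""
    word_counts, sense_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense in sense_counts:
        score = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


examples = acquire_sense_examples(SENSE_TO_CHINESE, CHINESE_CORPUS, translate)
model = train_naive_bayes(examples)
print(classify(model, "she went to the bank to deposit money"))
print(classify(model, "trees grow along the river"))
```

In this sketch the sense label comes for free from the dictionary entry that matched the Chinese sentence, which is the reason no sense-tagged data or parallel corpus is needed. A real system would replace the lookup table with actual MT output and the toy corpus with large monolingual text.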
