Paper Information - Discovering the Senses of an Ambiguous Word by Clustering its Local Contexts

Discovering the Senses of an Ambiguous Word by Clustering its Local Contexts

As has been shown recently, it is possible to automatically discover the senses of an ambiguous word by statistically analyzing its contextual behavior in a large text corpus. However, this kind of research is still at an early stage. The results need to be improved and there is considerable disagreement on methodological issues. For example, although most researchers use clustering approaches for word sense induction, it is not clear what statistical features the clustering should be based on. Whereas so far most researchers cluster global co-occurrence vectors that reflect the overall behavior of a word in a corpus, in this paper we argue that it is more appropriate to use local context vectors. We support our view by comparing both approaches and by discussing their strengths and weaknesses.
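The distinction between the two feature types can be illustrated with a minimal sketch. It is not the paper's actual procedure: the toy corpus, the stopword list, the window size, the similarity threshold, and the greedy single-pass clustering are all assumptions chosen to keep the example short. Each occurrence of the ambiguous word gets its own local context vector, and these vectors are clustered. The global co-occurrence vector, by contrast, sums all occurrences into one vector in which the senses are mixed.

```python
from collections import Counter
from math import sqrt

# Hypothetical stopword list, an assumption made for this toy example only.
STOPWORDS = {"the", "of", "on", "in", "from", "his", "her", "he", "she"}

def local_context_vectors(sentences, target, window=3):
    """Return one bag-of-words vector per occurrence of `target`."""
    vectors = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == target:
                context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
                vectors.append(Counter(w for w in context if w not in STOPWORDS))
    return vectors

def global_vector(local_vectors):
    """Sum all local vectors into a single vector for the word as a whole."""
    total = Counter()
    for vec in local_vectors:
        total.update(vec)
    return total

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0

def cluster(vectors, threshold=0.2):
    """Greedy single-pass clustering (an assumed stand-in for a real
    clustering algorithm): join the most similar existing cluster if its
    centroid similarity reaches `threshold`, otherwise start a new cluster."""
    clusters = []  # each entry: (centroid Counter, list of member indices)
    for idx, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = cosine(vec, candidate[0])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            clusters.append((Counter(vec), [idx]))
        else:
            best[0].update(vec)
            best[1].append(idx)
    return [members for _, members in clusters]

# Toy corpus (invented for illustration) with two senses of "palm".
corpus = [
    "the palm tree grows near the beach",
    "coconuts fell from the palm tree on the beach",
    "he held the coin in the palm of his hand",
    "she read the lines on the palm of her hand",
]

local_vectors = local_context_vectors(corpus, "palm")
senses = cluster(local_vectors)         # groups of occurrence indices
mixed = global_vector(local_vectors)    # one vector mixing both senses
print(senses)
print(mixed.most_common(2))
```

In this toy setting the occurrences separate into a "tree" group and a "hand" group, whereas the global vector gives similar weight to words from both senses. On a real corpus the local vectors are far sparser, so some form of smoothing or dimensionality reduction would likely be needed before clustering.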

Reinhard Rapp
