Unsupervised Corpus-Based Methods for WSD

This chapter focuses on unsupervised corpus-based methods of word sense discrimination that are knowledge-lean: they do not rely on external knowledge sources such as machine-readable dictionaries, concept hierarchies, or sense-tagged text. These methods do not assign sense tags to words; rather, they discriminate among word meanings based on information found in unannotated corpora. The chapter reviews distributional approaches that rely on monolingual corpora, as well as methods based on translational equivalence as found in word-aligned parallel corpora. These techniques are organized into type- and token-based approaches. The former identify sets of related words, while the latter distinguish among the senses of a word used in multiple contexts.
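
As a rough illustration of the token-based idea, the sketch below groups occurrences of one target word by the similarity of their surrounding contexts. It is a minimal toy under stated assumptions, not the chapter's method: the stopword list, the bag-of-words representation, the average-link clustering, the function names (`context_vector`, `cosine`, `discriminate`), and the example sentences are all chosen here for illustration only.

```python
import math
from collections import Counter

# Hypothetical, tiny stopword list chosen for this illustration only.
STOPWORDS = {"the", "a", "of", "on", "in", "to", "and", "at", "he", "she", "we"}

def context_vector(sentence, target):
    """Bag-of-words vector of one context, without the target word and stopwords."""
    tokens = sentence.lower().split()
    return Counter(t for t in tokens if t != target and t not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def discriminate(contexts, target, k):
    """Average-link agglomerative clustering of contexts into k unlabeled groups."""
    vectors = [context_vector(c, target) for c in contexts]
    clusters = [[i] for i in range(len(contexts))]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the highest average pairwise similarity.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy contexts of the ambiguous word "bank" (invented for this example).
contexts = [
    "she deposited money at the bank",
    "the bank approved the loan and money transfer",
    "we fished on the river bank",
    "the river bank was muddy",
]
clusters = discriminate(contexts, "bank", k=2)
print(clusters)
```

The output is a partition of context indices with no sense labels attached, which reflects the distinction drawn above: the contexts are discriminated from one another using only unannotated text, and no sense tag from a dictionary or inventory is assigned.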
