One Sense Per Discourse for Synonym Detection

In this paper, we present a new methodology for synonym detection that combines global and local distributional similarities of word pairs. Evaluated on the noun space of the 50 multiple-choice synonym questions taken from the ESL test, the methodology reaches 91.30% accuracy using a conditional probabilistic model combined with the cosine similarity measure.
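To make the role of the cosine similarity measure concrete, the following is a minimal sketch of answering one multiple-choice synonym question from distributional context vectors. It is not the paper's model: the conditional probabilistic component and the global/local combination are not reproduced, and the function names (`context_vector`, `cosine`, `answer_question`), the window size, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def context_vector(word, sentences, window=2):
    """Count the words co-occurring with `word` within +/- `window` tokens."""
    vec = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vec[tokens[j]] += 1
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer_question(target, candidates, sentences):
    """Pick the candidate whose context vector is closest to the target's."""
    target_vec = context_vector(target, sentences)
    return max(
        candidates,
        key=lambda c: cosine(target_vec, context_vector(c, sentences)),
    )


# Toy corpus (hypothetical), already tokenized.
sentences = [
    "the rusty car stopped near the garage".split(),
    "the rusty automobile stopped near the garage".split(),
    "the ripe banana fell from the tree".split(),
]

print(answer_question("car", ["automobile", "banana"], sentences))  # → automobile
```

In this sketch the vectors are built from whatever corpus is passed in; building them once from a whole collection and once from a single document would be one way to obtain "global" and "local" similarities, but how the paper combines the two is not specified in the abstract.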
