Disambiguating Hypernym Relations for Roget's Thesaurus

Roget's Thesaurus is a lexical resource that groups terms by semantic relatedness. Its shortcoming is that the relations are ambiguous: the Thesaurus does not name them, and shows only that some relation holds between terms. Our work focuses on disambiguating hypernym relations within Roget's Thesaurus. This paper compares several techniques for identifying hypernym relations; in total, over 50,000 hypernym relations have been disambiguated within Roget's. Human judges have evaluated the quality of our disambiguation techniques, and we have demonstrated the usefulness of the disambiguated relations in several applications.
