Semantics-free, word-based information retrieval is thwarted by two complementary problems. First, search for relevant documents returns irrelevant items when all meanings of a search term are used, rather than just the meaning intended. This causes low precision. Second, relevant items are missed when they are indexed not under the actual search terms, but rather under related terms. This causes low recall. With semantics-free approaches there is generally no way to improve both precision and recall at the same time. Word sense disambiguation during document indexing should improve precision. We have investigated using the massive WordNet semantic network for disambiguation during indexing. With the unconstrained text of the SMART retrieval environment, we have had to derive our own content description from the input text, given only part-of-speech tagging of the input. We employ the notion of semantic distance between network nodes. Input text terms with multiple senses are disambiguated by finding the combination of senses from a set of contiguous terms which minimizes total pairwise distance between senses. Results so far have been encouraging. Improvement in disambiguation compared with chance is clear.
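The sense-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the network below is a tiny hand-made graph rather than WordNet, the sense labels (such as `bank#river`) are hypothetical, and semantic distance is assumed to be the number of edges on the shortest path between two nodes, which is a simplification chosen only for illustration.

```python
from collections import deque
from itertools import combinations, product

# Hypothetical toy network; these nodes and edges are NOT real WordNet data.
EDGES = [
    ("entity", "institution"),
    ("institution", "bank#finance"),
    ("entity", "natural_object"),
    ("natural_object", "shore"),
    ("shore", "bank#river"),
    ("natural_object", "water#body"),
    ("shore", "water#body"),
    ("entity", "substance"),
    ("substance", "water#liquid"),
]

# Hypothetical sense inventory: each input term maps to its candidate senses.
SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "shore": ["shore"],
    "water": ["water#body", "water#liquid"],
}


def build_graph(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, start, goal):
    """Shortest path length in edges (assumed distance measure); inf if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return float("inf")


def total_pairwise_distance(graph, senses):
    """Sum the distance over every unordered pair of chosen senses."""
    return sum(distance(graph, a, b) for a, b in combinations(senses, 2))


def disambiguate(graph, senses_by_term, window):
    """Try every combination of senses for the terms in the window and
    return the assignment with the smallest total pairwise distance."""
    candidates = product(*(senses_by_term[term] for term in window))
    best = min(candidates, key=lambda combo: total_pairwise_distance(graph, combo))
    return dict(zip(window, best)), total_pairwise_distance(graph, best)


GRAPH = build_graph(EDGES)
assignment, cost = disambiguate(GRAPH, SENSES, ["bank", "shore", "water"])
print(assignment, cost)
```

In this toy example the window "bank", "shore", "water" resolves to the river and body-of-water senses, because those nodes lie close together in the graph. The exhaustive search grows with the product of the number of senses per term, so the window of contiguous terms has to stay small for this approach to remain tractable.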