An Association Thesaurus for Information Retrieval

Although thesauri are commonly used in both commercial and experimental information retrieval systems, they have not demonstrated consistent benefits for retrieval performance, and constructing a thesaurus automatically for a large text database is difficult. This paper proposes an approach, called PhraseFinder, for automatically constructing collection-dependent association thesauri from large full-text document collections. The association thesaurus can be accessed through natural language queries in INQUERY, an information retrieval system based on the probabilistic inference network. Experiments are conducted in INQUERY to evaluate different types of association thesauri, as well as thesauri constructed for a variety of collections.
