Probabilistic Retrieval Incorporating the Relationships of Descriptors Incrementally

Previous probabilistic retrieval models assume that the relevance probability of a document is independent of descriptors that are not specified in the query. This assumption does not hold in practice, because many descriptors can represent the same concept. The probabilistic retrieval model developed in this paper drops this assumption and incorporates the relationships among descriptors. A learning method is also proposed to learn these relationships incrementally. Each time retrieval results become available, the method identifies, in the relevant documents, the descriptors that designate the concepts specified by the query descriptors. Although it uses user feedback, as relevance feedback does, it attempts to capture stable relationships among descriptors from many past user queries rather than to distinguish relevant documents from non-relevant ones for a particular query. We show through experiments that the proposed information retrieval method improves retrieval effectiveness over time.
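To make the incremental idea concrete, the following is a minimal sketch and not the model from the paper. It assumes a simple frequency estimate: the association between a query descriptor q and a document descriptor d is the fraction of relevant documents, over all past queries containing q, that carry d. The class name `DescriptorRelationLearner` and its methods `update`, `association`, and `score` are hypothetical names introduced only for illustration.

```python
from collections import defaultdict


class DescriptorRelationLearner:
    """Hypothetical incremental estimator of descriptor associations.

    Assumption (not taken from the paper): association(q, d) is the share of
    relevant documents, seen for past queries containing q, that carry d.
    """

    def __init__(self):
        # pair_counts[q][d]: relevant documents for queries with q that carry d
        self.pair_counts = defaultdict(lambda: defaultdict(int))
        # relevant_counts[q]: relevant documents seen for queries with q
        self.relevant_counts = defaultdict(int)

    def update(self, query_descriptors, relevant_docs):
        """Fold the feedback for one query into the running counts."""
        for doc in relevant_docs:
            doc_descriptors = set(doc)
            for q in set(query_descriptors):
                self.relevant_counts[q] += 1
                for d in doc_descriptors:
                    self.pair_counts[q][d] += 1

    def association(self, q, d):
        """Estimated strength with which d designates the concept of q."""
        if q == d:
            return 1.0
        seen = self.relevant_counts.get(q, 0)
        if seen == 0:
            return 0.0
        return self.pair_counts.get(q, {}).get(d, 0) / seen

    def score(self, query_descriptors, doc):
        """Sum, over query descriptors, of the best-matching document descriptor."""
        return sum(
            max((self.association(q, d) for d in doc), default=0.0)
            for q in query_descriptors
        )


learner = DescriptorRelationLearner()
before = learner.score(["car"], ["automobile"])

# Feedback for one query: two documents were judged relevant.
learner.update(["car"], [["automobile", "engine"], ["automobile"]])
after = learner.score(["car"], ["automobile"])
```

In this sketch a document indexed only under "automobile" scores zero for the query "car" before any feedback and scores higher afterwards, which mirrors the claim that effectiveness improves over time. A raw frequency estimate also rewards descriptors that are simply common, so a real system would need some normalization; that refinement is outside this illustration.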
