Representations, Models and Abstractions in Probabilistic Information Retrieval

We show that most approaches in probabilistic information retrieval can be regarded as a combination of three concepts: representation, model, and abstraction. First, documents and queries have to be represented in a certain form, e.g., as sets of terms. Probabilistic models then make certain assumptions about how the elements of the representation are distributed in relevant and nonrelevant documents in order to estimate the probability of relevance of a document with respect to a query. Older approaches based on query-specific relevance feedback are restricted to simple representations and models. By abstracting from specific documents, terms, and queries, more powerful approaches can be realized.
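To make the interplay of representation and model concrete, the following is a minimal sketch of one well-known instance of a query-specific relevance feedback approach: a binary independence model with Robertson/Sparck Jones term weights. It is an illustration only, not necessarily the formulation used in the paper. Documents are represented as sets of terms, and the model assumes that terms occur independently in relevant and nonrelevant documents. The function names, the 0.5 smoothing constants, and the toy data are assumptions introduced here.

```python
import math

def rsj_weight(r, R, n, N):
    """Relevance weight of one term, with 0.5 smoothing (an assumption).

    r: judged relevant documents containing the term
    R: judged relevant documents
    n: judged documents containing the term
    N: judged documents
    """
    p = (r + 0.5) / (R + 1.0)          # estimate of P(term present | relevant)
    q = (n - r + 0.5) / (N - R + 1.0)  # estimate of P(term present | nonrelevant)
    return math.log(p * (1 - q) / (q * (1 - p)))

def term_weights(query, judged):
    """Estimate a weight for each query term from relevance feedback.

    judged: list of (set_of_terms, is_relevant) pairs for this query.
    """
    N = len(judged)
    R = sum(1 for _, rel in judged if rel)
    weights = {}
    for t in query:
        n = sum(1 for doc, _ in judged if t in doc)
        r = sum(1 for doc, rel in judged if rel and t in doc)
        weights[t] = rsj_weight(r, R, n, N)
    return weights

def score(doc, weights):
    """Rank value of a document: sum of weights of query terms it contains."""
    return sum(w for t, w in weights.items() if t in doc)

# Hypothetical feedback data for a single query.
judged = [
    ({"probabilistic", "retrieval"}, True),
    ({"probabilistic", "indexing"}, True),
    ({"database", "retrieval"}, False),
    ({"database", "systems"}, False),
]
weights = term_weights({"probabilistic", "retrieval"}, judged)
ranking = sorted(
    [{"probabilistic", "models"}, {"retrieval", "systems"}],
    key=lambda d: score(d, weights),
    reverse=True,
)
```

Because the weights are estimated per query term from judgments for that one query, nothing learned here transfers to other queries or unseen terms. This is the kind of restriction that abstracting from specific documents, terms, and queries is meant to lift.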
