PIC matrices: a computationally tractable class of probabilistic query operators

The inference network model of information retrieval allows a probabilistic interpretation of query operators. In particular, Boolean query operators are conveniently modeled as link matrices of a Bayesian network. Prior work has shown, however, that these operators do not perform as well as the p-norm operators used for modeling query operators in the context of the vector space model. This motivates the search for alternative probabilistic formulations for these operators. The design of such alternatives must contend with the issue of computational tractability, since the evaluation of an arbitrary operator requires time exponential in the number of parent nodes. We define a flexible class of link matrices that are natural candidates for the implementation of query operators, and we give an O(n^2) algorithm (where n is the number of parent nodes) for the computation of probabilities involving link matrices of this class. We present experimental results indicating that Boolean operators implemented in terms of link matrices from this class perform as well as p-norm operators in the context of the INQUERY inference network.
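
The tractability gap described above can be illustrated with a minimal sketch. The abstract does not define the class, so the code below rests on two labeled assumptions: that parent nodes are treated as independent, and that each link-matrix entry depends only on how many parents are true. The function names and the `alpha` parameter are hypothetical, chosen for illustration, and this is not presented as the paper's algorithm. Under those assumptions, a dynamic program over the count of true parents gives the same belief as the exponential sum, in O(n^2) time.

```python
from itertools import product


def brute_force_belief(parent_probs, link):
    """Sum over all 2^n parent configurations (exponential time).

    link(config) returns P(child true | config) for a tuple of 0/1 states.
    Parents are ASSUMED independent with P(parent i true) = parent_probs[i].
    """
    total = 0.0
    for config in product([0, 1], repeat=len(parent_probs)):
        weight = 1.0
        for p, state in zip(parent_probs, config):
            weight *= p if state else 1.0 - p
        total += weight * link(config)
    return total


def count_based_belief(parent_probs, alpha):
    """O(n^2) evaluation for a count-dependent link matrix (an assumption).

    alpha[k] is P(child true | exactly k parents true), for k = 0..n.
    """
    n = len(parent_probs)
    # dist[k] = probability that exactly k of the parents seen so far are true
    dist = [1.0] + [0.0] * n
    for i, p in enumerate(parent_probs, start=1):
        # Update from high k to low k so dist[k - 1] is still the old value.
        for k in range(i, 0, -1):
            dist[k] = dist[k] * (1.0 - p) + dist[k - 1] * p
        dist[0] *= 1.0 - p
    return sum(a * d for a, d in zip(alpha, dist))


if __name__ == "__main__":
    probs = [0.9, 0.5, 0.2]
    and_alpha = [0.0, 0.0, 0.0, 1.0]  # true only when all three parents are true
    or_alpha = [0.0, 1.0, 1.0, 1.0]   # true when at least one parent is true
    print(count_based_belief(probs, and_alpha))
    print(count_based_belief(probs, or_alpha))
    print(brute_force_belief(probs, lambda c: or_alpha[sum(c)]))
```

In this sketch, strict Boolean AND and OR are the extreme choices of `alpha`; intermediate values give softer operators, which is one way to read "flexible class" here, though the paper's actual definition may differ.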
