A critical investigation of recall and precision as measures of retrieval system performance

Recall and precision are often used to evaluate the effectiveness of information retrieval systems. They are easy to define if there is a single query and if the retrieval result generated for the query is a linear ordering. However, when the retrieval results are weakly ordered, in the sense that several documents have an identical retrieval status value with respect to a query, some probabilistic notion of precision has to be introduced. Relevance probability, expected precision, and so forth, are some alternatives mentioned in the literature for this purpose. Furthermore, when many queries are to be evaluated and the retrieval results averaged over these queries, some method of interpolation of precision values at certain preselected recall levels is needed. The currently popular approaches for handling both a weak ordering and interpolation are found to be inconsistent, and the results obtained are not easy to interpret. Moreover, in cases where some alternatives are available, no comparative analysis that would facilitate the selection of a particular strategy has been provided. In this paper, we systematically investigate the various problems and issues associated with the use of recall and precision as measures of retrieval system performance. Our motivation is to provide a comparative analysis of methods available for defining precision in a probabilistic sense and to promote a better understanding of the various issues involved in retrieval performance evaluation.
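The single-query, linearly ordered case described above can be illustrated with a minimal sketch. It computes recall and precision at each rank where a relevant document appears, then interpolates precision at preselected recall levels. The interpolation rule used here (take the maximum precision observed at any recall at or above the level) is only one common convention, assumed for illustration; it is not necessarily a method the paper endorses, and the sketch does not handle weak orderings (ties). The function names and the sample ranking are hypothetical.

```python
from typing import List, Tuple


def precision_recall_points(
    ranked_relevance: List[bool], total_relevant: int
) -> List[Tuple[float, float]]:
    """Return (recall, precision) at each rank where a relevant document occurs.

    Assumes a strict linear ordering: no two documents share a rank.
    """
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / total_relevant, hits / rank))
    return points


def interpolated_precision(
    points: List[Tuple[float, float]], levels: List[float]
) -> List[float]:
    """Interpolate precision at each recall level.

    Assumed rule: the maximum precision among observed points whose recall
    is at or above the level, or 0.0 if no such point exists.
    """
    result = []
    for level in levels:
        candidates = [p for r, p in points if r >= level]
        result.append(max(candidates) if candidates else 0.0)
    return result


# Hypothetical ranking of five documents; True marks a relevant document.
ranking = [True, False, True, False, False]
points = precision_recall_points(ranking, total_relevant=2)

# Eleven preselected recall levels: 0.0, 0.1, ..., 1.0.
levels = [i / 10 for i in range(11)]
interpolated = interpolated_precision(points, levels)
```

Averaging over many queries would then amount to averaging the `interpolated` lists level by level. A different interpolation rule, or a different treatment of tied documents, would change these values, which is the kind of sensitivity the abstract draws attention to.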
