The measures precision, recall, fallout and miss as a function of the number of retrieved documents and their mutual interrelations

In this paper we present, for the first time, global curves for the measures precision, recall, fallout and miss as functions of the number of retrieved documents. Different curves apply to different retrieval systems, which we define exactly in terms of a retrieval density function: perverse retrieval, perfect retrieval, random retrieval and normal retrieval. This extends results of Buckland and Gey and of Egghe in the following sense: mathematically more advanced methods yield better insight into these curves, more types of retrieval are considered and, very importantly, the theory is developed for the "complete" set of measures: precision, recall, fallout and miss. Next we study the interrelationships between precision, recall, fallout and miss for these types of retrieval, again extending results of Buckland and Gey (including a correction) and of Egghe. For normal retrieval we prove that precision as a function of recall and recall as a function of miss are concavely decreasing relationships, while recall as a function of fallout is a concavely increasing relationship. We also show, by producing examples, that the relationships between fallout and precision, between miss and precision, and between miss and fallout are not always convex or concave.
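As a rough illustration of what "measures as a function of the number of retrieved documents" means, the following sketch computes the four measures at every cutoff of a ranked list. It assumes the usual set-based definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / relevant, fallout = non-relevant retrieved / non-relevant, miss = relevant non-retrieved / non-retrieved), which may differ in detail from the continuous retrieval density formulation used in the paper. The function name `prfm_at_cutoffs` and the toy rankings are hypothetical and serve only as discrete analogues of perfect and perverse retrieval.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, float, float, Optional[float], Optional[float]]


def prfm_at_cutoffs(ranked_relevance: List[bool]) -> List[Row]:
    """Return (n, precision, recall, fallout, miss) for each cutoff n.

    ranked_relevance[i] is True if the document at rank i + 1 is relevant.
    Assumption: standard set-based definitions, not the paper's
    density-based ones. Undefined ratios are reported as None.
    """
    total = len(ranked_relevance)
    total_rel = sum(ranked_relevance)
    total_nonrel = total - total_rel
    if total_rel == 0:
        raise ValueError("recall is undefined without relevant documents")

    rows: List[Row] = []
    hits = 0  # relevant documents among the first n retrieved
    for n, is_rel in enumerate(ranked_relevance, start=1):
        hits += int(is_rel)
        precision = hits / n
        recall = hits / total_rel
        # Fallout is undefined if the collection has no non-relevant documents.
        fallout = (n - hits) / total_nonrel if total_nonrel else None
        # Miss is undefined once everything has been retrieved.
        miss = (total_rel - hits) / (total - n) if n < total else None
        rows.append((n, precision, recall, fallout, miss))
    return rows


if __name__ == "__main__":
    # Toy analogues: all relevant documents ranked first vs. ranked last.
    perfect = [True, True, False, False]
    perverse = [False, False, True, True]
    for name, ranking in (("perfect", perfect), ("perverse", perverse)):
        print(name)
        for row in prfm_at_cutoffs(ranking):
            print("  n=%d P=%s R=%s F=%s M=%s" % row)
```

Plotting the columns of the returned rows against `n` gives discrete counterparts of the global curves discussed above; plotting one measure against another gives counterparts of the interrelationship curves.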

[1] Leo Egghe, et al. A Theoretical Study of Recall and Precision Using a Topological Approach to Information Retrieval, 1998, Inf. Process. Manag.

[2] Jean Tague-Sutcliffe, et al. Measuring information: an information services perspective, 1995.

[3] Robert A. Adams, et al. Calculus: A Complete Course, 1994.

[4] T. Landauer, et al. Indexing by Latent Semantic Analysis, 1990.

[5] H. S. Heaps, et al. Information retrieval, computational and theoretical aspects, 1978.

[6] Friedrich Gebhardt, et al. A simple probabilistic model for the relevance assessment of documents, 1975, Inf. Process. Manag.

[7] 正好 長谷川. Information Processing and Management: [8] Patent Information, 1984.

[8] Robert M. Losee. Text retrieval and filtering: analytic models of performance, 1998.

[9] Leo Egghe. A universal method of information retrieval evaluation: the "missing" link M and the universal IR surface, 2004, Inf. Process. Manag.

[10] Fredric C. Gey, et al. The relationship between recall and precision, 1994.

[11] Leo Egghe. Existence theorem of the quadruple (P, R, F, M): Precision, recall, fallout and miss, 2007, Inf. Process. Manag.

[12] Ophir Frieder, et al. Information Retrieval: Algorithms and Heuristics (The Kluwer International Series on Information Retrieval), 2004.

[13] Davis B. McCarn, et al. A mathematical model of retrieval system performance, 1990, J. Am. Soc. Inf. Sci.

[14] Stephen E. Robertson, et al. Explicit and implicit variables in information retrieval (IR) systems, 1975, J. Am. Soc. Inf. Sci.

[15] Michael D. Gordon, et al. Recall-precision trade-off: A derivation, 1989, JASIS.

[16] Leo Egghe. Qualitative analysis of the recall-precision relationship in information retrieval, 1992.

[17] R. Goodstein, et al. Differential and Integral Calculus, 1947.

[18] Cheng-Shang Chang. Calculus, 2020, Bicycle or Unicycle?

[19] Donald H. Kraft, et al. Measurement in Information Science, 1994.

[20] Jacob Shapiro, et al. Automated information retrieval - theory and methods, 1997, Library and information science series.

[21] C. Cleverdon. On the Inverse Relationship of Recall and Precision, 1972.

[22] Alexander Dekhtyar, et al. Information Retrieval, 2018, Lecture Notes in Computer Science.

[23] Robert M. Losee, et al. Text Retrieval and Filtering, 1998, The Information Retrieval Series.

[24] Penelope A. Yates-Mercer, et al. Relational Indexing Applied to the Selective Dissemination of Information, 1976.

[25] Michael McGill, et al. Introduction to Modern Information Retrieval, 1983.

[26] Ophir Frieder, et al. Information Retrieval: Algorithms and Heuristics, 1998.