Exact Expected Average Precision of the Random Baseline for System Evaluation

Abstract: Average precision (AP) is one of the most widely used metrics in information retrieval and natural language processing research. The expected AP of a system that ranks documents randomly is usually assumed to equal the proportion of relevant documents in the collection. This paper shows that this value is only an approximation and provides a procedure for efficiently computing the exact value. An analysis of the difference between the approximate and exact values shows that the discrepancy is large when the collection contains few documents but becomes very small once it contains at least 600 documents.
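The gap between the approximate and exact baselines can be checked on small collections. The following Python sketch is illustrative only and is not necessarily the procedure given in the paper. It assumes the standard definition of AP (the mean of precision at the rank of each relevant document) and a uniformly random ranking. The function names are hypothetical. The closed form is an assumption derived here by linearity of expectation, E[AP] = (1/N) [H_N + ((R-1)/(N-1)) (N - H_N)], where N is the number of documents, R the number of relevant documents, and H_N the N-th harmonic number; the brute-force enumeration serves as a cross-check.

```python
from fractions import Fraction
from itertools import combinations


def average_precision(relevant_ranks, n_relevant):
    """AP of one ranking, given the 1-based ranks of its relevant documents."""
    # Precision at the rank of the i-th relevant document is i / rank.
    hits = enumerate(sorted(relevant_ranks), start=1)
    return sum(Fraction(i, rank) for i, rank in hits) / n_relevant


def expected_ap_bruteforce(n_docs, n_relevant):
    """Exact expected AP by enumerating every placement of the relevant docs.

    Only feasible for small collections: there are C(n_docs, n_relevant) cases.
    """
    placements = list(combinations(range(1, n_docs + 1), n_relevant))
    total = sum(average_precision(p, n_relevant) for p in placements)
    return total / len(placements)


def expected_ap_closed_form(n_docs, n_relevant):
    """Closed form derived by linearity of expectation (an assumption of this
    sketch, not a formula quoted from the paper)."""
    if n_docs == 1:
        return Fraction(1)
    harmonic = sum(Fraction(1, k) for k in range(1, n_docs + 1))
    ratio = Fraction(n_relevant - 1, n_docs - 1)
    return (harmonic + ratio * (n_docs - harmonic)) / n_docs


# Two documents, one relevant: the proportion of relevant documents is 1/2,
# but the exact expected AP is 3/4.
print(expected_ap_bruteforce(2, 1))   # 3/4
print(expected_ap_closed_form(2, 1))  # 3/4
```

With two documents and one relevant, the relevant document lands at rank 1 (AP = 1) or rank 2 (AP = 1/2) with equal probability, so the exact expectation is 3/4 rather than 1/2. Exact rational arithmetic via `Fraction` keeps the comparison between the two functions free of floating-point error.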
