Rank-Biased Precision Reloaded: Reproducibility and Generalization

In this work we reproduce the experiments presented in the paper “Rank-Biased Precision for Measurement of Retrieval Effectiveness”. That paper introduced a new effectiveness measure, Rank-Biased Precision (RBP), which has become a reference point in experimental evaluation for information retrieval (IR).
