A similarity measure for indefinite rankings

Ranked lists are encountered in research and in daily life, and it is often of interest to compare them even when they are incomplete or have only some members in common. An example is the document rankings returned for the same query by different search engines. A measure of the similarity between incomplete rankings should handle nonconjointness, weight high ranks more heavily than low ranks, and be monotonic with increasing depth of evaluation, but no measure satisfying all of these criteria currently exists. In this article, we propose a new measure that has these qualities, namely rank-biased overlap (RBO). The RBO measure is based on a simple probabilistic user model. It provides monotonicity by calculating, at a given depth of evaluation, a base score that is nondecreasing with additional evaluation and a maximum score that is nonincreasing. An extrapolated score can be calculated between these bounds if a point estimate is required. RBO has a parameter that determines how strongly the top ranks are weighted. We extend RBO to handle tied ranks and rankings of different lengths. Finally, we give examples of the use of the measure in comparing the results produced by public search engines and in assessing retrieval systems in the laboratory.
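As an illustration of the bounding idea, the following minimal Python sketch assumes the commonly stated form of RBO as a geometrically weighted sum of prefix agreements, (1 - p) * sum over d of p^(d-1) * A_d, where A_d is the fraction of items shared by the two top-d prefixes. The function name `rbo_bounds` and its arguments are hypothetical. The bounds it returns are deliberately simple ones (the truncated sum, and the truncated sum plus the unseen tail weight p^k); they are not claimed to be the base and maximum scores derived in the article, and ties and uneven list lengths are not handled.

```python
def rbo_bounds(s, t, p=0.9, depth=None):
    """Return (lower, upper) bounds on RBO after evaluating a shared prefix.

    Assumptions of this sketch: neither list contains duplicates, and both
    lists are evaluated to the same depth k (the shorter length by default).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    k = min(len(s), len(t))
    if depth is not None:
        k = min(k, depth)

    seen_s, seen_t = set(), set()
    overlap = 0        # size of the intersection of the two depth-d prefixes
    weighted = 0.0     # running sum of p^(d-1) * A_d
    for d in range(1, k + 1):
        x, y = s[d - 1], t[d - 1]
        if x == y:
            overlap += 1
        else:
            # x is newly shared if it already appeared in t, and vice versa.
            if x in seen_t:
                overlap += 1
            if y in seen_s:
                overlap += 1
        seen_s.add(x)
        seen_t.add(y)
        weighted += p ** (d - 1) * overlap / d

    lower = (1.0 - p) * weighted   # every unseen term is >= 0
    upper = lower + p ** k         # every unseen agreement is <= 1
    return lower, upper


if __name__ == "__main__":
    a = ["a", "b", "c", "d"]
    b = ["b", "a", "e", "c"]
    for depth in range(1, 5):
        print(depth, rbo_bounds(a, b, p=0.5, depth=depth))
```

In this sketch, a smaller `p` concentrates the weight on the top ranks, and evaluating more deeply can only raise the lower bound and lower the upper bound, which mirrors the monotonicity property described in the abstract.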
