Performance prediction using spatial autocorrelation

Evaluating retrieval systems is a core task in information retrieval. Problems include the inability to exhaustively label all documents for a topic, generalizing from a small number of topics, and incorporating the variability of retrieval systems. Previous work addresses the evaluation of systems, the ranking of queries by difficulty, and the ranking of individual retrievals by performance. Approaches exist for the case of few and even no relevance judgments. Our focus is on zero-judgment performance prediction of individual retrievals. One common shortcoming of previous techniques is the assumption of uncorrelated document scores and judgments. If documents are embedded in a high-dimensional space (as they often are), we can apply techniques from spatial data analysis to detect correlations between document scores. We find that low correlation between the scores of topically close documents often implies poor retrieval performance. When compared to a state-of-the-art baseline, we demonstrate that the spatial analysis of retrieval scores provides significantly better prediction performance. These new predictors can also be combined with classic predictors to improve performance further. We also describe the first large-scale experiment to evaluate zero-judgment performance prediction for a massive number of retrieval systems over a variety of collections in several languages.
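The core idea above can be sketched in code. The following is a minimal, hypothetical illustration (not the paper's exact estimator): it links each document to its nearest neighbors by cosine similarity and measures how well each retrieval score agrees with the similarity-weighted average score of its neighborhood, in the spirit of a Moran's-I-style spatial autocorrelation statistic. All names and parameters (`k`, `score_autocorrelation`) are assumptions for illustration.

```python
import numpy as np

def score_autocorrelation(doc_vectors, scores, k=5):
    """Spatial-autocorrelation proxy for retrieval scores.

    Builds a k-nearest-neighbor affinity matrix over documents by
    cosine similarity, then returns the cosine between the centered
    score vector and its neighborhood-smoothed version.  A low value
    means topically close documents receive dissimilar scores, which
    the abstract associates with poor retrieval performance.
    """
    # unit-normalize document vectors so X @ X.T gives cosine similarity
    X = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    sim = X @ X.T
    np.fill_diagonal(sim, 0.0)  # a document is not its own neighbor

    # keep only the k strongest links per document
    W = np.zeros_like(sim)
    for i, row in enumerate(sim):
        nn = np.argsort(row)[-k:]
        W[i, nn] = row[nn]
    W /= W.sum(axis=1, keepdims=True)  # row-normalize the affinities

    y = scores - scores.mean()   # center the retrieval scores
    y_tilde = W @ y              # neighborhood-averaged scores
    return float(y @ y_tilde / (np.linalg.norm(y) * np.linalg.norm(y_tilde)))
```

Under this sketch, a ranking whose scores vary smoothly over the document-similarity graph yields a higher value than the same scores randomly permuted over the same documents, which is the zero-judgment signal the abstract exploits.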
