Towards Stochastic Simulations of Relevance Profiles

Recently proposed methods can generate simulated scores for an effectiveness metric, but they do not address the generation of the actual lists of retrieved documents. In this paper, we address this limitation: we present an approach that uses an evolutionary algorithm to create, given a metric score, a simulated relevance profile (i.e., a ranked list of relevance values) that produces that score. We show that the simulated relevance profiles are realistic under various analyses.
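To make the idea concrete, the following is a minimal sketch of how an evolutionary search could produce a relevance profile for a given target score. It is not the algorithm used in the paper. It assumes binary relevance, a fixed profile length, a simplified average precision normalized by the number of relevant documents in the list, and a mutation-only scheme with elitist survivor selection. All function names and parameters (`average_precision`, `mutate`, `evolve_profile`, `pop_size`, and so on) are hypothetical.

```python
import random


def average_precision(profile):
    """Simplified AP: mean of precision@k over the relevant ranks in the list.

    Assumption: normalizes by the relevant documents present in the profile,
    not by the total number of relevant documents in a collection.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(profile, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mutate(profile, rng):
    """Return a copy of the profile with one randomly chosen position flipped."""
    child = list(profile)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child


def evolve_profile(target, length=20, pop_size=30, generations=300, seed=0):
    """Search for a binary relevance profile whose score is close to target."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]

    def error(profile):
        # Fitness to minimize: distance between achieved and target score.
        return abs(average_precision(profile) - target)

    for _ in range(generations):
        population.sort(key=error)
        if error(population[0]) < 1e-3:
            break
        # Elitism: keep the better half, refill with mutated survivors.
        survivors = population[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children

    return min(population, key=error)
```

Calling `evolve_profile(0.5)` returns a list of 0/1 relevance values whose simplified AP approximates 0.5. Matching the score alone does not make a profile realistic, which is why the analyses mentioned in the abstract are needed in addition to the score constraint.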
