Paper information - Reproducibility and Validity in CLEF

Reproducibility and Validity in CLEF

In this paper, we investigate CLEF’s contribution to the reproducibility of IR experiments. After discussing the concepts of reproducibility and validity, we show that CLEF has not only produced test collections that other researchers can reuse, but has also undertaken various efforts to enable reproducibility.

Norbert Fuhr
