Overview of CENTRE@CLEF 2018: A First Tale in the Systematic Reproducibility Realm

Reproducibility has become increasingly important in many research areas, and Information Retrieval (IR) is no exception: the community has started to address reproducibility and its impact on research results. This paper describes our first attempt to propose a lab on reproducibility, named CENTRE, held at CLEF 2018. The aim of CENTRE is to run a reproducibility challenge across all the major IR evaluation campaigns and to provide the IR community with a venue where previous research results can be explored and discussed. This paper reports the participants' results and preliminary considerations on the first edition of CENTRE@CLEF 2018, as well as some suggestions for future editions.
