An Innovative Approach to Data Management and Curation of Experimental Data Generated Through IR Test Collections

This paper describes the steps that led to the invention, design and development of the Distributed Information Retrieval Evaluation Campaign Tool (DIRECT), a system for managing and accessing the data used and produced in experimental evaluation in Information Retrieval (IR). We present the context in which DIRECT was conceived, its conceptual model, and its extension to publish the data on the Web as Linked Open Data (LOD), which enables and enhances their enrichment, discoverability and re-use. Finally, we discuss possible further evolutions of the system.
