Paper information - Infrastructure support for evaluation as a service

How do we conduct large-scale community-wide evaluations for information retrieval if we are unable to distribute the document collection? This was the challenge we faced in organizing a task on searching tweets at the Text Retrieval Conference (TREC), since Twitter's terms of service forbid redistribution of tweets. Our solution, which we call "evaluation as a service", was to provide an API through which the collection can be accessed for completing the evaluation task. This paper describes the infrastructure underlying the service and its deployment at TREC 2013. We discuss the merits of the approach and potential applicability to other evaluation scenarios.

Jimmy J. Lin | Miles Efron
