Evaluating Search and Hyperlinking: An Example of the Design, Test, Refine Cycle for Metric Development

Designing meaningful metrics for evaluating MediaEval tasks that capture multiple aspects of system effectiveness and user satisfaction is far from straightforward. A considerable part of the effort in organising such a task must often be devoted to selecting, designing, or refining a suitable evaluation metric. We review evaluation metrics from the MediaEval Search and Hyperlinking task, illustrating the motivation behind the metrics proposed for the task, and how reflection on the results has led to iterative metric refinement in subsequent campaigns.