Evaluating Search and Hyperlinking: An Example of the Design, Test, Refine Cycle for Metric Development

Designing meaningful metrics for evaluating MediaEval tasks that capture multiple aspects of system effectiveness and user satisfaction is far from straightforward. A considerable part of the effort in organising such a task must often be devoted to selecting, designing, or refining a suitable evaluation metric. We review evaluation metrics from the MediaEval Search and Hyperlinking task, illustrating the motivation behind the metrics proposed for the task, and how reflection on the results has led to iterative metric refinement in subsequent campaigns.