Pursuing a Moving Target: Iterative Use of Benchmarking of a Task to Understand the Task

Individual tasks carried out within benchmarking initiatives, or campaigns, enable direct comparison of alternative approaches to tackling shared research challenges, and ideally promote new research ideas and foster communities of researchers interested in common or related scientific topics. When a task has a clear predefined use case, it can straightforwardly adopt a well-established framework and methodology; for example, an ad hoc information retrieval task can adopt the standard Cranfield paradigm. On the other hand, in the case of new and emerging tasks which pose more complex challenges in terms of usage scenarios or dataset design, the development of a new task is far from a straightforward process. This letter summarises our reflections on our experiences as task organisers of the Search and Hyperlinking task, from its origins as a Brave New Task at the MediaEval benchmarking campaign (2011-2014) to its current instantiation as a task at the NIST TRECVid benchmark (since 2015). We highlight the challenges encountered in the development of the task over a number of annual iterations, the solutions found so far, and our process for maintaining a vision for the ongoing advancement of the task's ambition.