Overview of the NTCIR-15 We Want Web with CENTRE (WWW-3) Task

This paper provides an overview of the NTCIR-15 We Want Web with CENTRE (WWW-3) task. The task features a Chinese subtask (ad hoc web search) and an English subtask (ad hoc web search, plus replicability and reproducibility), and received 48 runs from 9 teams. We describe the subtasks, the data, the evaluation measures, and the official evaluation results.
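The abstract does not list the official evaluation measures, but ad hoc web search tasks of this kind are typically scored with graded-relevance measures over pooled judgments. As an illustration only, the following minimal sketch computes nDCG@k with a linear gain function; the cutoff k, the gain convention, and the per-topic averaging step are assumptions, not details taken from the task definition.

```python
import math

def ndcg(run_relevances, all_relevances, k):
    """nDCG@k for one topic, using linear gains (one common convention).

    run_relevances: graded relevance of the documents in the ranked run, in rank order.
    all_relevances: every known relevance grade for the topic, used to build the ideal ranking.
    """
    def dcg(gains):
        # Discount each gain by log2(rank + 1), with ranks starting at 1.
        return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

    ideal_dcg = dcg(sorted(all_relevances, reverse=True))
    return dcg(run_relevances) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical example: a run that places the highly relevant document first.
print(ndcg([2, 0, 1], [2, 1, 0, 0], k=3))  # ~0.95
```

In practice such scores are computed per topic and then averaged over the topic set before runs are compared.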
