Overview of the NTCIR-16 We Want Web with CENTRE (WWW-4) Task

This is an overview of the NTCIR-16 We Want Web with CENTRE (WWW-4) task, the fourth round of an evaluation series that aims to quantify the progress and reproducibility of web search algorithms in offline ad hoc retrieval settings. For WWW-4, we introduced a new English web corpus, which we named Chuweb21. Moreover, in addition to bronze relevance assessments (i.e., those given by assessors who are neither topic creators nor topic experts), we collected gold relevance assessments (i.e., those given by the topic creators themselves). We received 18 runs from 4 teams, including two runs from the organiser team. We describe the task, data, and evaluation measures, and report the official evaluation results.
