On the Evaluation of Snippet Selection for Information Retrieval

In this paper we take a critical look at the evaluation method of WebCLEF 2007. The suitability of the evaluation method can be viewed from two perspectives, namely that of a participating system and that of a non-participating system. A participant has the advantage that the evaluation is partly based on its own output. We investigate whether the size of the snippet pool, the implementation of the evaluation method, and the quality of the assessments are sufficient for reliable evaluation. Unfortunately, we have to conclude that the evaluation is not suitable. We therefore discuss several alternative evaluation methods and conclude with a recommendation for improving the evaluation of WebCLEF.