The test collection for navigational retrieval on WWW data - Design and characteristics

This paper describes the design and characteristics of a test collection for navigational retrieval of WWW data that was built through the WEB Task of the Fourth NTCIR Workshop to evaluate the retrieval effectiveness of Web search systems. This reusable test collection consists of 100 gigabytes of Web document data and 300 topics of various types and corresponding relevance judgments. Among the several types of ‘Navigational Retrieval,’ we selected the ‘Known Item Search,’ which simulates a situation where a user searches for one or a few ‘representative Web pages’ of a known item. It is assumed that the user knows about the item but may not have seen its Web page. Relevance judgments were performed on the probable documents mainly from the viewpoint of representativeness of respective known items represented by the topics. Using the judgment results, several evaluation measures were applied to various retrieval results. Based on the evaluation results, relationships among the types of topics, Web-page styles and search methods are discussed. The stability of the evaluation results with different numbers of topics is also analyzed.

Keizo Oyama | Akiko Aizawa | Koji Eguchi | Haruko Ishikawa
