Paper Information - Overview of the Topical Classification Task at NTCIR-4 WEB

This paper gives an overview of the Topical Classification Task, which was conducted from 2003 to 2004 as one of the pilot experiments of the WEB Task at the Fourth NTCIR Workshop (‘NTCIR-4 WEB’). In this task, we attempted to assess, from the viewpoint of topical relevance, the effectiveness of automatic classification systems applied to documents retrieved by Web search engines. Here we use “topical classification” as a general term, so various techniques, such as text categorization or document clustering, can be used to create a classification of the documents. The target data set for the classification task comprised ranked lists of search result documents drawn from a 100-gigabyte document collection gathered mainly from the ‘.jp’ domain. We evaluated the automatic classification systems on the basis of the information retrieval task, applying several evaluation measures that are often used in information retrieval evaluation. We also proposed new evaluation measures that take the number of classes into account.

Koji Eguchi
