Paper Information - Search Result Clustering Using Informatively Named Entities

Search Result Clustering Using Informatively Named Entities

Clustering search results helps the user review the information gathered. In this article, we treat the clustering task as indexing the search results, where an index is a structured list of labels that makes both the labels and the search results easier for the user to comprehend. To realize this goal, we make three proposals. The first is to use named entity extraction for term extraction. The second is a new label selection criterion based on a term's importance in the search results and its relation to the search query. The third is label categorization using the category information that named entity extraction assigns to each label. We implement a prototype system based on these proposals and find that it performs much better than existing methods. We focus on news articles in this article, but the system is not topic specific.
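
The three proposals can be illustrated with a minimal sketch. It is not the authors' system: the abstract gives no formulas, so the scoring rule (document-frequency share boosted by co-occurrence with the query), the function name `build_label_index`, the input layout, and the category names are all assumptions made for illustration. Named entities are assumed to have been extracted already by some external tagger.

```python
from collections import defaultdict


def build_label_index(results, query_terms, top_k=3):
    """Group candidate labels (named entities) by category, best first.

    results: list of dicts with "text" (str) and "entities"
             (list of (surface, category) pairs from an assumed NE tagger).
    Returns {category: [(label, score, [doc indices]), ...]}.
    """
    query = {t.lower() for t in query_terms}
    docs_with = defaultdict(list)      # label -> indices of results containing it
    query_hits = defaultdict(int)      # label -> results that also contain the query
    category_of = {}

    for i, doc in enumerate(results):
        has_query = bool(query & set(doc["text"].lower().split()))
        for surface, category in set(doc["entities"]):
            if surface.lower() in query:
                continue               # the query itself is not an informative label
            docs_with[surface].append(i)
            category_of[surface] = category
            if has_query:
                query_hits[surface] += 1

    n = len(results)
    index = defaultdict(list)
    for label, doc_ids in docs_with.items():
        importance = len(doc_ids) / n                   # share of results with the label
        relation = query_hits[label] / len(doc_ids)     # co-occurrence with the query
        score = importance * (1.0 + relation)           # assumed combination rule
        index[category_of[label]].append((label, score, doc_ids))

    # Keep the top_k labels per category; ties are broken alphabetically.
    return {
        cat: sorted(entries, key=lambda e: (-e[1], e[0]))[:top_k]
        for cat, entries in index.items()
    }


# Hypothetical search results for the query "toyota".
sample_results = [
    {"text": "Toyota opens plant in Texas",
     "entities": [("Toyota", "ORGANIZATION"), ("Texas", "LOCATION")]},
    {"text": "Toyota recalls cars in Japan",
     "entities": [("Toyota", "ORGANIZATION"), ("Japan", "LOCATION")]},
    {"text": "Honda and Toyota expand in Texas",
     "entities": [("Honda", "ORGANIZATION"), ("Toyota", "ORGANIZATION"),
                  ("Texas", "LOCATION")]},
    {"text": "Honda expands in Ohio",
     "entities": [("Honda", "ORGANIZATION"), ("Ohio", "LOCATION")]},
]
label_index = build_label_index(sample_results, ["toyota"], top_k=2)
```

Grouping the selected labels by entity category is what turns a flat label list into the kind of structured index the abstract describes. Any real system would replace the assumed scoring rule with the paper's own criterion.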

Ryoji Kataoka | Hiroyuki Toda | Masahiro Oku

[1] Wei-Ying Ma, et al. Learning to cluster web search results, 2004, SIGIR '04.

[2] Shourya Roy, et al. A hierarchical monothetic document clustering algorithm for summarization and browsing search results, 2004, WWW '04.

[3] William W. Cohen, et al. Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text, 2005, HLT.

[4] Tommi S. Jaakkola, et al. Using term informativeness for named entity detection, 2005, SIGIR '05.

[5] Kentaro Torisawa, et al. Extracting Hyponyms of Prespecified Hypernyms from Itemizations and Headings in Web Documents, 2004, COLING.

[6] Anton Leuski, et al. Evaluating document clustering for interactive information retrieval, 2001, CIKM '01.

[7] Hideki Isozaki, et al. Efficient Support Vector Classifiers for Named Entity Recognition, 2002, COLING.

[8] Nigel Collier, et al. Introduction to the Bio-entity Recognition Task at JNLPBA, 2004, NLPBA/BioNLP.

[9] Ralph Grishman, et al. Message Understanding Conference-6: A Brief History, 1996, COLING.

[10] Yiming Yang, et al. Topic-conditioned novelty detection, 2002, KDD.

[11] Koji Eguchi. Overview of the Topical Classification Task at NTCIR-4 WEB, 2004, NTCIR.