Paper Information - Wikipedia-based Unsupervised Query Classification

Wikipedia-based Unsupervised Query Classification

In this paper we present an unsupervised approach to Query Classification. The approach uses the Wikipedia encyclopedia as a corpus and exploits the statistical distribution of terms from both the category labels and the query in order to select an appropriate category. We have created a classifier that works with 55 categories extracted from the search section of the Bridgeman Art Library website. We have also evaluated our approach using the labeled data of the KDD-Cup 2005 Knowledge Discovery and Data Mining competition (800,000 real user queries to be classified into 67 target categories) and obtained promising results.
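The abstract does not specify the scoring function, so the following is only a minimal sketch of the general idea: pick the category whose label terms co-occur most often with the query terms in a reference corpus. Everything here is an assumption made for illustration: the four-sentence `TOY_CORPUS` stands in for Wikipedia, the two category labels are hypothetical rather than taken from the Bridgeman Art Library, and the conditional co-occurrence ratio is a simple stand-in, not the paper's actual statistic.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(query, categories, corpus):
    """Return (best_category, scores) for a query.

    Score of a category = fraction of the documents mentioning a
    category-label term that also mention at least one query term.
    This is an illustrative co-occurrence estimate, not the paper's formula.
    """
    docs = [set(tokenize(doc)) for doc in corpus]
    query_terms = set(tokenize(query))
    scores = {}
    for category in categories:
        label_terms = set(tokenize(category))
        # Documents that mention the category label at all.
        with_label = [d for d in docs if d & label_terms]
        # Of those, documents that also mention a query term.
        with_both = [d for d in with_label if d & query_terms]
        scores[category] = len(with_both) / len(with_label) if with_label else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical stand-in for Wikipedia article text.
TOY_CORPUS = [
    "Claude Monet was a French painter and a founder of impressionist painting",
    "The Louvre museum in Paris holds paintings and sculpture",
    "Michelangelo carved the marble sculpture David in Florence",
    "Rodin was a French sculptor known for bronze sculpture such as The Thinker",
]

# Hypothetical category labels, chosen only for this example.
CATEGORIES = ["painting", "sculpture"]

if __name__ == "__main__":
    for q in ["monet impressionist", "rodin bronze"]:
        best, scores = classify(q, CATEGORIES, TOY_CORPUS)
        print(q, "->", best, scores)
```

No labeled training queries are needed in this sketch: the only inputs are the category labels and the corpus, which is what makes the setting unsupervised. A realistic version would need stemming, stop-word handling, and a far larger corpus.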

Luca Dini | Alessio Bosca | Milen Kouylekov | Marco Trevisan
