HITSCIR System in NTCIR-9 Subtopic Mining Task

Web queries often carry multiple user intents. Automatically identifying these intents benefits search result navigation, search result diversification, and personalized search. This paper presents the HITSCIR system for the NTCIR-9 Subtopic Mining task. First, the system collects query intent candidates from multiple resources. Second, it clusters these candidates with the Affinity Propagation algorithm, which determines the number of clusters automatically and selects a representative intent candidate, called an exemplar, for each cluster. Prior preferences and heuristic pairwise preferences can be incorporated into the clustering framework. Finally, the exemplars are ranked by their own quality and by the popularity of the clusters they represent. The NTCIR-9 evaluation results show that our system effectively mines query intents with good relevance, diversity, and readability.
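
The clustering and ranking steps can be illustrated with a minimal sketch. It is not the HITSCIR implementation: the token-overlap (Jaccard) similarity, the median-based preference, the optional per-candidate prior, the quality scores, and the quality-times-cluster-size ranking score are all assumptions made for illustration, as are the function names and the example candidates. The message-passing updates follow the standard Affinity Propagation formulation.

```python
from statistics import median


def jaccard(a, b):
    """Token-overlap similarity between two candidate strings (assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def build_similarity(candidates, prior=None):
    """Build the similarity matrix; the diagonal holds each candidate's preference.

    Assumption: preference = median pairwise similarity + optional prior[k].
    """
    n = len(candidates)
    S = [[jaccard(candidates[i], candidates[j]) for j in range(n)] for i in range(n)]
    base = median(S[i][j] for i in range(n) for j in range(n) if i != j)
    for k in range(n):
        S[k][k] = base + (prior[k] if prior else 0.0)
    return S


def affinity_propagation(S, damping=0.7, iters=200):
    """Cluster n >= 2 items from an n x n similarity matrix S.

    Returns (exemplars, labels), where labels[i] is the exemplar index of item i.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        # Responsibility: how well k suits i compared with i's best alternative.
        for i in range(n):
            for k in range(n):
                best_other = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # Availability: accumulated positive support for k from the other items.
        for k in range(n):
            pos = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [R[k][k] + A[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    if not exemplars:
        # Simplification: fall back to the single best-scoring item.
        exemplars = [max(range(n), key=score.__getitem__)]
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


def rank_exemplars(labels, quality):
    """Rank exemplars by own quality times cluster size (assumed combination)."""
    size = {}
    for lab in labels:
        size[lab] = size.get(lab, 0) + 1
    return sorted(size, key=lambda k: quality[k] * size[k], reverse=True)


# Hypothetical intent candidates for one query, with made-up quality scores.
candidates = [
    "harry potter movie cast",
    "harry potter movie trailer",
    "harry potter movie",
    "harry potter book order",
    "harry potter book series",
    "harry potter book",
]
quality = [0.6, 0.5, 0.9, 0.4, 0.7, 0.8]

S = build_similarity(candidates)
exemplars, labels = affinity_propagation(S)
for k in rank_exemplars(labels, quality):
    members = [candidates[i] for i, lab in enumerate(labels) if lab == k]
    print(candidates[k], "<-", members)
```

In this sketch a prior preference is expressed by raising a candidate's diagonal entry through `prior`, which makes that candidate more likely to become an exemplar. Pairwise preferences are not implemented here; one plausible way to encode them would be to adjust the corresponding off-diagonal similarities before clustering.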