Subtopic Mining via Modifier Graph Clustering

Understanding the information need encoded in a user query has long been regarded as a crucial step in effective information retrieval. In this paper, we focus on subtopic mining, which aims to generate a ranked list of subtopic strings for a given topic. We propose a modifier-graph-based approach, under which the problem of subtopic mining reduces to that of graph clustering over the modifier graph. Experimental results show that, compared with existing methods, our modifier-graph-based approaches are robust to the sparseness problem. In particular, our approaches, which perform subtopic mining at the fine-grained term level, outperform baseline methods that perform subtopic mining at the whole-query level in terms of I-rec, D-nDCG, and D#-nDCG.
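To make the reduction to graph clustering concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes that a modifier is any query term other than the topic terms, that two modifiers are linked when they co-occur in the same query, and that connected components can stand in for a real graph clustering algorithm. The function names build_modifier_graph and cluster_modifiers, the ranking by cluster size, and the toy queries are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_modifier_graph(topic, queries):
    """Build a weighted co-occurrence graph over modifier terms.

    Assumption: a modifier is any term of a query that contains all
    topic terms, excluding the topic terms themselves.
    """
    topic_terms = set(topic.lower().split())
    graph = defaultdict(lambda: defaultdict(int))
    for query in queries:
        terms = set(query.lower().split())
        if not topic_terms.issubset(terms):
            continue  # query is unrelated to the topic
        modifiers = sorted(terms - topic_terms)
        for m in modifiers:
            graph[m]  # register the node even if it has no edges
        for a, b in combinations(modifiers, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def cluster_modifiers(graph):
    """Group modifiers by connected component (a stand-in for a
    proper graph clustering algorithm) and rank clusters by size."""
    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node])
        clusters.append(sorted(component))
    # Assumed ranking: larger clusters first, ties broken alphabetically.
    clusters.sort(key=lambda c: (-len(c), c))
    return clusters


# Toy query log (hypothetical data, for illustration only).
queries = [
    "jaguar car price",
    "jaguar car dealer",
    "jaguar animal habitat",
    "jaguar animal diet",
    "puma shoes",
]
subtopic_clusters = cluster_modifiers(build_modifier_graph("jaguar", queries))
```

In this sketch each cluster of modifiers is read as one candidate subtopic of the topic. Because the graph is built over individual terms rather than whole queries, a modifier seen in only a few queries can still join a cluster through the terms it co-occurs with, which illustrates the term-level versus whole-query-level distinction drawn above.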
