KDEIM at NTCIR-12 IMine-2 Search Intent Mining Task: Query Understanding through Diversified Ranking of Subtopics

In this paper, we describe our participation in the Query Understanding subtask of the NTCIR-12 IMine-2 Task. We propose a method that extracts subtopics by leveraging query suggestions from search engines. The importance of each subtopic to the query is estimated by exploiting multiple query-dependent and query-independent features, chosen through supervised feature selection. To diversify the subtopics, we employ a diversification technique based on the maximum marginal relevance (MMR) framework, which balances relevance and novelty. Our best run achieves an I-rec of 0.7557, a D-nDCG of 0.6644, a D#-nDCG of 0.7100, and a QU-score of 0.5057 at cutoff rank 10 on the Query Understanding subtask.
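As an illustration of the MMR-style diversification mentioned above, the following is a minimal sketch and not the system submitted to the task. It assumes that each subtopic candidate already has a relevance score (here a hand-picked number standing in for the output of the feature-based estimation) and that novelty is measured by token-level Jaccard similarity. The names `mmr_rank`, `jaccard`, the trade-off parameter `lam`, and the example subtopics are all hypothetical.

```python
from typing import Callable, Dict, List


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed stand-in for subtopic similarity)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def mmr_rank(
    candidates: List[str],
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float] = jaccard,
    lam: float = 0.5,
    k: int = 10,
) -> List[str]:
    """Greedy MMR: repeatedly pick the candidate that maximizes
    lam * relevance - (1 - lam) * (max similarity to already selected items)."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(c: str) -> float:
            # Redundancy is 0 when nothing has been selected yet.
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical subtopic candidates for the ambiguous query "jaguar".
subtopics = ["jaguar car price", "jaguar car price uk", "jaguar animal habitat"]
scores = {
    "jaguar car price": 0.90,
    "jaguar car price uk": 0.85,
    "jaguar animal habitat": 0.60,
}

diversified = mmr_rank(subtopics, scores, lam=0.5)
by_relevance_only = mmr_rank(subtopics, scores, lam=1.0)
```

With `lam=0.5`, the near-duplicate "jaguar car price uk" is pushed below "jaguar animal habitat" despite its higher relevance score, whereas `lam=1.0` reduces to a plain relevance ranking. The value of `lam` used here is arbitrary; in practice it would be tuned on held-out queries.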
