Learning from homologous queries and semantically related terms for query auto completion

We propose a learning-to-rank-based query auto completion model (L2R-QAC) that exploits contributions from so-called homologous queries for a QAC candidate, in which two kinds of homologous queries are taken into account. We propose semantic features for QAC, using the semantic relatedness of terms inside a query candidate and of pairs of terms from a candidate and from queries previously submitted in the same session. We analyze the effectiveness of our L2R-QAC model with newly added features, and find that it significantly outperforms state-of-the-art QAC models, either based on learning to rank or on popularity.

Query auto completion (QAC) models recommend possible queries to web search users when they start typing a query prefix. Most of today's QAC models rank candidate queries by popularity (i.e., frequency), and in doing so they tend to follow a strict query matching policy when counting the queries. That is, they ignore the contributions from so-called homologous queries: queries with the same terms but ordered differently, or queries that expand the original query. Importantly, homologous queries often express a remarkably similar search intent. Moreover, today's QAC approaches often ignore semantically related terms. We argue that users are prone to combine semantically related terms when generating queries.

We propose a learning-to-rank-based QAC approach, where, for the first time, features derived from homologous queries and semantically related terms are introduced. In particular, we consider: (i) the observed and predicted popularity of homologous queries for a query candidate; and (ii) the semantic relatedness of pairs of terms inside a query and pairs of queries inside a session. We quantify the improvement of the proposed new features using two large-scale real-world query logs and show that the mean reciprocal rank and the success rate can be improved by up to 9% over state-of-the-art QAC models.
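To make the notion of homologous queries concrete, the following is a minimal sketch of how their observed popularity might be collected for one candidate. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy query log, and the choice to treat "expanding" a query as appending terms after the candidate's own terms are all hypothetical.

```python
from collections import Counter


def is_reordering(candidate: str, other: str) -> bool:
    """True if `other` has the same terms as `candidate` in a different order."""
    c, o = candidate.split(), other.split()
    return c != o and sorted(c) == sorted(o)


def is_extension(candidate: str, other: str) -> bool:
    """True if `other` starts with all of the candidate's terms and adds more.

    Assumption: this is only one possible reading of "queries that expand
    the original query".
    """
    c, o = candidate.split(), other.split()
    return len(o) > len(c) and o[:len(c)] == c


def homologous_counts(candidate: str, log: Counter) -> dict:
    """Collect exact-match and homologous frequencies for one candidate."""
    counts = {"exact": log.get(candidate, 0), "reordered": 0, "extended": 0}
    for query, freq in log.items():
        if is_reordering(candidate, query):
            counts["reordered"] += freq
        elif is_extension(candidate, query):
            counts["extended"] += freq
    return counts


# Hypothetical query log: query string -> observed frequency.
toy_log = Counter({
    "new york hotel": 50,
    "hotel new york": 12,
    "new york hotel deals": 7,
    "new york weather": 30,
})

features = homologous_counts("new york hotel", toy_log)
```

Under a strict matching policy only the `exact` count would be used for ranking. In this sketch the `reordered` and `extended` counts are kept as separate values, so that a learning-to-rank model could weigh them as additional features rather than folding them into a single popularity score.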

Fei Cai | Maarten de Rijke
