Personalized Concept-Based Clustering of Search Engine Queries

The exponential growth of information on the Web has introduced new challenges for building effective search engines. A major problem in Web search is that queries are usually short and ambiguous, and thus insufficient for specifying users' precise needs. To alleviate this problem, some search engines suggest terms that are semantically related to the submitted query, so that users can choose the suggestions that reflect their information needs. In this paper, we introduce an effective approach that captures the user's conceptual preferences in order to provide personalized query suggestions. We achieve this goal with two new strategies. First, we develop online techniques that extract concepts from the Web snippets of the search results returned for a query and use those concepts to identify related queries. Second, we propose a new two-phase personalized agglomerative clustering algorithm that generates personalized query clusters. To the best of our knowledge, no previous work has addressed personalization for query suggestions. To evaluate the effectiveness of our technique, we developed a Google middleware that collects clickthrough data for experimental evaluation. Experimental results show that our approach achieves better precision and recall than existing query clustering methods.
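The two strategies above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it omits personalization and the two-phase structure entirely, and only shows the general idea of treating frequent snippet terms as concepts and then merging queries agglomeratively by concept overlap. The function names, the support threshold, the cosine similarity measure, and the merge threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def extract_concepts(snippets, min_support=0.5):
    """Treat a term as a concept if it appears in at least `min_support`
    of the snippets (an assumed, simplified extraction rule; no stop-word
    removal or phrase detection is attempted)."""
    counts = Counter()
    for snippet in snippets:
        # Count each term once per snippet, i.e. snippet frequency.
        counts.update(set(re.findall(r"[a-z]+", snippet.lower())))
    total = len(snippets)
    return {t: c / total for t, c in counts.items() if c / total >= min_support}


def cosine(a, b):
    """Cosine similarity between two sparse concept-weight dictionaries."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_queries(query_concepts, threshold=0.3):
    """Plain agglomerative clustering: repeatedly merge the most similar
    pair of clusters until no pair reaches `threshold` (hypothetical value)."""
    clusters = [([q], dict(c)) for q, c in query_concepts.items()]
    while len(clusters) > 1:
        pairs = combinations(range(len(clusters)), 2)
        (i, j), best = max(
            (((i, j), cosine(clusters[i][1], clusters[j][1])) for i, j in pairs),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # Merge the two clusters by summing their concept weights.
        weights = Counter(clusters[i][1])
        weights.update(clusters[j][1])
        merged = (clusters[i][0] + clusters[j][0], dict(weights))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(queries) for queries, _ in clusters]
```

In this sketch, queries whose snippets share concepts end up in the same cluster, and the other members of a cluster serve as candidate suggestions. A personalized variant would additionally need per-user preference information such as clickthrough data, which this sketch does not model.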
