Evaluating the Effectiveness of Personalized Web Search

Although personalized search has been studied for many years and many personalization algorithms have been proposed, it is still unclear whether personalization is consistently effective across different queries, users, and search contexts. In this paper, we study this problem. We present a large-scale evaluation framework for personalized search based on query logs, and we use it to evaluate five personalized search algorithms (two click-based and three based on topical interest) on 12 days of query logs from Windows Live Search. Our analysis shows that personalized Web search does not work equally well in all situations. It significantly improves on generic Web search for some queries, but it has little effect on others and can even harm performance in some situations. We propose click entropy as a simple measure of whether a query should be personalized. We further propose several features for automatically predicting when a query will benefit from a specific personalization algorithm. Experimental results show that applying a personalization algorithm only to the queries selected by our prediction model performs better than applying it to all queries.
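The abstract names click entropy but does not define it. As a rough illustration, the sketch below assumes it is the Shannon entropy (in bits) of the distribution of clicked pages for a single query: a query whose clicks all land on one page scores 0, and a query whose clicks are spread over many pages scores higher. The function name `click_entropy` and the input format (a flat list of clicked URLs for one query) are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the clicked-URL distribution for one query.

    `clicked_urls` is a hypothetical input format: one entry per click
    recorded in the log for this query, across all users.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        # No clicks recorded: treat the entropy as zero.
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Every click goes to the same page: users agree on the target.
navigational = click_entropy(["a.com"] * 4)
# Clicks are split evenly over four pages: users want different things.
ambiguous = click_entropy(["a.com", "b.com", "c.com", "d.com"])
```

Under this assumed definition, a low value suggests that most users want the same page, so personalization has little room to help, while a high value suggests varied intents. Any cutoff separating the two cases would have to be tuned on real log data.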
