A large-scale evaluation and analysis of personalized search strategies

Although personalized search has been studied for many years and many personalization strategies have been investigated, it remains unclear whether personalization is consistently effective across different queries, users, and search contexts. In this paper, we study this problem and draw some preliminary conclusions. We present a large-scale evaluation framework for personalized search based on query logs, and then evaluate five personalized search strategies (two click-based and three profile-based) using 12 days of MSN query logs. Our analysis reveals that personalized search significantly improves on common web search for some queries but has little effect on others (e.g., queries with small click entropy), and it even harms search accuracy in some situations. Furthermore, we show that straightforward click-based personalization strategies perform consistently and considerably well, while profile-based ones are unstable in our experiments. We also reveal that both long-term and short-term contexts are very important in improving search performance for profile-based personalized search strategies.
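
The abstract refers to click entropy without defining it. As a rough illustration, the sketch below assumes the usual Shannon-entropy formulation over the pages clicked for a query, i.e. the sum over clicked pages p of -P(p|q) log2 P(p|q), where P(p|q) is the share of clicks on p among all clicks for query q. The function name `click_entropy` and the flat list-of-URLs input are hypothetical choices made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the click distribution for one query.

    clicked_urls: iterable of URLs, one entry per logged click on the query.
    This input format is an assumption made for illustration only.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        # No clicks were logged, so there is no distribution to measure.
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total                  # estimated P(p | q)
        entropy += p * math.log2(1.0 / p)  # same as -p * log2(p)
    return entropy


if __name__ == "__main__":
    # Every click lands on the same page: entropy is zero.
    print(click_entropy(["a.com", "a.com", "a.com"]))
    # Clicks split evenly over two pages: entropy is one bit.
    print(click_entropy(["a.com", "b.com"]))
```

Under this reading, a small value means that users issuing the query mostly click the same page, which fits the abstract's observation that personalization has little effect on such queries.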
