A large-scale analysis of query logs for assessing personalization opportunities

Query logs, the patterns of activity left by millions of users, contain a wealth of information that can be mined to aid personalization. We perform a large-scale study of Yahoo! search engine logs, tracking 1.35 million browser cookies over a period of 6 months. We define metrics to address three questions: 1) How much history is available? 2) How do users' topical interests vary, as reflected by their queries? 3) What can we learn from user clicks? We find that the user behind a randomly picked query has significantly more expected history than a randomly picked user. We show that each user's topical interests are consistent over time, while interests differ from one user to another. We also see that user clicks indicate a variety of special interests. Our findings shed light on user activity and can inform future personalization efforts.
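
The gap between "a randomly picked user" and "the user of a randomly picked query" can be illustrated with a small sketch. This is not the paper's metric or data: the function name `history_stats`, the toy log, and the use of raw query counts as a stand-in for "history" are all assumptions made for illustration. The point is only that sampling by query weights heavy users more, so the expected history is at least as large as the per-user average.

```python
from collections import Counter

def history_stats(query_log):
    """Compare two notions of 'expected history' on a toy log.

    query_log: iterable of user ids, one entry per query issued
    (a hypothetical format, not the one used in the study).
    Returns (mean queries per user, mean queries of the user
    behind a uniformly sampled query).
    """
    counts = Counter(query_log)          # queries issued by each user
    n_users = len(counts)
    n_queries = sum(counts.values())

    # Pick a user uniformly at random: plain average of the counts.
    per_user = n_queries / n_users

    # Pick a query uniformly at random: a user with c queries is
    # chosen with probability c / n_queries, so counts are size-biased.
    per_query = sum(c * c for c in counts.values()) / n_queries
    return per_user, per_query

# Toy data: one heavy user and two light users (assumed, not from the paper).
toy_log = ["a"] * 8 + ["b"] + ["c"]
per_user, per_query = history_stats(toy_log)
print(f"random user:  {per_user:.2f} queries of history")
print(f"random query: {per_query:.2f} queries of history")
```

In this toy log the heavy user issues most of the queries, so a personalization system sees far more history per query served than the per-user average alone would suggest.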
