Stratified analysis of AOL query log

Characterizing users' intent and behaviour while they use an information retrieval tool (e.g. a search engine) is a key question in web research, as it holds the key to knowing how users interact, what they expect, and how we can provide them with information in the most beneficial way. Previous research has focused on identifying the average characteristics of user interactions. This paper proposes a stratified method for analyzing query logs that groups queries and sessions according to their hit frequency and analyzes the characteristics of each group in order to determine how representative the average values are. Findings show that behaviours typically associated with the average user do not fit most of these groups.
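The stratified idea can be illustrated with a minimal sketch. It assumes that "hit frequency" means the number of times a query string occurs in the log, and it uses query length in terms as the example characteristic. The function names, the stratum boundaries, and the toy log are hypothetical and are not taken from the paper.

```python
from collections import Counter
from statistics import mean


def stratify_by_frequency(queries, boundaries=(1, 10)):
    """Group distinct queries into strata by how often each occurs in the log.

    `boundaries` are hypothetical inclusive upper bounds. A query lands in the
    first stratum whose bound is >= its frequency, or in a final open-ended
    stratum if its frequency exceeds every bound.
    """
    counts = Counter(queries)
    strata = {i: [] for i in range(len(boundaries) + 1)}
    for query, freq in counts.items():
        index = next(
            (i for i, bound in enumerate(boundaries) if freq <= bound),
            len(boundaries),
        )
        strata[index].append(query)
    return strata


def mean_terms(queries):
    """Average number of whitespace-separated terms per query."""
    return mean(len(q.split()) for q in queries)


# Toy log (invented for illustration): one very frequent query,
# one moderately frequent query, and two queries seen only once.
log = (
    ["google"] * 12
    + ["weather"] * 3
    + ["cheap flights to madrid", "how to fix a bike tire"]
)

strata = stratify_by_frequency(log)
overall = mean_terms(log)  # average over every submitted query
per_stratum = {i: mean_terms(qs) for i, qs in strata.items() if qs}
```

In this toy log the overall average is pulled towards the short, frequent queries, while the stratum of queries seen only once has a much higher average length. That is the kind of gap between the average and the per-group values that a stratified analysis is meant to expose.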
