A characterization of online browsing behavior

In this paper, we undertake a large-scale study of online user behavior based on search and toolbar logs. We propose a new CCS taxonomy of pageviews consisting of Content (news, portals, games, verticals, multimedia), Communication (email, social networking, forums, blogs, chat), and Search (Web search, item search, multimedia search). We show that roughly half of all pageviews online are content, one-third are communication, and the remaining one-sixth are search. We then give further breakdowns to characterize the pageviews within each high-level category. Next, we study the extent to which pages of certain types are revisited by the same user over time, and the mechanisms by which users move from page to page, within and across hosts, and within and across page types. We consider robust schemes for assigning responsibility for a pageview to ancestors along the chain of referrals. We show that mail, news, and social networking pageviews are insular in nature, appearing primarily in homogeneous sessions of one type. Search pageviews, on the other hand, appear on the path to a disproportionate number of pageviews, but cannot be viewed as the principal mechanism by which those pageviews were reached. Finally, we study the burstiness of pageviews associated with a URL, and show that, by and large, online browsing behavior is not significantly affected by "breaking" material with non-uniform visit frequency.
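
To make the idea of assigning responsibility along a referral chain concrete, the following is a minimal sketch and not the scheme used in the paper. It assumes a hypothetical geometric-decay rule in which the direct referrer receives the most credit and each earlier ancestor receives a fixed fraction of the next one's credit; the function name `attribute_pageview`, the `decay` parameter, and the example chain are all illustrative assumptions.

```python
from collections import defaultdict


def attribute_pageview(chain, decay=0.5):
    """Split credit for one pageview among its referral ancestors.

    chain: page types ordered from the oldest ancestor to the direct referrer.
    decay: hypothetical factor applied per step away from the pageview.
    Returns a dict mapping page type to its share of credit (shares sum to 1).
    """
    if not chain:
        return {}
    # Weight 1 for the direct referrer, decay for its parent, decay**2 next, ...
    weights = [decay ** steps_back for steps_back in range(len(chain))]
    total = sum(weights)
    credit = defaultdict(float)
    # Walk from the direct referrer back to the oldest ancestor.
    for steps_back, page_type in enumerate(reversed(chain)):
        credit[page_type] += weights[steps_back] / total
    return dict(credit)


# Illustrative chain: a search page led to two content pages before the pageview.
shares = attribute_pageview(["search", "content", "content"])
```

Under this assumed rule, a search page that sits early on the path receives some credit for the final pageview but less than the pages that referred to it directly, which is one way a page type can appear on many paths without being the principal mechanism by which pages are reached.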
