Identifying Web search session patterns using cluster analysis: A comparison of three search environments

Session characteristics taken from large transaction logs of three Web search environments (an academic Web site, a public search engine, and a consumer health information portal) were modeled using cluster analysis to determine whether coherent session groups emerged for each environment and whether the types of session groups were similar across the three environments. The analysis revealed three distinct clusters of session behaviors common to each environment: “hit and run” sessions on focused topics, relatively brief sessions on popular topics, and sustained sessions using obscure terms with greater query modification. The findings also revealed shifts in session characteristics over time for one of the datasets, away from “hit and run” sessions and toward more popular search topics. A better understanding of session characteristics can help system designers develop more responsive systems, with search features tailored to identifiable groups of searchers based on their search behaviors. For example, a system could detect struggling searchers whose session behaviors match those identified in the current study and offer them context-sensitive help.
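
As a rough illustration of clustering sessions by their characteristics, the sketch below standardizes a few per-session features and groups them with k-means. Everything in it is an assumption made for illustration: the abstract does not name the clustering algorithm or the exact features used, so k-means, the three feature columns (queries per session, mean terms per query, mean term popularity), the toy values, and the helper names `zscore_columns` and `kmeans` are all hypothetical.

```python
import math

def zscore_columns(rows):
    """Standardize each feature column to mean 0 and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Plain k-means with deterministic farthest-point initialization."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each session goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical toy sessions, NOT data from the study:
# (queries per session, mean terms per query, mean term popularity in [0, 1])
sessions = [
    (1, 3, 0.20), (1, 4, 0.25), (1, 3, 0.30),     # short, focused
    (2, 2, 0.90), (2, 1, 0.85), (3, 2, 0.95),     # brief, popular terms
    (8, 3, 0.05), (9, 4, 0.10), (10, 3, 0.05),    # sustained, obscure terms
]

labels, centroids = kmeans(zscore_columns(sessions), k=3)
for session, label in zip(sessions, labels):
    print(label, session)
```

Standardizing first keeps the query count from dominating the distance simply because it has the largest numeric range. On real logs, the number of clusters would itself need to be justified rather than fixed at three in advance.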
