Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs

Most analysis of web search relevance and performance takes a single query as the unit of search engine interaction. When studies attempt to group queries together by task or session, a timeout is typically used to identify the boundary. However, users query search engines to accomplish tasks at a variety of granularities, often issuing multiple queries per task. In this work we study real sessions manually labeled into hierarchical tasks, and show that timeouts, whatever their length, are of limited utility in identifying task boundaries, achieving a maximum precision of only 70%. We report on properties of this search task hierarchy, as seen in a random sample of user interactions from a major web search engine's log, annotated by human editors, finding that 17% of tasks are interleaved and 20% are hierarchically organized. No previous work has analyzed or addressed automatic identification of interleaved and hierarchically organized search tasks. We propose and evaluate a method for the automated segmentation of users' query streams into hierarchical units. Our classifiers improve on timeout segmentation, as well as other previously published approaches, bringing the accuracy up to 92% for identifying fine-grained task boundaries, and 89-97% for identifying pairs of queries from the same task when tasks are interleaved hierarchically. This is the first work to identify, measure, and automatically segment sequences of user queries into their hierarchical structure. The ability to perform this kind of segmentation paves the way for evaluating search engines in terms of user task completion.
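To make the timeout baseline concrete, the following is a minimal sketch of timeout-based segmentation, not the classifiers proposed in this work. The function name `segment_by_timeout`, the 30-minute threshold, and the sample log are all illustrative assumptions rather than details taken from the paper.

```python
from datetime import datetime, timedelta


def segment_by_timeout(queries, timeout=timedelta(minutes=30)):
    """Split time-ordered (timestamp, query) pairs into segments.

    A new segment starts whenever the gap between a query and the
    previous one exceeds `timeout` (an assumed, illustrative threshold).
    """
    segments = []
    for ts, text in queries:
        # Compare against the timestamp of the last query in the last segment.
        if segments and ts - segments[-1][-1][0] <= timeout:
            segments[-1].append((ts, text))
        else:
            segments.append([(ts, text)])
    return segments


# Hypothetical query stream: a travel task interrupted by an unrelated query.
log = [
    (datetime(2008, 1, 1, 9, 0), "cheap flights paris"),
    (datetime(2008, 1, 1, 9, 2), "paris hotels"),
    (datetime(2008, 1, 1, 9, 3), "python list sort"),
    (datetime(2008, 1, 1, 11, 0), "paris metro map"),
]
segments = segment_by_timeout(log)
for i, segment in enumerate(segments):
    print(i, [text for _, text in segment])
```

On this made-up log, the timeout groups the unrelated programming query with the travel queries and places the later travel query in a separate segment, which illustrates why a time gap alone cannot recover interleaved tasks.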
