Identification and Analysis of Multi-tasking Product Information Search Sessions with Query Logs

Abstract

Purpose: This research aims to identify product search tasks in online shopping and to analyze the characteristics of consumers' multi-tasking search sessions.

Design/methodology/approach: The experimental dataset contains 8,949 queries issued by 582 users in 3,483 search sessions. Search tasks are identified by sequentially comparing the Jaccard similarity coefficient of adjacent search queries and by hierarchical clustering of the queries.

Findings: (1) Users issued a similar number of queries (1.43 to 1.47) of similar length (7.3 to 7.6 characters) per task in mono-tasking and multi-tasking sessions. (2) Users spent more time on average in sessions with more tasks, but spent less time on each task as the number of tasks in a session increased.

Research limitations: Because the task identification method relies only on query terms, it does not fully reflect the complex nature of consumer shopping behavior.

Practical implications: These results provide an exploratory understanding of the relationships among multiple shopping tasks, and can be useful for product recommendation and shopping task prediction.

Originality/value: The originality of this research lies in applying query clustering to the identification and analysis of online shopping tasks, and in analyzing the characteristics of product search sessions.
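The two-step task identification named in the abstract (Jaccard comparison of adjacent queries, then hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenization, the 0.2 threshold, the single-linkage merge rule, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient |A & B| / |A | B| over query terms.

    Assumption: terms are lowercase whitespace-separated tokens.
    """
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def segment_adjacent(queries, threshold=0.2):
    """Step 1: compare each query with the one before it and start a
    new segment whenever the similarity falls below the threshold."""
    segments = []
    for q in queries:
        if segments and jaccard(segments[-1][-1], q) >= threshold:
            segments[-1].append(q)
        else:
            segments.append([q])
    return segments


def cluster_similarity(c1, c2):
    """Single linkage (an assumption): the best pairwise similarity."""
    return max(jaccard(a, b) for a in c1 for b in c2)


def merge_segments(segments, threshold=0.2):
    """Step 2: agglomerative clustering. Repeatedly merge the most
    similar pair of clusters until no pair reaches the threshold."""
    clusters = [list(s) for s in segments]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), cluster_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[i].extend(clusters.pop(j))  # i < j, so index i stays valid
    return clusters


def identify_tasks(queries, threshold=0.2):
    """Return one list of queries per identified task in a session."""
    return merge_segments(segment_adjacent(queries, threshold), threshold)


# Hypothetical session in which two shopping tasks are interleaved.
session = [
    "red running shoes",
    "running shoes nike",
    "usb cable",
    "nike shoes sale",
    "usb c cable",
]
tasks = identify_tasks(session)
```

In this hypothetical session the adjacent comparison alone yields four segments, because the shoe and cable queries alternate; the clustering step then rejoins the interleaved segments into two tasks. A session with more than one resulting task would count as multi-tasking in the abstract's sense.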
