Do Topic Shift and Query Reformulation Patterns Correlate in Academic Search?

While it is known that academic searchers differ from typical web searchers, little is known about the search behavior of academic searchers over longer periods of time. In this study we examine academic searchers through a large-scale log analysis of a major academic search engine. We focus on two aspects: query reformulation patterns and topic shifts in queries. We first analyze how each of these aspects evolves over time. We identify important query reformulation patterns: revisiting and issuing new queries tend to happen more often over time. We also find that there are two distinct types of users: one type becomes increasingly focused on the topics they search for as time goes by, while the other searches an increasingly diverse range of topics. After analyzing these two aspects separately, we investigate whether, and to what degree, there is a correlation between topic shifts and query reformulations. Surprisingly, users’ preferences for particular query reformulations correlate little with their topic shift tendency. However, certain reformulations may help predict the magnitude of the topic shift that happens in the immediately following timespan. Our results shed light on academic searchers’ information seeking behavior and may benefit search personalization.
