Lessons from the journey: a query log analysis of within-session learning

The Internet is the largest source of information in the world. Search engines help people navigate the huge space of available data in order to acquire new skills and knowledge. In this paper, we present an in-depth analysis of sessions in which people explicitly search for new knowledge on the Web based on the log files of a popular search engine. We investigate within-session and cross-session developments of expertise, focusing on how the language and search behavior of a user on a topic evolves over time. In this way, we identify those sessions and page visits that appear to significantly boost the learning process. Our experiments demonstrate a strong connection between clicks and several metrics related to expertise. Based on models of the user and their specific context, we present a method capable of automatically predicting, with good accuracy, which clicks will lead to enhanced learning. Our findings provide insight into how search engines might better help users learn as they search.
