Understanding Ephemeral State of Relevance

Despite its dynamic nature, relevance is often measured in a context-independent manner in information retrieval practice. We investigate this discrepancy. We propose a contextual relevance/usefulness measure called ephemeral state of relevance (ESR), defined as the amount of useful information a user acquired from a clicked result, as assessed immediately after examining the result during an interactive search session. We collect ESR and context-independent usefulness judgments through a laboratory user study and compare the two. We examine factors related to both judgments and analyze their differences. Our study demonstrates several advantages of ESR: it captures users' real-time state of mind and perceptions; it measures how much useful information the user is able to acquire from a result rather than how much the result contains; and it better reflects users' needs and criteria for useful results during a session, highlighting novelty as a salient factor. However, we also find that users may not be able to correctly assess the credibility of information during a session, which may reduce the reliability of the collected ESR judgments. We evaluate ESR, context-independent usefulness judgments, and TREC-style topical relevance judgments by correlating each with user experience in a session. The results demonstrate that switching the judgment criterion from topical relevance to usefulness is fruitful, but moving from context-independent judgments to contextual ones offers only limited advantages relative to its cost and complexity. Our study enriches current understanding of the dynamics of relevance in a search session and identifies both opportunities and challenges for collecting contextual relevance judgments.
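The evaluation approach described above, correlating per-session judgment scores with a session-level user-experience rating, can be sketched as follows. This is a minimal illustration, not the authors' actual analysis: the data values are hypothetical, and Spearman rank correlation is assumed as the correlation measure since judgment and satisfaction scales are ordinal.

```python
def rankdata(values):
    """Assign 1-based ranks, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = rankdata(x), rankdata(y)
    mx = sum(rx) / len(rx)
    my = sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-session data: mean judgment score for clicked
# results under some criterion, and the user's session satisfaction.
mean_judgment = [2.1, 3.4, 1.8, 2.9, 3.1]
satisfaction = [3, 5, 2, 4, 4]

print(f"correlation with satisfaction: {spearman(mean_judgment, satisfaction):.3f}")
```

Comparing such correlations across judgment types (topical relevance vs. usefulness vs. ESR) is one way to operationalize the claim that a judgment criterion "better reflects" user experience.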
