Reliability and Validity of Query Intent Assessments

In most intent recognition studies, annotations of query intent are created post hoc by external assessors who are not the searchers themselves. It is important for the field to better understand how well this process approximates the searcher's actual intent. Some studies have investigated the reliability of the query intent annotation process by measuring interassessor agreement. However, these studies did not measure the validity of the judgments, that is, the extent to which the annotations match the searcher's actual intent. In this study, we asked both the searchers themselves and external assessors to classify queries using the same intent classification scheme. We show that of the seven dimensions in our intent classification scheme, four can reliably be used for query annotation. Of these four, only the annotations on the topic and spatial sensitivity dimensions are valid when compared with the searcher's annotations. The difference between the interassessor agreement and the assessor-searcher agreement was significant on all dimensions, showing that the agreement between external assessors is not a good estimator of the validity of the intent classifications. Therefore, we encourage the research community to consider using query intent classifications by the searchers themselves as test data. © 2013 ASIS&T.
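
The distinction between reliability (agreement among assessors) and validity (agreement between an assessor and the searcher) can be illustrated with a small sketch. The abstract does not say which agreement statistic was used, so this example assumes Cohen's kappa for two annotators on one categorical dimension. The function name `cohen_kappa`, the "yes"/"no" label values, and the toy label lists are all hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: kappa is used here only for illustration; the abstract
    does not name the agreement measure the authors applied.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two annotators' marginal label
    # frequencies, summed over all labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined, so this sketch treats it as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels for one dimension (e.g. spatial sensitivity).
searcher = ["yes", "no", "no", "yes", "no", "no"]
assessor_1 = ["yes", "no", "yes", "yes", "no", "yes"]
assessor_2 = ["yes", "no", "yes", "yes", "no", "yes"]

# Reliability: how well two external assessors agree with each other.
interassessor = cohen_kappa(assessor_1, assessor_2)

# Validity: how well an external assessor agrees with the searcher.
assessor_searcher = cohen_kappa(assessor_1, searcher)

print(f"interassessor kappa:     {interassessor:.2f}")
print(f"assessor-searcher kappa: {assessor_searcher:.2f}")
```

In this toy data the two assessors agree perfectly with each other while agreeing only moderately with the searcher, which is the kind of gap the abstract describes: high interassessor agreement does not by itself indicate that the annotations match the searcher's intent.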
