Survey and evaluation of query intent detection methods

User interactions with search engines reveal three main underlying intents: navigational, informational, and transactional. Search engine performance can be greatly improved by tailoring results to these query intents. Query classification has therefore been an active research topic in recent years. However, while query topic classification has been the subject of a dedicated bakeoff, no evaluation campaign has been devoted to automatic query intent detection. This paper reviews some of the available query intent detection techniques, proposes an evaluation framework, and uses it to compare those methods in order to shed light on their relative performance and drawbacks. As will be shown, manually prepared gold-standard files are much needed, and traditional pooling is not the most feasible evaluation method. Future lines of work in both query intent detection and its evaluation are also proposed.
