Empirical Study on Rare Query Characteristics

User behavior analysis plays an important role in Web information retrieval. Rare queries, i.e., those with very low frequencies, are usually ignored in existing studies because of data sparseness. Little is known about either the information needs or the user behavior behind the mass of rare queries. In this paper, we present an empirical study of user behavior on rare queries using a large-scale search log. We analyze features concerning the query, the resource, and post-query actions, and on this basis propose a practical categorization framework and obtain an overview of the composition of rare queries. Further, we study the characteristics of the most commonly occurring types of rare queries and suggest improving search performance for each type separately. This work offers insight into the long tail of queries and should help improve Web search for rare queries.
