Query side evaluation: an empirical analysis of effectiveness and effort

Information Retrieval evaluation typically measures the system's ability to retrieve relevant information, rather than the query's. However, the effectiveness of a retrieval system is strongly influenced by the quality of the query submitted. In this paper, the effectiveness and effort of querying are empirically examined in the context of the Principle of Least Effort, Zipf's Law and the Law of Diminishing Returns. This query-focused investigation leads to a number of novel findings that should prove useful in the development of future retrieval methods and evaluation techniques, while also motivating further research into query-side evaluation.
