Which vertical search engines are relevant?

Aggregating search results from a variety of heterogeneous sources, so-called verticals, such as news, image and video, into a single interface is a popular paradigm in web search. Current approaches to evaluating the effectiveness of aggregated search systems reward systems that return highly relevant verticals for a given query, where this relevance is assessed under different assumptions. It is difficult to evaluate or compare those systems without fully understanding the relationship between those underlying assumptions. To address this, we present a formal analysis and a set of extensive user studies to investigate the effects of various assumptions made for assessing query-vertical relevance. A total of more than 20,000 assessments on 44 search tasks across 11 verticals are collected through Amazon Mechanical Turk and subsequently analysed. Our results provide insights into various aspects of query-vertical relevance and allow us to explain in more depth, as well as to question, the evaluation results published in the literature.
