RESQ: rank-energy selective query forwarding for distributed search systems

Selective query forwarding is a promising technique for scaling high-quality, cost-efficient query evaluation in distributed search systems. The basic idea is simple: after a local site receives a query, it determines which non-local sites to forward the query to and returns an aggregation of the local and non-local results. We introduce RESQ, a hybrid rank-energy selective query forwarding model. The novel contribution of RESQ is that it considers ranking quality and energy costs simultaneously when making forwarding decisions. Using a large-scale query log and publicly available energy price time series, we demonstrate that RESQ forwarding achieves favorable tradeoffs between the likelihood of returning highly ranked query results and energy cost savings under temporally and spatially varying energy prices.
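
To make the forwarding idea concrete, the following is a minimal illustrative sketch and not the RESQ model itself, whose scoring function is not given here. It assumes each non-local site carries a hypothetical estimate of its ranking contribution (`est_quality_gain`) and a current energy price (`energy_price`), and it combines the two with an assumed linear weight `alpha`. The names `Site`, `select_sites`, and `merge_results` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    # Hypothetical: estimated chance in [0, 1] that this site holds top-ranked results.
    est_quality_gain: float
    # Hypothetical: current electricity price at this site (any consistent unit).
    energy_price: float


def select_sites(sites, alpha=0.5, threshold=0.0):
    """Pick non-local sites whose combined rank-energy score exceeds a threshold.

    Assumed score: alpha * quality - (1 - alpha) * (price / max price).
    alpha = 1 ignores energy; alpha = 0 ignores ranking quality.
    """
    if not sites:
        return []
    max_price = max(s.energy_price for s in sites)
    selected = []
    for s in sites:
        norm_price = s.energy_price / max_price if max_price > 0 else 0.0
        score = alpha * s.est_quality_gain - (1 - alpha) * norm_price
        if score > threshold:
            selected.append(s.name)
    return selected


def merge_results(local, remote_lists, k=10):
    """Aggregate local and forwarded (doc_id, score) results and keep the top k."""
    merged = list(local)
    for results in remote_lists:
        merged.extend(results)
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged[:k]


sites = [
    Site("A", est_quality_gain=0.9, energy_price=30.0),
    Site("B", est_quality_gain=0.2, energy_price=60.0),
    Site("C", est_quality_gain=0.6, energy_price=20.0),
]
forward_to = select_sites(sites, alpha=0.5)
top = merge_results([("d1", 0.8), ("d2", 0.3)], [[("d3", 0.9)], [("d4", 0.5)]], k=3)
```

In this toy setting, site B is skipped because its low estimated ranking contribution does not justify its high energy price. Raising `alpha` toward 1 favors ranking quality, and lowering it favors energy savings.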
