Query Scheduling Techniques and Power/Latency Trade-off Model for Large-Scale Search Engines

[1]  J. Friedman.  Greedy function approximation: A gradient boosting machine , 2001 .

[2]  Kathryn S. McKinley,et al.  Partial collection replication versus caching for information retrieval systems , 2000, SIGIR '00.

[3]  Mor Harchol-Balter,et al.  How data center size impacts the effectiveness of dynamic power management , 2011, 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[4]  Andrei Z. Broder,et al.  Efficient query evaluation using a two-level retrieval process , 2003, CIKM '03.

[5]  Yixin Diao,et al.  Feedback Control of Computing Systems , 2004 .

[6]  William Webber,et al.  Design and Evaluation of a Pipelined Distributed Information Retrieval Architecture , 2007 .

[7]  Christos Kozyrakis,et al.  Full-System Power Analysis and Modeling for Server Environments , 2006 .

[8]  Kathryn S. McKinley,et al.  Evaluating the performance of distributed architectures for information retrieval using a variety of workloads , 2000, TOIS.

[9]  Craig MacDonald,et al.  A self-adapting latency/power tradeoff model for replicated search engines , 2014, WSDM.

[10]  Stephen E. Robertson,et al.  Parallel search using partitioned inverted files , 2000, Proceedings Seventh International Symposium on String Processing and Information Retrieval. SPIRE 2000.

[11]  Yixin Diao,et al.  Using MIMO feedback control to enforce policies for interrelated metrics with application to the Apache Web server , 2002, NOMS 2002. IEEE/IFIP Network Operations and Management Symposium.

[12]  Craig MacDonald,et al.  Learning to predict response times for online query scheduling , 2012, SIGIR '12.

[13]  Ann L. Chervenak,et al.  Performance Measurements of the First RAID Prototype , 1990 .

[14]  Howard R. Turtle,et al.  Query Evaluation: Strategies and Optimizations , 1995, Inf. Process. Manag..

[15]  Dimitri P. Bertsekas,et al.  Dynamic Programming and Optimal Control, Two Volume Set , 1995 .

[16]  Alistair Moffat,et al.  Self-indexing inverted files for fast text retrieval , 1996, TOIS.

[17]  Stephen E. Robertson,et al.  Okapi at TREC-3 , 1994, TREC.

[18]  Michael Persin,et al.  Document filtering for fast ranking , 1994, SIGIR '94.

[19]  Joseph Y-T. Leung (ed.),et al.  Handbook of scheduling , 2013 .

[20]  N. Ziviani,et al.  Distributed query processing using partitioned inverted files , 2001, Proceedings Eighth Symposium on String Processing and Information Retrieval.

[21]  Craig MacDonald,et al.  Upper-bound approximations for dynamic pruning , 2011, TOIS.

[22]  Fabrizio Silvestri,et al.  Prefetching query results and its impact on search engines , 2012, SIGIR '12.

[23]  U. Narayan Bhat,et al.  An Introduction to Queueing Theory , 2008 .

[24]  Lidan Wang,et al.  Learning to efficiently rank , 2010, SIGIR.

[25]  Iadh Ounis,et al.  Query efficiency prediction for dynamic pruning , 2011, LSDS-IR '11.

[26]  Hilary Hutchinson,et al.  User Preference and Search Engine Latency , 2008 .

[27]  Hinrich Schütze,et al.  Introduction to information retrieval , 2008 .

[28]  Berkant Barla Cambazoglu,et al.  Early exit optimizations for additive machine learned ranking systems , 2010, WSDM '10.

[29]  Ricardo A. Baeza-Yates,et al.  Analyzing imbalance among homogeneous index servers in a web search system , 2007, Inf. Process. Manag..

[30]  Iadh Ounis,et al.  A case study of distributed information retrieval architectures to index one terabyte of text , 2005, Inf. Process. Manag..

[31]  Veronica Gil Costa,et al.  New caching techniques for web search engines , 2010, HPDC '10.

[32]  Xue Liu,et al.  Challenges Towards Elastic Power Management in Internet Data Centers , 2009, 2009 29th IEEE International Conference on Distributed Computing Systems Workshops.

[33]  Gobinda G. Chowdhury,et al.  An agenda for green information retrieval research , 2012, Inf. Process. Manag..

[34]  Iadh Ounis,et al.  Performance analysis of distributed information retrieval architectures using an improved network simulation model , 2007, Inf. Process. Manag..

[35]  Giorgio Gambosi,et al.  FUB, IASI-CNR and University of Tor Vergata at TREC 2008 Blog Track , 2008, TREC.

[36]  Joseph L. Hellerstein,et al.  Using Control Theory to Achieve Service Level Objectives In Performance Management , 2002, Real-Time Systems.

[37]  Luiz André Barroso,et al.  Web Search for a Planet: The Google Cluster Architecture , 2003, IEEE Micro.

[38]  Charles L. A. Clarke,et al.  Information Retrieval - Implementing and Evaluating Search Engines , 2010 .

[39]  Abdur Chowdhury,et al.  Operational requirements for scalable search systems , 2003, CIKM '03.

[40]  Niraj Tolia,et al.  Opportunities and challenges to unify workload, power, and cooling management in data centers , 2010, OPSR.

[41]  Alistair Moffat,et al.  Load balancing for term-distributed parallel retrieval , 2006, SIGIR.

[42]  Tim Berners-Lee,et al.  The World-Wide Web , 1992, CACM.

[43]  Alistair Moffat,et al.  Pruned query evaluation using pre-computed impacts , 2006, SIGIR.

[44]  Fabrizio Silvestri,et al.  Mining Query Logs: Turning Search Usage Data into Knowledge , 2010, Found. Trends Inf. Retr..

[45]  Craig MacDonald,et al.  Load-sensitive selective pruning for distributed search , 2013, CIKM.

[46]  Moreno Marzolla,et al.  libcppsim: A Simula-like, Portable Process-Oriented Simulation Library in C++ , 2004 .

[47]  Craig MacDonald,et al.  Scheduling queries across replicas , 2012, SIGIR '12.

[48]  Craig MacDonald,et al.  The voting model for people search , 2009, SIGIR Forum.

[49]  Hector Garcia-Molina,et al.  Performance of Inverted Indices in Distributed Text Document Retrieval Systems , 1993 .

[50]  Iadh Ounis,et al.  Inferring Query Performance Using Pre-retrieval Predictors , 2004, SPIRE.

[51]  Craig MacDonald,et al.  Hybrid Query Scheduling for a Replicated Search Engine , 2013, ECIR.

[52]  Jeffrey Dean,et al.  Challenges in building large-scale information retrieval systems: invited talk , 2009, WSDM '09.

[53]  Marcus Fontoura,et al.  Evaluation strategies for top-k queries over memory-resident inverted indexes , 2011, Proc. VLDB Endow..

[54]  D. Kendall.  Stochastic Processes Occurring in the Theory of Queues and their Analysis by the Method of the Imbedded Markov Chain , 1953 .

[55]  Eugene Kharitonov,et al.  Incorporating Efficiency in Evaluation , 2013 .

[56]  Forbes J. Burkowski.  Retrieval performance of a distributed text database utilizing a parallel processor document server , 1990, DPDS '90.

[57]  Iadh Ounis,et al.  Performance Comparison of Clustered and Replicated Information Retrieval Systems , 2007, ECIR.

[58]  Berthier A. Ribeiro-Neto,et al.  Query performance for tightly coupled distributed digital libraries , 1998, DL '98.

[59]  Salim Hariri,et al.  Autonomic power and performance management for computing systems , 2006, 2006 IEEE International Conference on Autonomic Computing.

[60]  Jie Lu,et al.  Content-based retrieval in hybrid peer-to-peer networks , 2003, CIKM '03.

[61]  Victor Carneiro,et al.  Performance Evaluation of Large-scale Information Retrieval Systems Scaling Down , 2010, LSDS-IR@SIGIR.

[62]  Mauricio Marín,et al.  High-performance distributed inverted files , 2007, CIKM '07.

[63]  Subhajyoti Bandyopadhyay,et al.  Cloud computing - The business perspective , 2011, Decis. Support Syst..

[64]  Cristina V. Lopes,et al.  Bagging gradient-boosted trees for high precision, low variance ranking models , 2011, SIGIR.

[65]  H. Lutfiyya,et al.  Dynamic Provisioning of Resources in Data Centers , 2007, Third International Conference on Autonomic and Autonomous Systems (ICAS'07).

[66]  Christoforos E. Kozyrakis,et al.  Automatic power management schemes for Internet servers and data centers , 2005, GLOBECOM '05. IEEE Global Telecommunications Conference.

[67]  Roi Blanco,et al.  Energy-price-driven query processing in multi-center web search engines , 2011, SIGIR '11.

[68]  Craig MacDonald,et al.  Query Processing in Highly-Loaded Search Engines , 2013, SPIRE.

[69]  Edsger W. Dijkstra,et al.  A note on two problems in connexion with graphs , 1959, Numerische Mathematik.

[70]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[71]  Craig MacDonald,et al.  From Puppy to Maturity: Experiences in Developing Terrier , 2012, OSIR@SIGIR.

[72]  Byeong-Soo Jeong,et al.  Inverted File Partitioning Schemes in Multiple Disk Systems , 1995, IEEE Trans. Parallel Distributed Syst..

[73]  Susan T. Dumais,et al.  Modeling and predicting behavioral dynamics on the web , 2012, WWW.

[74]  Robert B. Cooper,et al.  Queueing Theory , 2014, Encyclopedia of Social Network Analysis and Mining.

[75]  Luiz André Barroso,et al.  The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009 .

[76]  Özgür Ulusoy,et al.  A financial cost metric for result caching , 2013, SIGIR.

[77]  Alistair Moffat,et al.  A pipelined architecture for distributed text query evaluation , 2007, Information Retrieval.

[78]  Iadh Ounis,et al.  Query performance prediction , 2006, Inf. Syst..