A vector model for routing queries in web search engines

Abstract: This paper proposes a method for reducing the number of search nodes involved in the solution of queries arriving at a Web search engine. The query receptionist machine applies the method during sudden peaks in query traffic to reduce the load on the search nodes. An experimental evaluation, based on actual traces from users of a major search engine, shows that the proposed method outperforms alternative strategies. The advantage is more evident for systems composed of a large number of search nodes, which indicates that the method is also more scalable than the alternative strategies.
