Latency-aware strategy for static list caching in flash-based web search engines

Caching is widely used to improve the performance of search engines. Based on the observation that the speed gap between random and sequential access is much smaller on a flash-based solid state drive (SSD) than on a magnetic hard disk drive (HDD), we introduce a new static list caching algorithm that takes block-level access latency into account. Experimental results show that the proposed policy reduces the average disk access latency per query by up to 14% over state-of-the-art algorithms on SSD-based infrastructure. The results also show that the new strategy outperforms other existing algorithms even on HDD-based architectures.
