A Replacement Algorithm Designed for the Web Search Engine and Its Application in Storage Cache

With the growing popularity of search engines on the WWW, backend storage systems must deliver better physical I/O performance to speed up the query service perceived by end users. However, existing general-purpose replacement algorithms do not perform well for web search applications. This paper first studies the access patterns of various real-life web search workloads and then proposes a new replacement algorithm, RED-LRU, based on the observed access properties. The simulation results show that the proposed algorithm uniformly outperforms the other replacement algorithms across all workloads and cache sizes. To validate the simulation results, we integrate RED-LRU into a real storage cache, DPCache. The experimental results on the real system confirm the effectiveness of the proposed algorithm in improving caching performance for web search applications. Moreover, the runtime overhead of RED-LRU is fairly low in practice.
