The Performance Impact of Kernel Prefetching on Buffer Cache Replacement Algorithms

A fundamental challenge in improving file system performance is to design effective block replacement algorithms to minimize buffer cache misses. Despite the well-known interactions between prefetching and caching, almost all buffer cache replacement algorithms have been proposed and studied comparatively without taking into account file system prefetching, which exists in all modern operating systems. This paper shows that such kernel prefetching can have a significant impact on the relative performance, in terms of the number of actual disk I/Os, of many well-known replacement algorithms; it can not only narrow the performance gap but also change the relative performance benefits of different algorithms. Moreover, since prefetching can increase the number of blocks clustered for each disk I/O and, hence, the time to complete the I/O, the reduction in the number of disk I/Os may not translate into a proportional reduction in the total I/O time. These results demonstrate the importance of buffer caching research taking file system prefetching into consideration and comparing the actual disk I/Os and the execution time under different replacement algorithms.
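The point about disk I/O counts and total I/O time can be illustrated with a toy simulation. The sketch below is not the paper's simulator and does not model Linux readahead; it assumes a plain LRU cache, a hypothetical fixed readahead window (`readahead`), and a hypothetical linear cost model in which each disk I/O pays a fixed positioning cost plus a per-block transfer cost (`seek_ms` and `transfer_ms` are made-up values).

```python
from collections import OrderedDict


def simulate_lru(trace, cache_size, readahead=0):
    """Replay a block trace against an LRU cache.

    Returns (demand_misses, disk_ios, blocks_transferred).
    `readahead` is a hypothetical fixed window: on a miss for block b,
    blocks b+1 .. b+readahead are fetched in the same disk I/O.
    """
    cache = OrderedDict()  # keys are block numbers, ordered LRU -> MRU
    misses = ios = blocks = 0
    for b in trace:
        if b in cache:
            cache.move_to_end(b)  # hit: mark as most recently used
            continue
        misses += 1
        ios += 1  # one clustered disk request per demand miss
        for blk in range(b, b + readahead + 1):
            if blk not in cache:
                blocks += 1  # only blocks not yet cached are transferred
                cache[blk] = True
            cache.move_to_end(blk)
        while len(cache) > cache_size:
            cache.popitem(last=False)  # evict the least recently used block
    return misses, ios, blocks


def io_time_ms(ios, blocks, seek_ms=5.0, transfer_ms=0.5):
    """Assumed cost model: fixed cost per I/O plus per-block transfer cost."""
    return ios * seek_ms + blocks * transfer_ms


trace = list(range(16))  # a purely sequential scan of 16 blocks
_, ios_plain, blocks_plain = simulate_lru(trace, cache_size=8)
_, ios_pref, blocks_pref = simulate_lru(trace, cache_size=8, readahead=3)
time_plain = io_time_ms(ios_plain, blocks_plain)
time_pref = io_time_ms(ios_pref, blocks_pref)
```

Under these assumptions the sequential scan needs 16 disk I/Os without readahead and 4 with it, a fourfold reduction, while the modeled I/O time drops from 88 ms to 28 ms, which is closer to a threefold reduction, because each clustered I/O transfers more blocks.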
