Disk caching in large database and timeshared systems

We present the results of trace-driven simulations of disk cache designs, using traces from several mainframe timesharing and database systems in production use. We compute miss ratios, run lengths, traffic ratios, cache residency times, degree of memory pollution, and other statistics for a range of designs, varying block size, prefetching algorithm, and write algorithm. We find that for this workload, sequential prefetching produces a significant (about 20%) but still limited improvement in the miss ratio, even using a powerful technique for detecting sequentiality. Copy-back writing decreased write traffic relative to write-through by more than 50%; periodic flushing of the dirty blocks increased write traffic only slightly compared to pure write-back, and then only for large cache sizes. Write-allocate had little effect compared to no-write-allocate. Block sizes larger than a track do not appear to be useful. Limiting cache occupancy by a single process or transaction appears to have little effect. This study is unique in the variety and quality of the trace data it uses.
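
To make the measured quantities concrete, the following is a minimal sketch of a trace-driven cache simulation. It is illustrative only and is not the simulator used in the study: it assumes LRU replacement, a fixed block size, and a hypothetical trace format of (op, block) pairs, where op is 'R' or 'W'. The function name `simulate` and its parameters are likewise assumptions made for this example. It reports the miss ratio and the number of block writes sent to disk under write-through or copy-back, with or without write-allocate.

```python
from collections import OrderedDict


def simulate(trace, capacity, write_back=True, write_allocate=True):
    """Run an LRU block cache over a non-empty trace of (op, block) pairs.

    Hypothetical trace format: op is 'R' (read) or 'W' (write).
    Returns (miss_ratio, disk_writes).
    """
    cache = OrderedDict()  # block -> dirty flag, least recently used first
    misses = 0
    disk_writes = 0
    for op, block in trace:
        hit = block in cache
        if not hit:
            misses += 1
        if op == 'W' and not write_back:
            disk_writes += 1  # write-through: every write goes to disk
        if hit:
            cache.move_to_end(block)  # mark as most recently used
            if op == 'W' and write_back:
                cache[block] = True  # copy-back: only mark the block dirty
        elif op == 'R' or write_allocate:
            if len(cache) >= capacity:
                _, dirty = cache.popitem(last=False)  # evict the LRU block
                if dirty:
                    disk_writes += 1  # a dirty block is written on eviction
            cache[block] = (op == 'W' and write_back)
        elif write_back:
            disk_writes += 1  # write miss without allocation bypasses the cache
    # Write out whatever is still dirty at the end of the trace.
    disk_writes += sum(1 for dirty in cache.values() if dirty)
    return misses / len(trace), disk_writes


# Toy trace (made up for illustration): repeated writes to one block.
trace = [('W', 1), ('W', 1), ('W', 1), ('R', 2), ('W', 1)]
copy_back = simulate(trace, capacity=2, write_back=True)
write_through = simulate(trace, capacity=2, write_back=False)
print(copy_back, write_through)
```

On this toy trace both policies see the same miss ratio, but copy-back writes the repeatedly updated block to disk once while write-through writes it four times. This is the kind of write-traffic reduction the abstract describes, although the toy numbers say nothing about the magnitudes measured on real traces.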
