Practical prefetching via data compression

An important issue affecting response-time performance in current OODB and hypertext systems is the I/O involved in moving objects from slow memory to cache. A promising way to tackle this problem is prefetching, in which we predict the user's next page requests and bring those pages into cache in the background. Current databases perform limited prefetching using techniques derived from older virtual memory systems. A novel idea of using data compression techniques for prefetching was recently advocated in [KrV, ViK], in which prefetchers based on the Lempel-Ziv data compressor (the UNIX compress command) were shown theoretically to be optimal in the limit. In this paper we analyze the practical aspects of using data compression techniques for prefetching. We adapt three well-known data compressors to obtain three simple, deterministic, and universal prefetchers. We simulate our prefetchers on sequences of page accesses derived from the OO1 and OO7 benchmarks and from CAD applications, and demonstrate significant reductions in fault rate. We examine the important issues of cache replacement, the size of the data structure used by the prefetcher, and problems arising from bursts of “fast” page requests (which leave virtually no time between adjacent requests for prefetching and bookkeeping). We conclude that prediction for prefetching based on data compression techniques holds great promise.
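To make the idea of a compression-based prefetcher concrete, the following is a minimal sketch of an LZ78-style predictor. It is an illustration under stated assumptions, not the prefetchers evaluated in the paper: the class name `LZPrefetcher`, the helper `simulate`, and the simplification that the "cache" holds only the pages predicted just before each request are all hypothetical choices made for this example. The predictor parses the page-access sequence into phrases the way LZ78 does, keeps a visit count on each trie node, and predicts the most frequently seen successors of the current trie position.

```python
class LZPrefetcher:
    """Toy LZ78-style predictor: a trie of page-access phrases with visit counts."""

    def __init__(self):
        # Each trie level is a dict: page -> [visit_count, children_dict].
        self.root = {}
        self.node = self.root  # children of the current trie position

    def predict(self, k=1):
        """Return up to k pages seen most often after the current position."""
        ranked = sorted(self.node.items(), key=lambda kv: -kv[1][0])
        return [page for page, _ in ranked[:k]]

    def access(self, page):
        """Record an actual page access and advance the LZ78 parse."""
        entry = self.node.get(page)
        if entry is None:
            # The current phrase ends here: add a leaf and restart at the root.
            self.node[page] = [1, {}]
            self.node = self.root
        else:
            entry[0] += 1
            self.node = entry[1]


def simulate(trace, k=1):
    """Count requests that were not among the k pages predicted beforehand.

    Assumption for illustration only: no cache beyond the prefetched pages.
    """
    prefetcher = LZPrefetcher()
    faults = 0
    for page in trace:
        if page not in prefetcher.predict(k):
            faults += 1
        prefetcher.access(page)
    return faults


if __name__ == "__main__":
    trace = [1, 2, 3] * 50
    print("faults:", simulate(trace, k=1), "of", len(trace))
```

A realistic evaluation would also need a cache replacement policy and a bound on the size of the trie, both of which the abstract identifies as important practical issues and which this sketch deliberately omits.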

[1]  David J. DeWitt,et al.  The OO7 Benchmark , 1993, SIGMOD '93.

[2]  Andreas Reuter,et al.  Transaction Processing: Concepts and Techniques , 1992 .

[3]  Ian H. Witten,et al.  Text Compression , 1990, 125 Problems in Text Algorithms.

[4]  Amos Fiat,et al.  Competitive Paging Algorithms , 1991, J. Algorithms.

[5]  James T. Brady,et al.  A Theory of Productivity in the Creative Process , 1986, IEEE Computer Graphics and Applications.

[6]  Anne Rogers,et al.  Software support for speculative loads , 1992, ASPLOS V.

[7]  Michael Stonebraker Proceedings of the 1983 ACM SIGMOD international conference on Management of data , 1983, SIGMOD 1992.

[8]  Gaetano Borriello,et al.  Practical dictionary management for hardware data compression , 1992, CACM.

[9]  Anoop Gupta,et al.  Design and evaluation of a compiler algorithm for prefetching , 1992, ASPLOS V.

[10]  Laszlo A. Belady,et al.  A Study of Replacement Algorithms for a Virtual-Storage Computer , 1966, IBM Syst. J.

[11]  Stanley B. Zdonik,et al.  Predictive Caching , 1990 .

[12]  Jean-Loup Baer,et al.  Reducing memory latency via non-blocking and prefetching caches , 1992, ASPLOS V.

[13]  Abraham Lempel,et al.  Compression of individual sequences via variable-rate coding , 1978, IEEE Trans. Inf. Theory.

[14]  Jeffrey F. Naughton,et al.  On the performance of object clustering techniques , 1992, SIGMOD '92.

[15]  Jeffrey Scott Vitter,et al.  Analysis of arithmetic coding for data compression , 1991, [1991] Proceedings. Data Compression Conference.

[16]  Stanley B. Zdonik,et al.  Fido: A Cache That Learns to Fetch , 1991, VLDB.

[17]  Carla Schlatter Ellis,et al.  Prefetching in File Systems for MIMD Multiprocessors , 1990, IEEE Trans. Parallel Distributed Syst.

[18]  R. G. G. Cattell,et al.  Object operations benchmark , 1992, TODS.

[19]  James A. Storer,et al.  Data Compression: Methods and Theory , 1987 .

[20]  P. Krishnan,et al.  Optimal prefetching via data compression , 1991, [1991] Proceedings 32nd Annual Symposium of Foundations of Computer Science.

[21]  Ian H. Witten,et al.  Arithmetic coding for data compression , 1987, CACM.

[22]  Glen G. Langdon,et al.  An Introduction to Arithmetic Coding , 1984, IBM J. Res. Dev..