A status report on research in transparent informed prefetching

This paper focuses on extending the power of caching and prefetching to reduce file read latencies by exploiting application level hints about future I/O accesses. We argue that systems that disclose high-level knowledge can transfer optimization information across module boundaries in a manner consistent with sound software engineering principles. Such Transparent Informed Prefetching (TIP) systems provide a technique for converting the high throughput of new technologies such as disk arrays and log-structured file systems into low latency for applications. Our preliminary experiments show that even without a high-throughput I/O subsystem TIP yields reduced execution time of up to 30% for applications obtaining data from a remote file server and up to 13% for applications obtaining data from a single local disk. These experiments indicate that greater performance benefits will be available when TIP is integrated with low level resource management policies and highly parallel I/O subsystems such as disk arrays.

[1]  A. L. Narasimha Reddy,et al.  An Evaluation of Multiple-Disk I/O Systems , 1989, IEEE Trans. Computers.

[2]  K. Korner,et al.  Intelligent caching for remote file service , 1990, Proceedings.,10th International Conference on Distributed Computing Systems.

[3]  Mahadev Satyanarayanan,et al.  Coda: A Highly Available File System for a Distributed Workstation Environment , 1990, IEEE Trans. Computers.

[4]  Carla Schlatter Ellis,et al.  Practical prefetching techniques for parallel file systems , 1991, [1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems.

[5]  Michelle Y. Kim,et al.  Synchronized Disk Interleaving , 1986, IEEE Transactions on Computers.

[6]  Samuel J. Leffler,et al.  A Fast File System for UNIX (Revised July 27, 1983) , 1983 .

[7]  Randy H. Katz,et al.  Input/output behavior of supercomputing applications , 1991, Proceedings of the 1991 ACM/IEEE Conference on Supercomputing (Supercomputing '91).

[8]  Vincent Cate,et al.  Alex - a Global Filesystem , 1992 .

[9]  Margo I. Seltzer,et al.  Disk Scheduling Revisited , 1990 .

[10]  John A. Kunze,et al.  A trace-driven analysis of the UNIX 4.2 BSD file system , 1985, SOSP '85.

[11]  Christos Faloutsos,et al.  Flexible buffer allocation based on marginal gains , 1991, SIGMOD '91.

[12]  Mendel Rosenblum,et al.  The design and implementation of a log-structured file system , 1991, SOSP '91.

[13]  Michael N. Nelson,et al.  Caching in the Sprite network file system , 1988, TOCS.

[14]  Miron Livny,et al.  Multi-disk management algorithms , 1987, SIGMETRICS '87.

[15]  Michael Stonebraker,et al.  Operating system support for database management , 1981, CACM.

[16]  Andrew S. Grimshaw,et al.  ELFS: object-oriented extensible file systems , 1991, [1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems.

[17]  Philip S. Yu,et al.  Integration of Buffer Management and Query Optimization in Relational Database Environment , 1989, VLDB.

[18]  K OusterhoutJohn,et al.  Caching in the Sprite network file system , 1988 .

[19]  Mahadev Satyanarayanan,et al.  The ITC distributed file system: principles and design , 1985, SOSP 1985.

[20]  Giovanni Maria Sacco,et al.  A Mechanism for Managing the Buffer Pool in a Relational Database System Using the Hot Set Model , 1982, VLDB.

[21]  Hector Garcia-Molina,et al.  Disk striping , 1986, 1986 IEEE Second International Conference on Data Engineering.

[22]  Mahadev Satyanarayanan,et al.  Disconnected Operation in the Coda File System , 1999, Mobidata.

[23]  Dan Duchamp,et al.  Detection and exploitation of file working sets , 1991, [1991] Proceedings. 11th International Conference on Distributed Computing Systems.

[24]  Randy H. Katz,et al.  A case for redundant arrays of inexpensive disks (RAID) , 1988, SIGMOD '88.

[25]  A. J. Smith,et al.  Disk cache - Miss ratio analysis and design considerations , 1983, Perform. Evaluation.

[26]  G. Amdhal,et al.  Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).

[27]  Mahadev Satyanarayanan,et al.  Disconnected operation in the Coda File System , 1992, TOCS.

[28]  Fred Douglis,et al.  Beating the I/O bottleneck: a case for log-structured file systems , 1989, OPSR.

[29]  Stanley B. Zdonik,et al.  Fido: A Cache That Learns to Fetch , 1991, VLDB.

[30]  Garth A. Gibson Redundant disk arrays: Reliable, parallel secondary storage. Ph.D. Thesis , 1990 .

[31]  E. I. Organick,et al.  The Multics Input/Output system , 1971, SOSP '71.

[32]  Patricia G. Selinger,et al.  Access path selection in a relational database management system , 1979, SIGMOD '79.

[33]  Mary Baker,et al.  Measurements of a distributed file system , 1991, SOSP '91.