Exploiting the non-determinism and asynchrony of set iterators to reduce aggregate file I/O latency

A key goal of distributed systems is to provide prompt access to shared information repositories. The high latency of remote access is a serious impediment to this goal. This paper describes a new file system abstraction called dynamic sets: unordered collections created by an application to hold the files it intends to process. Applications that iterate on the set to access its members allow the system to reduce the aggregate I/O latency by exploiting the non-determinism and asynchrony inherent in the semantics of set iterators. This reduction in latency comes without relying on reference locality, without modifying DFS servers and protocols, and without unduly complicating the programming model. The paper presents the abstraction and describes an implementation that runs on local and distributed file systems, as well as the World Wide Web. Dynamic sets demonstrate substantial performance gains: up to 50% savings in runtime for search on NFS, and up to 90% reduction in I/O latency for Web searches.
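The idea of an iterator that may return set members in any order, while fetches proceed asynchronously, can be illustrated with a minimal Python sketch. This is not the paper's implementation or interface: `iterate_dynamic_set`, `slow_fetch`, and the simulated delays are hypothetical names and values chosen for illustration, and a thread pool stands in for whatever prefetching mechanism a real system would use.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def iterate_dynamic_set(names, fetch, max_workers=8):
    """Yield (name, data) pairs as fetches finish, not in listing order.

    Hypothetical helper: because the set is unordered, the iterator is
    free to hand back whichever member becomes available first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Start every fetch up front so their latencies overlap.
        futures = {pool.submit(fetch, name): name for name in names}
        for future in as_completed(futures):
            yield futures[future], future.result()


def slow_fetch(name):
    """Stand-in for a remote read; the delays are made-up values."""
    delays = {"a": 0.3, "b": 0.1, "c": 0.2}
    time.sleep(delays[name])
    return f"contents of {name}"


def demo():
    """Iterate over three members and report the order and elapsed time."""
    start = time.perf_counter()
    order = [name for name, _ in iterate_dynamic_set(["a", "b", "c"], slow_fetch)]
    elapsed = time.perf_counter() - start
    return order, elapsed


if __name__ == "__main__":
    order, elapsed = demo()
    print(order, round(elapsed, 2))
```

In this sketch the three simulated fetches overlap, so the total wait is close to the slowest single fetch rather than the sum of all three. A sequential loop over the same names would pay each delay in turn.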
