A coherent distributed file cache with directory write-behind

Extensive caching is a key feature of the Echo distributed file system. Echo client machines maintain coherent caches of file and directory data and properties, with write-behind (delayed write-back) of all cached information. Echo specifies ordering constraints on this write-behind, enabling applications to store and maintain consistent data structures in the file system even when crashes or network faults prevent some writes from being completed. In this paper we describe the Echo cache's coherence and ordering semantics, show how they can improve the performance and consistency of applications, and explain how they are implemented. We also discuss the general problem of reliably notifying applications and users when write-behind is lost; we addressed this problem as part of the Echo design, but did not find a fully satisfactory solution.
