A quantitative analysis of cache policies for scalable network file systems

Current network file system protocols rely heavily on a central server to coordinate file activity among client workstations. This central server can become a bottleneck that limits scalability in environments with large numbers of clients. In central-server systems such as NFS and AFS, the server handles all client writes, cache misses, and coherence messages. Keeping up with this workload requires expensive server machines configured with high-performance CPUs, memory systems, and I/O channels. Because the server stores all data, it must also be physically capable of connecting to many disks. This reliance on a central server makes current systems ill-suited to wide-area network use, where the network bandwidth to the server may be limited. In this paper, we investigate the quantitative performance effect of moving as many of the server's responsibilities as possible to client workstations, reducing the need for high-performance server machines. We have devised a cache protocol in which all data reside on clients and all data transfers proceed directly from client to client. The server is used only to coordinate these data transfers. This protocol is being incorporated into our experimental file system, xFS. We present results from a trace-driven simulation study of the protocol using traces from a 237-client NFS installation. We find that the xFS protocol reduces server load by more than a factor of six compared to AFS without significantly affecting response time or file availability.
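The division of labor described above, in which the server only coordinates and data moves between clients, can be illustrated with a minimal sketch. This is not the xFS protocol itself: the class names, methods, and the simple invalidate-on-write rule are all hypothetical, chosen only to show a coordinator that handles small control messages while file data travels client to client. Disks, failures, and recovery are not modeled.

```python
class Coordinator:
    """Hypothetical server: tracks which clients hold a block, stores no data."""

    def __init__(self):
        self.holders = {}           # block id -> set of client ids with a valid copy
        self.control_messages = 0   # count of small coordination requests handled

    def request_read(self, block, client_id):
        """Return the id of a client holding the block and record the new copy."""
        self.control_messages += 1
        holders = self.holders.get(block)
        if not holders:
            return None
        source = next(iter(holders))   # any valid holder will do
        holders.add(client_id)
        return source

    def claim_write(self, block, client_id):
        """Make client_id the sole holder; return ids whose copies are now stale."""
        self.control_messages += 1
        stale = self.holders.get(block, set()) - {client_id}
        self.holders[block] = {client_id}
        return stale


class Client:
    """Hypothetical workstation that caches blocks and serves them to peers."""

    def __init__(self, client_id):
        self.client_id = client_id
        self.cache = {}             # block id -> bytes
        self.bytes_from_peers = 0   # data received directly from other clients

    def write(self, coord, peers, block, data):
        for peer_id in coord.claim_write(block, self.client_id):
            peers[peer_id].cache.pop(block, None)   # invalidate stale copies
        self.cache[block] = data

    def read(self, coord, peers, block):
        if block in self.cache:
            return self.cache[block]        # local hit: the server is not contacted
        source = coord.request_read(block, self.client_id)
        if source is None:
            raise KeyError(block)
        data = peers[source].cache[block]   # direct client-to-client transfer
        self.bytes_from_peers += len(data)
        self.cache[block] = data
        return data


coord = Coordinator()
peers = {0: Client(0), 1: Client(1)}
peers[0].write(coord, peers, "blk", b"hello")
first = peers[1].read(coord, peers, "blk")    # miss: fetched from client 0
second = peers[1].read(coord, peers, "blk")   # hit: no coordinator message
```

In this toy setup the coordinator sees one message per write and one per cache miss, and never touches file data. A trace-driven study like the one described would replay recorded file accesses through such a model and compare the resulting server load against a central-server baseline.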
