This paper describes a simple distributed mechanism for caching files among a networked collection of workstations. We have implemented it as part of Sprite, a new operating system under development at the University of California at Berkeley. A preliminary version of Sprite is currently running on Sun-2 and Sun-3 workstations, which have about 1-2 MIPS of processing power and 4-16 Mbytes of main memory. The system is targeted at workstations like these and at newer models likely to become available in the near future; we expect these future machines to have at least five to ten times the processing power and main memory of our current machines, as well as small degrees of multiprocessing. We hope that Sprite will be suitable for networks of up to a few hundred such workstations. For economic and environmental reasons, most workstations will not have local disks; instead, large fast disks will be concentrated on a few server machines.
In Sprite, file information is cached in the main memories of both servers (workstations with disks) and clients (workstations that access files on non-local disks). On servers, the caches reduce disk-related delays and contention. On clients, the caches also reduce the communication delays that would otherwise be incurred in fetching blocks from servers. In addition, client caches reduce contention for the network and for the server machines. Since server CPUs appear to be the bottleneck in several existing network file systems [SATY85, LAZO86], client caching offers the possibility of greater system scalability as well as increased performance.
Sprite uses the file servers as centralized control points for cache consistency. Each server guarantees cache consistency for all the files on its disks, and clients deal only with the server for a file: there are no direct client-client interactions. The Sprite algorithm depends on the fact that the server is notified whenever one of its files is opened or closed, so it can detect when concurrent write-sharing is about to occur.
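As a rough illustration of this bookkeeping, the following C sketch (with hypothetical structure and function names, not Sprite's actual kernel code) shows how a server could count the readers and writers of a file at open and close time and disable client caching as soon as concurrent write-sharing appears:

/*
 * Minimal sketch of server-side open/close bookkeeping.  The names and
 * structures here are invented for illustration; they are not Sprite's.
 */
#include <stdbool.h>
#include <stdio.h>

struct FileState {
    int  readers;     /* clients currently holding the file open for reading */
    int  writers;     /* clients currently holding the file open for writing */
    bool cacheable;   /* false once concurrent write-sharing is detected */
};

/* Called by the server whenever a client opens the file. */
static void server_open(struct FileState *f, bool for_writing)
{
    if (for_writing)
        f->writers++;
    else
        f->readers++;

    /* Concurrent write-sharing: a writer plus any other active client.
     * The server reacts by disabling client caching for this file. */
    if (f->writers > 0 && f->readers + f->writers > 1)
        f->cacheable = false;
}

/* Called by the server whenever a client closes the file. */
static void server_close(struct FileState *f, bool for_writing)
{
    if (for_writing)
        f->writers--;
    else
        f->readers--;

    /* Once no client has the file open, caching can be permitted again
     * on the next open. */
    if (f->readers == 0 && f->writers == 0)
        f->cacheable = true;
}

int main(void)
{
    struct FileState f = { 0, 0, true };

    server_open(&f, false);   /* client A opens for reading */
    server_open(&f, true);    /* client B opens for writing: write-sharing */
    printf("cacheable after concurrent open: %s\n", f.cacheable ? "yes" : "no");

    server_close(&f, true);
    server_close(&f, false);
    printf("cacheable after both closes: %s\n", f.cacheable ? "yes" : "no");
    return 0;
}

The details of re-enabling caching are glossed over here; the point is simply that every open and close passes through the server, so the server always has the information it needs to make the decision.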
Sprite handles sequential write-sharing using version numbers. When a client opens a file, the server returns the current version number for the file, which the client compares to the version number associated with its cached blocks for the file. If they are different, the file must have been modified on some other workstation since the blocks were cached, so the client discards all of its cached blocks for the file and fetches fresh copies from the server as the blocks are needed. The delayed-write policy used by Sprite means that the server does not always have the current data for a file (the last writer need not have flushed its dirty blocks back to the server when it closed the file). Servers handle this situation by keeping track of the last writer for each file; when a client other than the last writer opens the file, the server forces the last writer to write all of its dirty blocks back to the server's cache. This guarantees that the server has up-to-date data for a file whenever a client needs it.
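The open-time sequence just described can be sketched in C as follows (again with hypothetical types and names; this illustrates the idea, not Sprite's implementation): the server recalls dirty blocks from the last writer if some other client is opening the file, increments the version number when the opener is a writer, and returns the current version; the client discards its cached blocks if that version does not match the one associated with its cache.

/*
 * Sketch of sequential write-sharing handled with version numbers and a
 * recorded "last writer".  Hypothetical structures, for illustration only.
 */
#include <stdio.h>

struct ServerFile {
    int version;       /* incremented each time the file is opened for writing */
    int last_writer;   /* client id of the last writer, or -1 if none */
};

struct ClientCache {
    int client_id;
    int cached_version;   /* version associated with this client's cached blocks */
    int cached_blocks;    /* how many blocks of the file are currently cached */
};

/* Server side: handle an open request from client `client`. */
static int server_handle_open(struct ServerFile *f, int client, int for_writing)
{
    /* Under delayed writes the last writer may still hold dirty blocks;
     * force it to write them back before this open completes. */
    if (f->last_writer != -1 && f->last_writer != client)
        printf("server: recall dirty blocks from client %d\n", f->last_writer);

    if (for_writing) {
        f->version++;              /* new version for the new writer */
        f->last_writer = client;
    }
    return f->version;             /* returned to the client in the open reply */
}

/* Client side: compare the returned version with the cached one. */
static void client_handle_open_reply(struct ClientCache *c, int server_version)
{
    if (c->cached_version != server_version) {
        /* Some other workstation modified the file: the cached blocks are
         * stale, so discard them and refill the cache on demand. */
        c->cached_blocks = 0;
        c->cached_version = server_version;
    }
}

int main(void)
{
    /* Client 1 wrote the file earlier and has not yet flushed it (delayed
     * write): the server is at version 8 with client 1 as last writer, and
     * client 2 still caches blocks tagged with the old version 7. */
    struct ServerFile  f = { 8, 1 };
    struct ClientCache b = { 2, 7, 3 };

    client_handle_open_reply(&b, server_handle_open(&f, b.client_id, 0));

    printf("client 2 cached blocks after open: %d (now at version %d)\n",
           b.cached_blocks, b.cached_version);
    return 0;
}

Run on the state shown, the sketch recalls client 1's dirty blocks and leaves client 2 with an empty cache tagged with version 8, which is the behavior the paragraph above requires.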
The file system module and the virtual memory module each manage a separate pool of physical memory pages. Virtual memory keeps its pages in approximate LRU order through a version of the clock algorithm [NELS86]. The file system keeps its cache blocks in perfect LRU order, since all block accesses are through the “read” and “write” system calls. Each module keeps a time of last access for each page or block. Whenever either module needs additional memory (because of a page fault or a miss in the file cache), it compares the age of its oldest page with the age of the oldest page from the other module. If the other module has the oldest page, then it is forced to give up that page; otherwise the requesting module recycles its own oldest page.
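Assuming each pool can report the time of last access of its oldest page, the negotiation might be sketched like this in C (the pool structure and function names are invented for illustration):

/*
 * Sketch of the memory negotiation between the file cache and virtual
 * memory: the requesting module takes the globally oldest page, whether
 * it belongs to itself or to the other module.  Illustrative only.
 */
#include <stdio.h>
#include <time.h>

struct Pool {
    const char *name;
    time_t oldest_access;   /* time of last access of this pool's oldest page */
};

/* Return the pool that should give up a page to satisfy a request from
 * `requester`: the other pool only if its oldest page is older. */
static struct Pool *choose_victim(struct Pool *requester, struct Pool *other)
{
    return (other->oldest_access < requester->oldest_access) ? other : requester;
}

int main(void)
{
    time_t now = time(NULL);
    struct Pool vm = { "virtual memory", now - 120 };   /* oldest VM page: 2 minutes */
    struct Pool fs = { "file cache",     now - 300 };   /* oldest cache block: 5 minutes */

    /* The virtual memory module takes a page fault and needs a page. */
    printf("page fault: take a page from the %s pool\n", choose_victim(&vm, &fs)->name);

    /* The file cache misses and needs a block. */
    printf("cache miss: take a page from the %s pool\n", choose_victim(&fs, &vm)->name);
    return 0;
}

With the ages shown, a page fault steals a page from the file cache, whose oldest block is older, while a cache miss makes the file cache recycle its own oldest block.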
We used a collection of benchmark programs to measure the performance of the Sprite file system. Client caching sped up the benchmarks by about 10-40% for programs running on diskless workstations, relative to diskless workstations without client caches. With client caching enabled, diskless workstations completed the benchmarks only 0-12% more slowly than workstations with disks. Client caches also reduced server utilization from about 5-18% per active client to only about 1-9% per active client. Since typical users are active only a small fraction of the time, our measurements suggest that a single server should be able to support at least 50 clients.
We also compared the performance of Sprite to both the Andrew file system [SATY85] and Sun's Network File System (NFS) [SAND85], by running the Andrew file system benchmark [HOWA87] concurrently on multiple Sprite clients and comparing our results to those reported in [HOWA87] for NFS and Andrew. For a single client, Sprite was about 30% faster than NFS and about 35% faster than Andrew. As the number of concurrent clients increased, the NFS server quickly saturated. The Andrew system showed the greatest scalability: each client accounted for only about 2.4% of server CPU utilization, versus 5.4% in Sprite and over 20% in NFS.
[1] Bruce J. Walker et al. The LOCUS Distributed System Architecture. 1986.
[2] Andrew Birrell et al. Implementing remote procedure calls. TOCS, 1984.
[3] Steve R. Kleiman et al. Vnodes: An Architecture for Multiple File System Types in Sun UNIX. USENIX Summer, 1986.
[4] Brent B. Welch et al. The Sprite Remote Procedure Call System. 1986.
[5] David K. Gifford et al. A caching file system for a programmer's workstation. SOSP '85, 1985.
[6] Mahadev Satyanarayanan et al. Scale and performance in a distributed file system. SOSP '87, 1987.
[7] K. Thompson. UNIX time-sharing system: UNIX implementation. The Bell System Technical Journal, 1978.
[8] Dan Walsh et al. Design and implementation of the Sun network filesystem. USENIX Conference Proceedings, 1985.
[9] Michael N. Nelson. Virtual Memory for the Sprite Operating System. 1986.
[10] John K. Ousterhout et al. Prefix Tables: A Simple Mechanism for Locating Files in a Distributed System. ICDCS, 1985.
[11] James R. Larus et al. Design Decisions in SPUR. Computer, 1986.
[12] Mahadev Satyanarayanan et al. The ITC distributed file system: principles and design. SOSP '85, 1985.
[13] Ken Thompson et al. The UNIX time-sharing system. CACM, 1974.
[14] Paul J. Leach et al. The Architecture of an Integrated Local Network. IEEE J. Sel. Areas Commun., 1983.
[15] Willy Zwaenepoel et al. File access performance of diskless workstations. TOCS, 1986.
[16] John Kunze et al. A trace-driven analysis of the UNIX 4.2 BSD file system. SOSP '85, 1985.
[17] Mahadev Satyanarayanan et al. Scale and performance in a distributed file system. TOCS, 1988.
[18] Dennis M. Ritchie. The UNIX Time-sharing System: A Retrospective. 1977.
[19] Samuel J. Leffler et al. Measuring and Improving the Performance of 4.2BSD. 1984.