A comparison of two approaches to build reliable distributed file servers

Several existing distributed file systems provide reliability by server replication. An alternative approach is to use dual-ported disks accessible to both a server and a backup. The two approaches are compared by examining an example of each. Deceit is a replicated file server that emphasizes flexibility. HA-NFS, an example of the second approach, emphasizes efficiency and simplicity. The two file servers run on the same hardware and implement Sun's NFS protocol. The comparison shows that replicated servers are more flexible and tolerate a wider variety of faults. On the other hand, the dual-ported disk approach is more efficient and simpler to implement. When tolerating a single failure, dual-ported disks also give somewhat better availability.
