Paper information - TOFF-2: A high-performance fault-tolerant file service

TOFF-2: A high-performance fault-tolerant file service

Abstract: TOFF-2 is a high-performance fault-tolerant file service featuring a new Symmetric Primary Backup (SPB) replication model. This model lets all replicated servers in a service share the load of a traditional primary server and minimizes the communication overhead between servers. TOFF-2 is also fully transparent to client machines: any host with an NFS client implementation can use the fault-tolerant service provided by TOFF-2 without modification. Clients do not need to know anything about replication or server failures, since the TOFF-2 service cluster looks exactly like a single server over the network. This article introduces the concept and design of TOFF-2, and statistics taken from tests on a prototype implementation show promising results.

Shang-Rong Tsai | Charles Changli Chin

[1] Mendel Rosenblum, et al. The design and implementation of a log-structured file system, 1991, SOSP '91.

[2] Kenneth P. Birman, et al. Position Paper - Deceit: A Flexible Distributed File System, 1990, Workshop on the Management of Replicated Data.

[3] Michael Williams, et al. Replication in the Harp file system, 1991, SOSP '91.

[4] Shang-Rong Tsai, et al. Transparency in a replicated network file system, 1996, Proceedings of EUROMICRO 96. 22nd Euromicro Conference. Beyond 2000: Hardware and Software Design Strategies.

[5] Tzi-cker Chiueh. Trail: a track-based logging disk architecture for zero-overhead writes, 1993, Proceedings of 1993 IEEE International Conference on Computer Design ICCD'93.

[6] Stephen E. Deering, et al. Host extensions for IP multicasting, 1986, RFC.

[7] John H. Hartman, et al. The Zebra striped network file system, 1995, TOCS.

[8] Dan Walsh, et al. Design and implementation of the Sun network filesystem, 1985, USENIX Conference Proceedings.

[9] Anupam Bhide, et al. A Highly Available Network File Server, 1991, USENIX Winter.

[10] Kenneth P. Birman, et al. Deceit: a flexible distributed file system, 1990, [1990] Proceedings. Workshop on the Management of Replicated Data.

[11] Shang-Rong Tsai, et al. A fault tolerant RPC mechanism based on IP multicasting, 1997, J. Syst. Archit.

[12] Gagan Agrawal, et al. Coding-Based Replication Schemes for Distributed Systems, 1995, IEEE Trans. Parallel Distributed Syst.