Lessons from FTM: An Experiment in Design and Implementation of a Low-Cost Fault-Tolerant System
Michel Banâtre | Gilles Muller | Nadine Peyrouze | Bruno Rochat
[1] B. Randell,et al. State restoration in distributed systems , 1995, Twenty-Fifth International Symposium on Fault-Tolerant Computing, 1995, 'Highlights from Twenty-Five Years'.
[2] Takashi Masuda,et al. Designing an Extensible Distributed Language with a Meta-Level Architecture , 1993, ECOOP.
[3] Richard Koo,et al. Checkpointing and Rollback-Recovery for Distributed Systems , 1986, IEEE Transactions on Software Engineering.
[4] S. Venkatesan,et al. Crash recovery with little overhead , 1991, [1991] Proceedings. 11th International Conference on Distributed Computing Systems.
[5] Michel Raynal,et al. Synchronization and control of distributed systems and programs , 1990, Wiley series in parallel computing.
[6] Robert E. Strom,et al. Optimistic recovery in distributed systems , 1985, TOCS.
[7] Luke Lin,et al. Using checkpoints to localize the effects of faults in distributed systems , 1989, Proceedings of the Eighth Symposium on Reliable Distributed Systems.
[8] Gilles Muller,et al. A stable transactional memory for building robust object oriented programs , 1991 .
[9] Bharat K. Bhargava,et al. Independent checkpointing and concurrent rollback for recovery in distributed systems-an optimistic approach , 1988, Proceedings [1988] Seventh Symposium on Reliable Distributed Systems.
[10] Wolfgang Graetsch,et al. Fault tolerance under UNIX , 1989, TOCS.
[11] Michel Banâtre,et al. Design decisions for the FTM: a general purpose fault tolerant machine , 1991, [1991] Digest of Papers. Fault-Tolerant Computing: The Twenty-First International Symposium.
[12] Guy Lapalme,et al. The design and building of Enchère, a distributed electronic marketing system , 1986, CACM.
[13] Flaviu Cristian,et al. A timestamp-based checkpointing protocol for long-lived distributed computations , 1991, [1991] Proceedings Tenth Symposium on Reliable Distributed Systems.
[14] Michel Banâtre,et al. How to Design Reliable Servers using Fault Tolerant Micro-Kernel Mechanisms , 1991, USENIX MACH Symposium.
[15] Luís Moura Silva,et al. Global checkpointing for distributed programs , 1992, [1992] Proceedings 11th Symposium on Reliable Distributed Systems.
[16] Mark Cameron Little,et al. Object replication in a distributed system , 1991 .
[17] Luke Lin,et al. Checkpointing and rollback-recovery in distributed object based systems , 1990, [1990] Digest of Papers. Fault-Tolerant Computing: 20th International Symposium.
[18] Bharat K. Bhargava,et al. A model for concurrent checkpointing and recovery using transactions , 1989, [1989] Proceedings. The 9th International Conference on Distributed Computing Systems.
[19] Barry J. Gleeson,et al. Fault Tolerance: Why Should I Pay for It? , 1994, Hardware and Software Architectures for Fault Tolerance.
[20] Lily B. Mummert,et al. Camelot and Avalon: A Distributed Transaction Facility , 1991 .
[21] Santosh K. Shrivastava,et al. Exploiting Type Inheritance Facilities to Implement Recoverability in Object Based Systems , 1987, SRDS.
[22] Mark Russinovich,et al. Application transparent fault management in fault tolerant Mach , 1993, FTCS-23 The Twenty-Third International Symposium on Fault-Tolerant Computing.
[23] Michel Banâtre,et al. Ensuring data security and integrity with a fast stable storage , 1988, Proceedings. Fourth International Conference on Data Engineering.
[24] Anoop Gupta,et al. SPLASH: Stanford parallel applications for shared-memory , 1992, CARN.
[25] Thomas Anderson,et al. Fault Tolerant Systems , 1990 .
[26] Reinhold Kröger,et al. Recovery-management in the RelaX distributed transaction layer , 1989, Proceedings of the Eighth Symposium on Reliable Distributed Systems.
[27] Frank B. Schmuck,et al. Experience with transactions in QuickSilver , 1991, SOSP '91.
[28] Robbert van Renesse,et al. Amoeba: A Distributed Operating System for the 1990s , 1990 .
[29] Roger L. Haskin,et al. Recovery management in QuickSilver , 1988, TOCS.
[30] Bruno Rochat. Une approche a la construction de services fiables dans les systemes distribues , 1992 .
[31] Butler W. Lampson,et al. Atomic Transactions , 1980, Advanced Course: Distributed Systems.
[32] Leslie Lamport,et al. Distributed snapshots: determining global states of distributed systems , 1985, TOCS.
[33] Jeffrey F. Naughton,et al. Checkpointing multicomputer applications , 1991, [1991] Proceedings Tenth Symposium on Reliable Distributed Systems.
[34] Brian Randell,et al. Designing Secure and Reliable Applications using Fragmentation-Redundancy-Scattering: An Object-Oriented Approach , 1994, EDCC.
[35] Willy Zwaenepoel,et al. Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit , 1992, IEEE Trans. Computers.
[36] D. Jewett,et al. Integrity S2: A Fault-Tolerant Unix Platform , 1991, Twenty-Fifth International Symposium on Fault-Tolerant Computing, 1995, 'Highlights from Twenty-Five Years'.
[37] Pankaj Jalote,et al. Fault tolerance in distributed systems , 1994 .
[38] Bruce Jay Nelson. Remote procedure call , 1981 .
[39] Barbara Liskov,et al. Implementation of Argus , 1987, SOSP '87.
[40] Yuval Tamir,et al. Error recovery in multicomputers using global checkpoints , 1984 .
[41] Gilles Muller,et al. Performance of Consistent Checkpointing in a Modular Operating System: Results of the FTM Experiment , 1994, EDCC.
[42] Arthur P. Goldberg. Transparent Recovery of Mach Applications , 1990, USENIX MACH Symposium.
[43] William J. Bolosky,et al. Mach: A New Kernel Foundation for UNIX Development , 1986, USENIX Summer.
[44] Michel Banâtre,et al. An experience in the design of a reliable object based system , 1993, [1993] Proceedings of the Second International Conference on Parallel and Distributed Information Systems.
[45] Bharat K. Bhargava,et al. Concurrent robust checkpointing and recovery in distributed systems , 1988, Proceedings. Fourth International Conference on Data Engineering.
[46] Willy Zwaenepoel,et al. The performance of consistent checkpointing , 1992, [1992] Proceedings 11th Symposium on Reliable Distributed Systems.
[47] James P. Black,et al. Redundancy in Data Structures: Improving Software Fault Tolerance , 1980, IEEE Transactions on Software Engineering.