The Delta-4 extra performance architecture (XPA)

The design of an extra performance architecture for Delta-4, which explicitly supports the requirements of real-time systems with respect to throughput and response, is presented. The Delta-4 approach to fault tolerance is based on the replication of software components on distinct host computers using a range of different replication strategies. The problems of replicate divergence are discussed, and a solution based on message selection and preemption synchronization messages is proposed. A description of the ongoing implementation of such a system within the overall Delta-4 framework is included.
