Implementing design diversity to achieve fault tolerance

The software faults that are particularly significant in a real-time concurrent system are identified, and the use of design diversity to prevent their occurrence is examined. Two approaches to enforced diversity, recovery-block software and multiversion software, are discussed. The recovery-block scheme combines N diverse software versions arranged (conceptually, at least) in sequential order, although the versions may also be organized to execute concurrently. The multiversion-software approach executes all N versions in parallel, taking advantage of the redundant processors likely to be available in any system that must tolerate hardware and software faults. Although the two approaches differ, both require that the development environments be sufficiently diverse and that faults in the specification not lead to similar errors.
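The contrast between the two schemes can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`recovery_block`, `multiversion_vote`), the acceptance test, and the three integer-square-root versions are hypothetical, chosen only for illustration. For simplicity the voting example runs its versions one after another rather than on redundant processors, and it assumes results are hashable so they can be counted.

```python
import math
from collections import Counter


def recovery_block(versions, acceptance_test, x):
    """Try each alternate in order; return the first result that passes the test."""
    for version in versions:
        try:
            result = version(x)
        except Exception:
            continue  # a crashing alternate is treated as a failed alternate
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def multiversion_vote(versions, x):
    """Run every version and return the result a strict majority agrees on."""
    results = []
    for version in versions:  # sequential here; conceptually these run in parallel
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashing version simply casts no vote
    if results:
        value, count = Counter(results).most_common(1)[0]
        if count > len(versions) // 2:
            return value
    raise RuntimeError("no majority agreement among versions")


# Three hypothetical, independently written integer square roots.
def isqrt_faulty(n):
    return n // 2  # deliberately wrong, stands in for a design fault


def isqrt_newton(n):
    if n < 2:
        return n
    r = n
    while r * r > n:
        r = (r + n // r) // 2
    return r


def isqrt_builtin(n):
    return math.isqrt(n)


def accept(n, r):
    """Acceptance test: r is the floor of the square root of n."""
    return r * r <= n < (r + 1) * (r + 1)


VERSIONS = [isqrt_faulty, isqrt_newton, isqrt_builtin]
rb_result = recovery_block(VERSIONS, accept, 17)
mv_result = multiversion_vote(VERSIONS, 17)
```

In this sketch the recovery block rejects the faulty primary through the acceptance test and falls back to the next alternate, while the voter masks the same fault because the two correct versions outvote it. Neither mechanism helps if the versions share a fault, for example one inherited from the specification, which is why both schemes depend on the versions failing differently.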
