Hardware and software fault tolerance using fail-silent virtual duplex systems

Safety-critical systems must detect and tolerate both hardware and software faults. We propose a new scheme for distributed control systems, the multiple virtual duplex system, which addresses both objectives efficiently. It combines design diversity and systematic diversity with time redundancy and a minimal number of nodes. Its building block is the virtual duplex system, which executes diverse variants of the software sequentially on a single node. For large control systems we offer two protocol types: a simple protocol that keeps the communication overhead low, and a protocol with slightly higher overhead that enables pipelining and thereby drastically reduces the time required.
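The building block described above can be illustrated with a minimal sketch. It assumes that a virtual duplex node runs two diverse variants one after the other, compares their results, and emits no output on disagreement (the fail-silent behaviour named in the title). The function names, the sum-of-squares workload, and the use of `None` to model silence are hypothetical choices made for illustration only; they are not taken from the paper.

```python
from typing import Callable, Optional, Sequence

Variant = Callable[[Sequence[int]], int]


def variant_a(xs: Sequence[int]) -> int:
    """First variant: accumulate the squares in a plain loop."""
    total = 0
    for x in xs:
        total += x * x
    return total


def variant_b(xs: Sequence[int]) -> int:
    """Diverse variant: same result via shifted operands.

    Uses sum(x^2) = sum((x + 1)^2) - 2 * sum(x) - n, so the two variants
    exercise different operand values and operations.
    """
    n = len(xs)
    return sum((x + 1) ** 2 for x in xs) - 2 * sum(xs) - n


def faulty_variant(xs: Sequence[int]) -> int:
    """Hypothetical faulty variant, used only to demonstrate detection."""
    return variant_a(xs) + 1


def virtual_duplex(first: Variant, second: Variant,
                   inputs: Sequence[int]) -> Optional[int]:
    """Run both variants sequentially on one node (time redundancy).

    Returns the agreed result, or None to model a fail-silent node
    that suppresses its output when the results differ.
    """
    r1 = first(inputs)
    r2 = second(inputs)
    return r1 if r1 == r2 else None


if __name__ == "__main__":
    data = [1, 2, 3]
    print(virtual_duplex(variant_a, variant_b, data))       # agreement
    print(virtual_duplex(variant_a, faulty_variant, data))  # mismatch, silent
```

In this sketch a mismatch only silences the node; tolerating the fault would be left to the other nodes of the distributed system, which is outside what the example models.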
