Design of redundant systems protected against common-mode failures

Redundancy techniques like duplication and Triple Modular Redundancy (TMR) are widely used for designing dependable systems to ensure high reliability and data integrity. In this paper, for the first time, we develop fault models for common-mode failures (CMFs) in redundant systems and describe techniques to design redundant systems protected against the modeled CMFs. We first develop an input-register-CMF model that targets systems with register-files. This paper shows that, in the presence of input-register-CMFs, we can always design duplex or TMR systems that either produce correct outputs or indicate error situations when incorrect outputs are produced. This property ensures data integrity. Next, we extend the input-register-CMF model to consider systems where the storage elements of the registers are not organized in register-files; instead, the register flip-flops are placed using conventional CAD programs. For this case, we present a technique to synthesize redundant systems with guaranteed data integrity against the extended input-register-CMFs.
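As an illustration of the data-integrity property stated above, the following minimal sketch models a duplex comparator and a word-level TMR vote. It is not the construction developed in the paper. The module functions `golden` and `faulty`, the 8-bit word width, and the single bit-flip fault are hypothetical and chosen only for the example. The sketch shows why common-mode failures matter: a fault that corrupts only one copy is flagged, while the same fault in both copies of a plain duplex system passes the comparison unnoticed.

```python
from typing import Callable, Optional, Tuple

Module = Callable[[int], int]
Result = Tuple[Optional[int], bool]  # (output word or None, error flag)


def duplex(module_a: Module, module_b: Module, x: int) -> Result:
    """Run two copies on the same input and compare their outputs."""
    a, b = module_a(x), module_b(x)
    if a == b:
        return a, False   # outputs agree: accept (may still be wrong under a CMF)
    return None, True     # mismatch: indicate an error instead of emitting data


def tmr_word_vote(a: int, b: int, c: int) -> Result:
    """Word-level vote: emit a word that at least two modules agree on."""
    if a == b or a == c:
        return a, False
    if b == c:
        return b, False
    return None, True     # no two words agree: indicate an error


def golden(x: int) -> int:
    """Hypothetical fault-free module: 8-bit increment."""
    return (x + 1) & 0xFF


def faulty(x: int) -> int:
    """Hypothetical faulty module: the same function with bit 2 flipped."""
    return golden(x) ^ 0x04


# Independent failure: only one copy is faulty, so the comparator flags it.
independent = duplex(golden, faulty, 3)

# Common-mode failure: both copies fail identically, so the comparator
# accepts an incorrect word and data integrity is lost.
common_mode = duplex(faulty, faulty, 3)
```

Here `independent` carries the error flag, whereas `common_mode` returns an incorrect word with no error indication. Identical plain duplication offers no protection against this second case, which is the situation the CMF fault models are meant to capture.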
