The fault-tolerant architecture of the safe system

Abstract This paper presents a fault-tolerant architecture for industrial control applications. The architecture aims to achieve fault tolerance with low redundancy rates by means of new error detection mechanisms, efficient reconfiguration schemes, and a cost-effective use of dynamic redundancy techniques. The paper also stresses some novel aspects of this architecture, such as a new method for fast stable-storage implementation and a new approach to concurrent system-level error detection based on signature monitoring. This new error detection scheme has the potential to detect a wider range of errors than traditional signature monitoring approaches. Furthermore, unlike existing techniques, it does not require special assemblers and loaders.
