Continuous signature monitoring: low-cost concurrent detection of processor control errors

A low-cost approach to concurrent detection of processor control errors is presented that uses a simple hardware monitor and signatures embedded in the executing program. Existing signature-monitoring techniques detect a large portion of processor control errors at a fraction of the cost of duplication. Analytical methods developed in this study show that the new approach, continuous signature monitoring (CSM), makes major advances beyond existing techniques. CSM reduces the fraction of undetected control-flow errors by orders of magnitude, to less than 10^-6, while the number of signatures reaches a theoretical minimum, falling by as much as a factor of three to a range of 4-11%. Signature cost is reduced by placing CSM signatures at locations that minimize performance loss and (for some architectures) memory overhead. CSM exploits the program memory's SEC/DED code to decrease error-detection latency by as much as a factor of 1000, to 0.016 program memory cycles, without increasing memory overhead. This short latency allows transient faults to be tolerated.
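The general idea of signature monitoring, on which the abstract builds, can be illustrated with a minimal software sketch. It is not the CSM scheme itself: the 16-bit word size, the CRC-style polynomial, and the names `step`, `block_signature`, and `monitor_ok` are all assumptions made for illustration. A reference signature is computed over a straight-line block of instruction words ahead of time, and a monitor recomputes it over the words actually fetched and compares the two.

```python
from functools import reduce

# Assumed parameters, chosen only for illustration.
WIDTH = 16
MASK = (1 << WIDTH) - 1
POLY = 0x1021  # a common CRC-16 polynomial; any suitable polynomial would do


def step(sig: int, word: int) -> int:
    """Fold one instruction word into the running signature (LFSR-style)."""
    sig ^= word & MASK
    for _ in range(WIDTH):
        msb = sig & (1 << (WIDTH - 1))
        sig = (sig << 1) & MASK
        if msb:
            sig ^= POLY
    return sig


def block_signature(words) -> int:
    """Signature of a straight-line block of instruction words."""
    return reduce(step, words, 0)


def monitor_ok(fetched_words, embedded_signature: int) -> bool:
    """True if the run-time signature matches the embedded reference."""
    return block_signature(fetched_words) == embedded_signature


# Hypothetical instruction words for one block.
program_block = [0x1A2B, 0x3C4D, 0x5E6F, 0x0001]
# In a real system this value would be embedded in the program at compile time.
reference = block_signature(program_block)

# Simulate a single-bit error in one fetched instruction word.
corrupted = list(program_block)
corrupted[2] ^= 0x0040
```

Here `monitor_ok(program_block, reference)` holds, while `monitor_ok(corrupted, reference)` does not. In this sketch a mismatch is only observed at the end of the block, which is the kind of detection latency that the abstract says CSM shortens.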
