Microprocessor-based systems for controlling gas supplies require very high levels of reliability for safety reasons. Non-redundant systems are considered inadequate, so an alternative approach is necessary. In digital systems, transient faults are as much as fifty times more common than permanent faults. Mechanisms that allow recovery from transients will therefore provide large improvements in reliability. However, to design recovery mechanisms effectively it is necessary to understand the failure modes. The results of practical interference tests, designed to simulate transient faults, are presented. They show that corruption of the correct flow of program execution is a common failure, and that subsequent instruction fetches can be performed from any memory location. Under these conditions any operation code value can be interpreted as an instruction, including those undeclared by the manufacturers. Four commonly used microprocessors are investigated to establish the functions of the undeclared codes, and other undeclared operations are revealed. Analyses are presented of the sequence of events following a random jump into each of the four main types of memory area: data, program, unused and input. Recovery from this type of execution can be achieved by adding restart codes to these areas, so that execution transfers to a recovery routine. The effect of this mechanism on the recovery process is investigated. Finally, some methods of testing systems, to check the level of reliability improvement obtained by these techniques, are considered.
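The restart-code mechanism described above can be illustrated with a minimal sketch. It models a toy machine, not any of the four microprocessors investigated: the opcode values `RESTART` and `NOP`, the memory layout, and the function names are all hypothetical, and every instruction is assumed to be one byte long. On a real processor, multi-byte instructions mean that a restart code may be consumed as an operand, so this shows only the principle.

```python
RESTART = 0xFF  # hypothetical one-byte restart opcode (assumption, toy machine)
NOP = 0x00      # hypothetical one-byte no-operation opcode (assumption)


def make_memory(size=256, program_end=64, fill_unused=True):
    """Build a toy memory: a program area of NOPs followed by an unused area.

    When fill_unused is True, the unused area is filled with restart codes.
    """
    memory = [NOP] * size
    if fill_unused:
        for addr in range(program_end, size):
            memory[addr] = RESTART
    return memory


def steps_to_recovery(memory, pc, max_steps=1000):
    """Run from address pc after an erroneous jump.

    Returns the number of instructions executed before a restart code is
    fetched (i.e. before control would pass to the recovery routine), or
    None if no restart code is reached within max_steps.
    """
    for step in range(max_steps):
        opcode = memory[pc % len(memory)]  # the address space wraps around
        if opcode == RESTART:
            return step
        pc += 1  # every other opcode is treated as a one-byte no-op
    return None


protected = make_memory()
unprotected = make_memory(fill_unused=False)

print(steps_to_recovery(protected, 100))    # jump into the filled unused area
print(steps_to_recovery(protected, 10))     # jump into the program area
print(steps_to_recovery(unprotected, 100))  # no restart codes present
```

In this toy model a jump into the filled area is caught on the first fetch, a jump into the program area runs on until it reaches the filled area, and without restart codes execution never reaches the recovery routine within the step limit.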