This paper discusses techniques for constructing self-checking software for systems where ultra-reliability is required. Self-checking software can be designed to detect software errors, to locate them and stop their propagation, to assist in recovery from errors, and to verify the integrity of the system. Self-checking techniques can be implemented in a program to check the function, the control sequence, and the data of a process. The control sequence of a process can be monitored to detect infinite loops, incorrect loop terminations, illegal branches, and wrong branches. The validity of a process's data can be assured by checking the integrity of data values, the integrity of data structures, and the nature of data values. These self-checking capabilities should be built in during the initial stage of program development. The cost-effectiveness of each technique should be evaluated in the particular operating environment, and only the most cost-effective techniques should be retained. Overhead can be reduced considerably by implementing these techniques in hardware.
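As an illustration of the kinds of checks described above, the following is a minimal Python sketch and not the paper's own implementation. It assumes three simple checks: an iteration bound to catch infinite loops (control sequence), a redundant element count to catch a corrupted data structure, and a range check on the nature of data values. All names (`SelfCheckError`, `bounded_loop`, `CheckedStack`, `detects`) and the 0 to 255 value range are hypothetical and chosen only for this example.

```python
class SelfCheckError(Exception):
    """Raised when one of the illustrative self-checks detects an error."""


def bounded_loop(iterable, max_iterations):
    """Control-sequence check: yield items, but raise if the loop runs
    longer than max_iterations (a possible infinite loop)."""
    for count, item in enumerate(iterable, start=1):
        if count > max_iterations:
            raise SelfCheckError("loop exceeded its iteration bound")
        yield item


class CheckedStack:
    """Stack with a redundant element count (structure check) and a
    range check on pushed values (nature-of-data check)."""

    def __init__(self, capacity, low=0, high=255):
        self._items = []
        self._count = 0          # redundant copy of len(self._items)
        self._capacity = capacity
        self._low, self._high = low, high   # assumed legal value range

    def check_structure(self):
        """Data-structure integrity: the redundant count must agree
        with the actual contents and stay within capacity."""
        if self._count != len(self._items) or self._count > self._capacity:
            raise SelfCheckError("stack structure is inconsistent")

    def push(self, value):
        self.check_structure()
        if not isinstance(value, int) or not self._low <= value <= self._high:
            raise SelfCheckError("value outside the legal range")
        if self._count == self._capacity:
            raise SelfCheckError("stack overflow")
        self._items.append(value)
        self._count += 1

    def pop(self):
        self.check_structure()
        if self._count == 0:
            raise SelfCheckError("stack underflow")
        self._count -= 1
        return self._items.pop()


def detects(action):
    """Return True if calling action() trips a self-check."""
    try:
        action()
    except SelfCheckError:
        return True
    return False
```

In this sketch each check raises at the point of detection, which stops the error from propagating further. A real system would weigh the run-time cost of each check against the errors it catches in its own operating environment, as the abstract recommends.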