Fault-tolerance experiments with the JPL STAR computer.

Results are presented for fault-tolerance experiments performed on an experimental computer with dynamic (standby) redundancy, including replaceable subsystems and a 'program rollback' provision for eliminating errors caused by transient faults. A brief review of how fault tolerance is specified with respect to transient faults, including a description of the method used to inject transient faults in software and system tests, is followed by a summary of the experiments carried out with this computer in four areas: determination of fault classes, software verification, system verification, and recovery stability. A test and repair processor is also described. It is a special monitor unit of the computer that gathers information for detecting faults in the other subsystems and ensures that proper recovery occurs when a fault is detected.
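
As an illustration only, the general idea of program rollback under injected transient faults can be sketched in Python. This is a minimal sketch and not the STAR computer's actual mechanism: the names run_with_rollback, DetectedFault, inject_at, and max_retries are hypothetical, and the assumption that a fault persisting past the retry limit is treated as permanent is made here for the example only.

```python
import copy


class DetectedFault(Exception):
    """Hypothetical signal that a monitor detected an error in a segment."""


def run_with_rollback(segments, state, inject_at=frozenset(), max_retries=3):
    """Run program segments in order, retrying a segment from its
    rollback point whenever a fault is detected.

    inject_at is a set of (segment_index, attempt) pairs at which a
    transient fault is injected (assumption: test-harness style injection).
    Returns the final state and the number of rollbacks performed.
    """
    rollbacks = 0
    for index, segment in enumerate(segments):
        # Rollback point: a copy of the state known to be good.
        checkpoint = copy.deepcopy(state)
        for attempt in range(max_retries + 1):
            # Work on a scratch copy so a faulty attempt leaves no trace.
            working = copy.deepcopy(checkpoint)
            try:
                segment(working)
                if (index, attempt) in inject_at:
                    raise DetectedFault(f"segment {index}, attempt {attempt}")
            except DetectedFault:
                rollbacks += 1  # discard the scratch copy and retry
                continue
            state = working  # commit the successful attempt
            break
        else:
            # Assumption: a fault that survives every retry is not transient.
            raise RuntimeError(f"persistent fault in segment {index}")
    return state, rollbacks


def add_five(s):
    s["acc"] += 5


def double(s):
    s["acc"] *= 2


if __name__ == "__main__":
    faults = {(0, 0), (1, 0), (1, 1)}
    final, count = run_with_rollback([add_five, double], {"acc": 1}, faults)
    print(final, count)  # {'acc': 12} 3
```

In this sketch, a transient fault costs only a repeated segment, while a fault that recurs on every retry is escalated. In a standby-redundant design, that escalation would presumably correspond to switching in a replacement subsystem.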