Scheduling policies for fault tolerance in a VLSI processor

This paper presents analytical and simulation models for evaluating the operation of a VLSI processor (in a uniprocessor configuration) which utilizes a time-redundant approach (such as recomputation by shifted operands) for fault-tolerant computing. In the proposed approach, all incoming jobs to the uniprocessor are duplicated, thus two versions of each job must be processed. A discrepancy in the results produced by comparing the outcomes of the two versions of the same job indicates that a fault may have occurred. Several methods for appropriately scheduling the primary and secondary versions of the jobs are proposed and analyzed.

[1]  Janak H. Patel,et al.  Concurrent Error Detection in ALU's by Recomputing with Shifted Operands , 1982, IEEE Transactions on Computers.

[2]  Krishan K. Sabnani,et al.  Spare Capacity as a Means of Fault Detection and Diagnosis in Multiprocessor Systems , 1989, IEEE Trans. Computers.

[3]  E. E. Swartzlander,et al.  Time redundant error correcting adders and multipliers , 1992, Proceedings 1992 IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems.

[4]  Dhiraj K. Pradhan,et al.  Fault-tolerant computing : theory and techniques , 1986 .