An Efficient Forward Recovery Checkpointing Scheme in Dissimilar Redundancy Computer System

Roll-Forward Checkpointing Schemes (RFCS) [1, 2, 3, 4] were developed to avoid rollback in the presence of independent faults and to increase the probability that a task completes within a tight deadline. However, the fault-independence assumption underlying RFCS often does not hold: running the same software on the same hardware may result in correlated faults. A further problem is that RFCS may discard useful built-in self-detection information, which degrades performance. In this paper, we propose a Twice Dissimilar Redundancy Computer based Roll-Forward Recovery scheme (TDCS) that avoids correlated faults and achieves fault tolerance without any extra process. Finally, using a novel technique based on a Markov Reward Model, we show that TDCS outperforms RFCS in average completion time when the built-in self-detection coverage is high.

Keywords: roll-forward checkpointing schemes; TDCS; built-in self-detection; Markov Reward Model

[1]  Dhiraj K. Pradhan, et al. Roll-Forward Checkpointing Scheme: A Novel Fault-Tolerant Architecture, 1994, IEEE Trans. Computers.

[2]  Brian Randell, et al. System structure for software fault tolerance, 1975, IEEE Transactions on Software Engineering.

[3]  Dhiraj K. Pradhan, et al. Roll-forward and rollback recovery: performance-reliability trade-off, 1994, Proceedings of IEEE 24th International Symposium on Fault-Tolerant Computing.

[4]  Andreas Reuter, et al. Transaction Processing: Concepts and Techniques, 1992.

[5]  S K Trivedi, et al. The Analysis of Computer Systems Using Markov Reward Processes, 1987.

[6]  Jehoshua Bruck, et al. Analysis of checkpointing schemes for multiprocessor systems, 1994, Proceedings of IEEE 13th Symposium on Reliable Distributed Systems.

[7]  Jorge Bernardino, et al. A Fault-Tolerant Mechanism for Simple Controllers, 1994, EDCC.