Checkpoint Time Arrangement Rotation in Hybrid State Saving with a Limited Number of Periodical Checkpoints

This paper discusses hybrid state saving for applications in which processes must create checkpoints at constant intervals and can hold only a finite number of checkpoints. We propose a checkpoint space reclamation technique that provides effective checkpoint time arrangements for a given rollback distance distribution. Numerical examples show that, when system requirements prevent the use of the optimal checkpoint interval, the proposed technique can achieve lower expected overhead than the conventional technique, which does not consider the form of the rollback distance distribution.

key words: distributed checkpointing, hybrid state saving, checkpoint space reclamation, time arrangement rotation
