Cluster-Based Checkpointing Approach for Mobile Distributed Systems

Mobile Hosts (MHs) are increasingly common in distributed systems because of their availability, cost, and mobile connectivity, but large amounts of checkpointed data are not stored in the local memory of an MH. The limited stable storage available in mobile environments can make traditional fault tolerance techniques unsuitable. Since storage on a mobile host is not considered stable, most protocols designed for these environments save checkpoints on the base stations. Previous approaches have assumed that the base station always has sufficient space for storing checkpoints. If not enough storage is available, checkpointing may need to be aborted. This paper describes an adaptive fault tolerance protocol that manages storage effectively. In a cluster-based architecture, the whole network is divided into several cells, and each cell has a Base Station (BS). A cluster contains more than one BS. BSs are the nodes responsible for routing messages within the cell and performing data aggregation. Communication between two adjacent cells is conducted through their Base Stations.
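The storage problem described above can be illustrated with a minimal sketch. It is not the protocol of this paper, whose details the abstract does not give. It assumes one plausible policy: when the local BS of an MH lacks space, the checkpoint is offered to the other BSs of the same cluster, and checkpointing is aborted only if none of them can hold it. The names `BaseStation`, `try_store`, and `place_checkpoint`, and the byte-based capacity model, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BaseStation:
    """A base station (BS) with a bounded amount of stable storage."""
    name: str
    capacity: int  # assumed unit: bytes of stable storage
    stored: Dict[str, int] = field(default_factory=dict)  # MH id -> checkpoint size

    def free_space(self) -> int:
        return self.capacity - sum(self.stored.values())

    def try_store(self, mh_id: str, size: int) -> bool:
        """Store the checkpoint of mh_id if it fits, replacing its older one here."""
        reclaimable = self.stored.get(mh_id, 0)  # space freed by the old checkpoint
        if size > self.free_space() + reclaimable:
            return False
        self.stored[mh_id] = size
        return True


def place_checkpoint(cluster: List[BaseStation], local: BaseStation,
                     mh_id: str, size: int) -> Optional[BaseStation]:
    """Try the local BS of the MH first, then the other BSs of the cluster.

    Returns the BS that accepted the checkpoint, or None when no BS has
    room, in which case the checkpointing attempt would be aborted.
    (Assumed fallback policy, not taken from the paper.)
    """
    candidates = [local] + [bs for bs in cluster if bs is not local]
    for bs in candidates:
        if bs.try_store(mh_id, size):
            return bs
    return None


if __name__ == "__main__":
    bs1, bs2 = BaseStation("BS1", 100), BaseStation("BS2", 500)
    cluster = [bs1, bs2]
    for mh, size in [("MH1", 80), ("MH2", 80), ("MH3", 1000)]:
        chosen = place_checkpoint(cluster, bs1, mh, size)
        print(mh, "->", chosen.name if chosen else "aborted")
```

In this sketch the second checkpoint overflows to the neighbouring BS instead of aborting, which is the kind of situation an adaptive storage policy is meant to handle. A fuller model would also need to remove a stale checkpoint left on a different BS after a fallback.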
