An efficient recovery scheme for mobile computing environments

This paper presents an efficient recovery scheme for mobile computing systems based on checkpointing and message logging. To manage checkpoints and message logs efficiently, a movement-based scheme is proposed. If a mobile host carries its recovery information to the nearby mobile support station as it moves, it can recover immediately after a failure, but the cost of transferring the recovery information is high. Conversely, if the recovery information remains dispersed over the support stations the host has visited, the recovery cost is very high. To balance the failure-free operation cost against the recovery cost, the proposed scheme leaves the recovery information of a mobile host at the visited support stations while the host moves within a certain range. Only when the host moves out of that range is the recovery information transferred to a nearby mobile support station. As a result, the proposed scheme can control both the information transfer cost and the recovery cost.
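The trade-off described above can be illustrated with a minimal sketch. It is not the paper's protocol: the abstract does not say how the range is measured, so the sketch assumes the range is a count of handoffs since the last transfer. The names `MobileHost`, `max_handoffs`, `move_to`, and `recovery_fetches` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MobileHost:
    """Tracks which support stations hold a host's recovery information."""
    max_handoffs: int   # assumed range: handoffs allowed before a transfer
    current_mss: str    # support station currently serving the host
    handoffs: int = 0   # handoffs since the last transfer
    transfers: int = 0  # number of transfers performed so far
    holders: list = field(default_factory=list)  # stations holding recovery info

    def __post_init__(self):
        # The initial station holds the first checkpoint and logs.
        if not self.holders:
            self.holders.append(self.current_mss)

    def move_to(self, new_mss: str) -> None:
        """Hand off to new_mss; collect recovery info there if out of range."""
        self.current_mss = new_mss
        self.handoffs += 1
        if self.handoffs > self.max_handoffs:
            # Out of range: move all recovery information to the new station.
            self.holders = [new_mss]
            self.handoffs = 0
            self.transfers += 1
        elif new_mss not in self.holders:
            # Within range: information stays where it is; the new station
            # only starts holding what is logged during this visit.
            self.holders.append(new_mss)

    def recovery_fetches(self) -> int:
        """Remote stations that would have to be contacted on recovery."""
        return sum(1 for mss in self.holders if mss != self.current_mss)


host = MobileHost(max_handoffs=2, current_mss="A")
host.move_to("B")
host.move_to("C")
fetches_in_range = host.recovery_fetches()   # info still spread over A, B, C
host.move_to("D")                            # third handoff exceeds the range
fetches_after_transfer = host.recovery_fetches()
```

Under this assumption, a larger `max_handoffs` means fewer transfers during failure-free operation but more stations to contact on recovery, and a smaller value means the opposite.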
