Lazy Agent Replication and Asynchronous Consensus for the Fault-Tolerant Mobile Agent System

In this paper, we propose a low-overhead replication scheme for a fault-tolerant mobile agent system. In the proposed lazy replication scheme, the execution of a primary agent and the migration of its replicas proceed concurrently. The primary agent also performs asynchronous consensus with fixed consensus agents, so that the consensus step and the replica migration step can proceed concurrently as well. As a result, the primary agent does not need to wait for the replica migration step to complete unless one of the consensus agents fails. The proposed scheme has been implemented on top of the Aglet system, and its performance has been measured.
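
The control flow described above can be illustrated with a small sketch. This is not the paper's implementation and does not use the Aglet API; it is a minimal Python model under stated assumptions. The function name `run_stage`, its callback parameters, the one-second acknowledgement timeout, and the returned dictionary are all hypothetical and chosen only for illustration. Replica migration runs in the background while the primary executes, and the primary blocks on migration only if a consensus agent fails to acknowledge.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(execute, migrate_replicas, consensus_agents, ack_timeout=1.0):
    """Run one stage of a hypothetical lazy-replication protocol.

    execute          -- callable performing the primary agent's work
    migrate_replicas -- callable that moves the replicas to their hosts
    consensus_agents -- callables that acknowledge the primary's result
    All names here are illustrative assumptions, not the paper's API.
    """
    pool = ThreadPoolExecutor(max_workers=1 + len(consensus_agents))

    # Replica migration starts in the background ...
    migration = pool.submit(migrate_replicas)
    # ... while the primary agent executes without waiting for it.
    result = execute()

    # Consensus also overlaps with the still-running migration.
    votes = [pool.submit(agent, result) for agent in consensus_agents]
    acks = []
    for vote in votes:
        try:
            acks.append(bool(vote.result(timeout=ack_timeout)))
        except Exception:
            # A crashed or unresponsive consensus agent counts as a failure.
            acks.append(False)

    if all(acks):
        # Common case: proceed without waiting for replica migration.
        waited = False
    else:
        # A consensus agent failed: fall back to waiting for the replicas.
        migration.result()
        waited = True

    # Do not block here; let any unfinished migration complete on its own.
    pool.shutdown(wait=False)
    return {"result": result, "waited_for_replicas": waited,
            "migration": migration}
```

In this model, a caller would inspect `waited_for_replicas` to see which path was taken and could later call `migration.result()` to join the background migration. How the actual scheme detects a failed consensus agent and recovers from it is not specified in the abstract, so the timeout-based check here is only a stand-in.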
