A Predictive Method for Providing Fault Tolerance in Multi-agent Systems

The growing importance of multi-agent applications, and the need for a higher quality of service in these systems, justify the increasing interest in fault-tolerant multi-agent systems. In this article, we propose an original method for providing dependability in multi-agent systems through replication. Our method differs from other work in that it builds an automatic, adaptive, and predictive replication policy in which critical agents are replicated to avoid failures. This policy is determined by taking into account the criticality of the agents' plans, which describe the collective and individual behaviors of the agents in the application. The set of replication strategies applied to an agent at a given moment is then gradually fine-tuned by the replication system to reflect the dynamic nature of the multi-agent system. We report on experiments assessing the efficiency of our approach.
