Fault-tolerant group membership protocols using physical robot messengers

In this paper, we consider a distributed system consisting of a group of teams of worker robots that rely on physical robot messengers for communication between the teams. Unlike in traditional distributed systems, the number of messengers in the system is finite, so a team can send messages to other teams only when a messenger robot is available locally. Careful management of the messengers is therefore necessary to avoid starving some teams. Concretely, the paper proposes algorithms that provide group membership and view synchrony among robot teams. We study the problem in the presence of failures, in particular when a certain number of messenger robots may crash.
