Distributed Watchpoints: Debugging Large Multi-Robot Systems

Tightly coupled multi-agent systems such as modular robots frequently exhibit properties of interest that span multiple modules. These properties cannot easily be detected from any single module, though they can readily be detected by combining the knowledge of multiple modules. Testing for such distributed conditions is especially important when debugging or verifying the correctness of software for modular robots. We have developed a technique we call distributed watchpoint triggers, which can efficiently recognize these conditions. Our watchpoint description language can express a variety of temporal, spatial, and logical properties spanning multiple robots. This paper presents that language, describes our fully distributed, online mechanism for detecting distributed conditions in a running system, and evaluates the performance of our implementation. We found that the performance overhead of the system is highly dependent on the program being debugged, scales linearly with ensemble size, and is small enough to make the system practical in all but the worst-case scenarios.
