Near-optimal continuous patrolling with teams of mobile information gathering agents

Autonomous unmanned vehicles equipped with sensors are rapidly becoming the de facto means of achieving situational awareness: the ability to make sense of, and predict, what is happening in an environment. Particularly in environments that are subject to continuous change, using teams of such vehicles to maintain accurate and up-to-date situational awareness is a challenging problem. To perform well, the vehicles need to patrol their environment continuously and in a coordinated manner. To address this challenge, we develop a near-optimal multi-agent algorithm for continuously patrolling such environments. We first define a general class of multi-agent information gathering problems in which vehicles are represented by information gathering agents: autonomous entities that direct their activity towards collecting information with the aim of providing accurate and up-to-date situational awareness. These agents move on a graph while taking measurements, with the aim of maximising the cumulative discounted observation value over time. Here, observation value is an abstract measure of reward, which encodes the properties of the agents' sensors and the spatial and temporal properties of the measured phenomena. Concrete instantiations of this class of problems include monitoring environmental phenomena (temperature, pressure, etc.), disaster response, and patrolling environments to prevent intrusions from (non-strategic) attackers. In more detail, we derive a single-agent divide-and-conquer algorithm to compute a continuous patrol (an infinitely long path in the graph) that yields a near-optimal amount of observation value. This algorithm recursively decomposes the graph until high-quality paths in the resulting components can be computed outright by a greedy algorithm. It then constructs a patrol by concatenating these paths using dynamic programming.
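To make the objective concrete, the following minimal sketch evaluates the discounted observation value of a patrol that repeats a fixed cycle on a graph. It is an illustration only, not the algorithm described above. The reward model (a vertex is worth the number of steps since its last visit, capped at `cap`), the finite `horizon` truncation, and all names (`is_closed_walk`, `discounted_patrol_value`, `gamma`) are assumptions introduced here; the paper treats observation value as an abstract quantity.

```python
from typing import Dict, Hashable, Sequence, Set

Graph = Dict[Hashable, Set[Hashable]]


def is_closed_walk(graph: Graph, cycle: Sequence[Hashable]) -> bool:
    """Check that consecutive vertices (including last -> first) are adjacent."""
    n = len(cycle)
    return n > 0 and all(cycle[(i + 1) % n] in graph[cycle[i]] for i in range(n))


def discounted_patrol_value(
    graph: Graph,
    cycle: Sequence[Hashable],
    horizon: int,
    gamma: float = 0.9,
    cap: int = 5,
) -> float:
    """Discounted value of repeating `cycle`, truncated after `horizon` steps.

    Assumed reward model: observing a vertex yields min(staleness, cap),
    where staleness is the number of steps since the vertex was last
    visited (a never-visited vertex counts as fully stale).
    """
    if not is_closed_walk(graph, cycle):
        raise ValueError("cycle is not a closed walk in the graph")
    last_visit: Dict[Hashable, int] = {}
    total = 0.0
    for t in range(horizon):
        v = cycle[t % len(cycle)]
        staleness = t - last_visit[v] if v in last_visit else cap
        total += (gamma ** t) * min(staleness, cap)
        last_visit[v] = t
    return total


# Two adjacent vertices patrolled back and forth for four steps.
example_graph: Graph = {"a": {"b"}, "b": {"a"}}
example_value = discounted_patrol_value(
    example_graph, ["a", "b"], horizon=4, gamma=0.5, cap=3
)
```

Under these assumptions the discount factor makes early observations count more, so a truncated sum approximates the value of the infinite patrol increasingly well as `horizon` grows.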
For multiple agents, the algorithm computes a patrol for each agent in turn, greedily maximising that agent's marginal contribution to the team. Moreover, to achieve robustness, we develop algorithms for repairing patrols when one or more agents fail or the graph changes. For both the single- and the multi-agent case, we give theoretical guarantees on the performance of the algorithms: lower bounds on the solution quality, and an upper bound on the computational complexity in the size of the graph and the number of agents. We benchmark the single- and multi-agent algorithms against the state of the art and demonstrate that they typically perform 35% and 33% better in terms of average and minimum solution quality, respectively.
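The sequential greedy step for multiple agents can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation. Each agent picks from a fixed list of hypothetical candidate patrols (here simply sets of visited vertices), whereas the paper computes each patrol with its single-agent algorithm. The team value is assumed to be a weighted coverage function, and `coverage_value` and `sequential_greedy` are names introduced here.

```python
from typing import Callable, Dict, FrozenSet, Hashable, List, Sequence

Patrol = FrozenSet[Hashable]


def coverage_value(patrols: Sequence[Patrol], weights: Dict[Hashable, float]) -> float:
    """Assumed team value: total weight of vertices covered by at least one patrol."""
    covered = set().union(*patrols)
    return sum(weights[v] for v in covered)


def sequential_greedy(
    candidates: Sequence[Patrol],
    num_agents: int,
    team_value: Callable[[Sequence[Patrol]], float],
) -> List[Patrol]:
    """Assign patrols one agent at a time, each maximising its marginal gain."""
    chosen: List[Patrol] = []
    for _ in range(num_agents):
        base = team_value(chosen)
        # Marginal contribution of a candidate given the patrols already fixed.
        best = max(candidates, key=lambda p: team_value(chosen + [p]) - base)
        chosen.append(best)
    return chosen


# Hypothetical example: four weighted vertices, three candidate patrols.
weights = {"a": 3.0, "b": 2.0, "c": 2.0, "d": 1.0}
candidates = [frozenset("ab"), frozenset("bc"), frozenset("cd")]
team = sequential_greedy(candidates, 2, lambda ps: coverage_value(ps, weights))
```

In this example the first agent takes the patrol covering `a` and `b`; the second then prefers `c` and `d` over `b` and `c`, because `b` is already covered and adds nothing at the margin.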
