Game-theoretic learning algorithm for a spatial coverage problem

In this paper we consider a class of dynamic vehicle routing problems in which a number of mobile agents in the plane must visit target points generated over time by a stochastic process. The goal is to design motion coordination strategies that minimize the expected time between the appearance of a target point and the time it is visited by one of the agents. We cast the problem as a spatial game in which each agent's objective is to maximize the expected value of the “time spent alone” at the next target location, and we show that the Nash equilibria of the game correspond to the desired agent configurations. We propose learning-based control strategies that, while making minimal or no assumptions about communication between agents or about the underlying target distribution, achieve the same steady-state performance as the best known decentralized strategies.
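As a rough illustration of the "time spent alone" objective, the sketch below estimates it by Monte Carlo sampling. It rests on assumptions that are not stated in the abstract: agents move in straight lines at a common constant speed, every agent heads for each new target, and the time alone is the gap between the first and second arrivals. The function names `time_alone_utilities` and `mean_wait`, the uniform target distribution on the unit square, and the two example configurations are all hypothetical choices made for this example.

```python
import math
import random


def time_alone_utilities(agents, targets, speed=1.0):
    """Estimate each agent's expected time alone at the next target.

    Assumption: the first agent to arrive is alone until the second-closest
    agent arrives, so only the closest agent earns a positive payoff.
    Requires at least two agents and Python 3.8+ (for math.dist).
    """
    n = len(agents)
    totals = [0.0] * n
    for q in targets:
        times = [math.dist(p, q) / speed for p in agents]
        order = sorted(range(n), key=times.__getitem__)
        first, second = order[0], order[1]
        totals[first] += times[second] - times[first]
    return [t / len(targets) for t in totals]


def mean_wait(agents, targets, speed=1.0):
    """Average time until the closest agent reaches a sampled target."""
    waits = [min(math.dist(p, q) for p in agents) / speed for q in targets]
    return sum(waits) / len(waits)


# Hypothetical test data: targets drawn uniformly from the unit square.
rng = random.Random(0)
targets = [(rng.random(), rng.random()) for _ in range(5000)]

spread = [(0.25, 0.5), (0.75, 0.5)]    # agents split the square
clustered = [(0.5, 0.5), (0.5, 0.5)]   # agents stacked at the center

print(time_alone_utilities(spread, targets), mean_wait(spread, targets))
print(time_alone_utilities(clustered, targets), mean_wait(clustered, targets))
```

Under these assumptions, stacked agents never spend time alone at a target, while spread-out agents both earn a positive payoff and reduce the average wait. This toy comparison is consistent with the abstract's claim that equilibria of the game correspond to desirable configurations, but it does not demonstrate that claim.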
