Effective Approximations for Multi-Robot Coordination in Spatially Distributed Tasks

Although multi-robot systems have received substantial research attention in recent years, multi-robot coordination remains a difficult problem. In particular, when dealing with spatially distributed tasks and many robots, centralized control quickly becomes infeasible due to the exponential explosion in the number of joint actions and states. We propose a general algorithm that allows for distributed control and overcomes the exponential growth in the number of joint actions by aggregating the effect of the other agents in the system into a probabilistic model, called a subjective approximation, and then choosing a best response. For a multi-robot grid world, we show how the algorithm can be implemented in the well-studied Multiagent Markov Decision Process framework, within a sub-class called spatial task allocation problems (SPATAPs). In this framework, we show how to tackle SPATAPs using online, distributed planning by combining subjective agent approximations with a restriction of attention to the current tasks in the world. An empirical evaluation shows that the combination of both strategies scales to very large problems while providing near-optimal solutions.
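The core idea can be sketched in code: each agent plans a best response in its own single-agent view, with the other agents compressed into a probability that a given task will be served by someone else first. The following is a minimal illustrative sketch, not the paper's actual algorithm; the distance-based claim model, the 50% commitment probability, and all function names are assumptions made for the example.

```python
def manhattan(a, b):
    """Grid distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def task_claim_probability(task, others, self_pos):
    """Subjective model of the other agents: probability that some other
    agent reaches `task` no later than we do. Each agent at least as close
    as us is assumed (hypothetically) to commit with probability 0.5,
    independently of the rest."""
    my_dist = manhattan(self_pos, task)
    p_unclaimed = 1.0
    for other in others:
        if manhattan(other, task) <= my_dist:
            p_unclaimed *= 0.5  # assumed per-agent commitment probability
    return 1.0 - p_unclaimed

def best_response_task(self_pos, others, tasks, reward=10.0, step_cost=1.0):
    """Best response under the subjective approximation: pick the task with
    the highest expected payoff, discounting tasks likely claimed by others
    and charging a travel cost per grid step."""
    best_task, best_value = None, float("-inf")
    for task in tasks:
        p_taken = task_claim_probability(task, others, self_pos)
        value = (1.0 - p_taken) * reward - step_cost * manhattan(self_pos, task)
        if value > best_value:
            best_task, best_value = task, value
    return best_task
```

For example, an agent at (0, 0) with another agent far away at (5, 5) and tasks at (1, 0) and (5, 4) would pick the nearby task (1, 0): the distant task is both costly to reach and likely to be claimed by the other agent. Because each agent only evaluates its own actions against this aggregate model, the per-agent computation stays independent of the size of the joint action space.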
