Multiagent allocation of Markov decision process tasks

Producing task assignments for multiagent teams often leads to exponential growth in the decision space as the number of agents and objectives increases. One approach is to model the agents and the environment as a single Markov decision process (MDP) and solve the planning problem with standard MDP techniques. However, both exact and approximate MDP solvers struggle to produce assignments in this setting, even for problems involving few agents and objectives. Conversely, formulations based on mathematical programming typically scale well with problem size, at the expense of requiring comparatively simple agent and task models. This paper combines the two formulations by modeling task and agent dynamics with MDPs and then using optimization techniques to solve the combinatorial problem of assigning tasks to agents. The computational complexity of the resulting algorithm is polynomial in the number of tasks and constant in the number of agents. Simulation results highlight the performance of the algorithm in a grid-world mobile-target surveillance scenario and demonstrate that these techniques extend to even larger tasking domains.
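The pipeline the abstract describes, scoring agent-task pairings with an MDP solver and then solving a combinatorial assignment over those scores, can be sketched as follows. This is a minimal illustrative sketch, not the paper's algorithm: the toy MDP, the score matrix, and the brute-force assignment search are all assumptions introduced here (the paper instead uses optimization techniques whose cost is polynomial in the number of tasks).

```python
import itertools

def value_iteration(P, R, gamma=0.95, tol=1e-6):
    """Value iteration on a small MDP.
    P[s][a] is a list of (probability, next_state) pairs; R[s][a] is the reward."""
    n = len(P)
    V = [0.0] * n
    while True:
        V_new = [max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                     for a in range(len(P[s])))
                 for s in range(n)]
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new

def best_assignment(score):
    """Brute-force the agent-to-task assignment maximizing the summed MDP values.
    score[i][t] is agent i's expected value for task t (e.g. from value_iteration)."""
    n_agents, n_tasks = len(score), len(score[0])
    best = (float("-inf"), None)
    for perm in itertools.permutations(range(n_tasks), n_agents):
        total = sum(score[i][t] for i, t in enumerate(perm))
        best = max(best, (total, perm))
    return best

# Usage: a two-agent, two-task score matrix; assignment[i] is agent i's task.
score = [[3.0, 1.0], [2.0, 4.0]]
total, assignment = best_assignment(score)  # total 7.0, assignment (0, 1)
```

The exhaustive search is exponential in the number of agents and stands in for the scalable optimization formulation in the paper; its role here is only to make the decoupling concrete: MDPs supply the per-pairing scores, and a separate combinatorial solver picks the assignment.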
