A Multiagent Task Associated MDP (MTAMDP) Approach to Resource Allocation

This paper contributes an effective method for solving a specific type of real-time stochastic resource allocation problem known to be NP-Complete. The problem's main distinguishing feature is the large number of possible interacting actions that can be executed across a group of tasks. To address this complex resource management problem, we propose an adaptation of the Multiagent Markov Decision Process (MMDP) model that centralizes the computation of interacting resources. This adaptation, called the Multiagent Task Associated Markov Decision Process (MTAMDP), produces a near-optimal solution policy in much less time than a standard MMDP approach. In an MTAMDP, a planning agent computes a policy for each resource, and a central agent coordinates all of these planning agents. MTAMDPs thus make it practical to solve this NP-Complete problem.
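The division of labor described above, with one planning agent per resource and a central agent coordinating them, can be illustrated with a toy one-step allocation. This is only a minimal sketch under stated assumptions and not the MTAMDP algorithm itself: the task values, success probabilities, class names, and the greedy coordination rule are all hypothetical, chosen to show how independently computed per-resource choices can interact on a shared task and why a central step is useful.

```python
# Toy illustration only: all data, names, and the greedy rule are assumptions,
# not taken from the paper.

# Hypothetical reward for accomplishing each task.
TASK_VALUE = {"t1": 10.0, "t2": 6.0}

# Hypothetical probability that a resource accomplishes a task in one step.
SUCCESS_PROB = {
    "r1": {"t1": 0.8, "t2": 0.7},
    "r2": {"t1": 0.8, "t2": 0.7},
}


class PlanningAgent:
    """Plans for a single resource, ignoring every other resource."""

    def __init__(self, resource, success_prob):
        self.resource = resource
        self.success_prob = success_prob

    def local_choice(self, task_value):
        # Expected one-step reward of using this resource alone on each task.
        q = {t: self.success_prob[t] * v for t, v in task_value.items()}
        return max(q, key=q.get)


class CentralAgent:
    """Coordinates the planning agents by accounting for shared tasks."""

    def __init__(self, agents, task_value):
        self.agents = agents
        self.task_value = task_value

    def joint_value(self, assignment):
        # A task fails only if every resource assigned to it fails.
        total = 0.0
        for task, value in self.task_value.items():
            miss = 1.0
            for agent in self.agents:
                if assignment.get(agent.resource) == task:
                    miss *= 1.0 - agent.success_prob[task]
            total += value * (1.0 - miss)
        return total

    def coordinate(self):
        # Greedy rule (an assumption): fix agents one at a time, each taking
        # the task with the best joint value given earlier assignments.
        assignment = {}
        for agent in self.agents:
            assignment[agent.resource] = max(
                self.task_value,
                key=lambda t: self.joint_value({**assignment, agent.resource: t}),
            )
        return assignment


agents = [PlanningAgent(r, p) for r, p in SUCCESS_PROB.items()]
central = CentralAgent(agents, TASK_VALUE)

uncoordinated = {a.resource: a.local_choice(TASK_VALUE) for a in agents}
coordinated = central.coordinate()

print("uncoordinated:", uncoordinated, central.joint_value(uncoordinated))
print("coordinated:  ", coordinated, central.joint_value(coordinated))
```

In this toy instance both planning agents independently prefer the same high-value task, so their efforts overlap; the central agent redirects one resource to the other task and obtains a higher joint expected value. The greedy pass is used here only because it is short, and it carries no optimality guarantee.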
