Paper information - Learning Task Allocation via Multi-level Policy Gradient Algorithm with Dynamic Learning Rate

Task allocation is the process of assigning tasks to appropriate resources. To achieve scalability, it is common to use a network of agents (also called mediators) that handles task allocation. This work proposes a novel multi-level policy gradient algorithm to solve the local decision problem at each mediator agent. The higher-level policy stochastically chooses a task decomposition. The lower-level policy then stochastically assigns the resulting subtasks to neighboring agents. Agents learn autonomously, cooperatively, and concurrently to increase system performance. No state information is used except for the task being allocated. Furthermore, to speed up convergence, the algorithm dynamically adjusts the learning rate using the ratio of action values. Experimental results show that the proposed solution outperforms deterministic approaches by balancing the load over resources and converging faster to better policies.
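
As a rough illustration of the two-level structure described in the abstract, the Python sketch below shows one mediator that first samples a task decomposition and then samples a neighbor for each subtask. The abstract does not give the update equations, so everything in the sketch is an assumption rather than the authors' algorithm: the names `StochasticPolicy`, `Mediator`, `dynamic_rate`, and `train` are hypothetical, the value estimate is a plain running average, the policy step is a simple clip-and-renormalize gradient-style update, and the learning-rate rule is only one possible way of using a ratio of action values. The sketch also assumes positive rewards and a single task type per mediator.

```python
import random


def dynamic_rate(values, base_rate):
    """Assumed rule (not from the paper): take larger steps when the ratio of
    the smallest to the largest action value is far from 1."""
    lo, hi = min(values), max(values)
    if hi <= 0:
        return base_rate
    ratio = max(lo, 0.0) / hi           # in [0, 1] for positive rewards
    return base_rate * (2.0 - ratio)    # between base_rate and 2 * base_rate


class StochasticPolicy:
    """Probability vector over actions plus running action-value estimates."""

    def __init__(self, n_actions, base_rate=0.05, value_rate=0.2, floor=0.01):
        self.probs = [1.0 / n_actions] * n_actions
        self.values = [0.0] * n_actions
        self.base_rate = base_rate
        self.value_rate = value_rate
        self.floor = floor              # keeps every action explorable

    def choose(self, rng):
        return rng.choices(range(len(self.probs)), weights=self.probs, k=1)[0]

    def update(self, action, reward):
        # Track the value of the action that was actually taken.
        self.values[action] += self.value_rate * (reward - self.values[action])
        rate = dynamic_rate(self.values, self.base_rate)
        mean = sum(p * v for p, v in zip(self.probs, self.values))
        # Raise actions valued above the policy's expected value, lower the rest.
        raw = [max(self.floor, p + rate * (v - mean))
               for p, v in zip(self.probs, self.values)]
        total = sum(raw)
        self.probs = [x / total for x in raw]


class Mediator:
    """Hypothetical mediator for a single task type."""

    def __init__(self, decompositions, neighbors, seed=0):
        self.rng = random.Random(seed)
        self.decompositions = decompositions    # list of tuples of subtasks
        self.neighbors = neighbors
        # Higher level: one distribution over decompositions.
        self.high = StochasticPolicy(len(decompositions))
        # Lower level: one distribution over neighbors per subtask.
        self.low = {s: StochasticPolicy(len(neighbors))
                    for d in decompositions for s in d}

    def allocate(self):
        d = self.high.choose(self.rng)
        picks = {s: self.low[s].choose(self.rng) for s in self.decompositions[d]}
        return d, picks

    def learn(self, d, picks, reward):
        self.high.update(d, reward)
        for subtask, neighbor in picks.items():
            self.low[subtask].update(neighbor, reward)


def train(mediator, reward_fn, steps):
    """Repeatedly allocate, observe a reward, and update both levels."""
    for _ in range(steps):
        d, picks = mediator.allocate()
        mediator.learn(d, picks, reward_fn(d, picks))
```

A caller would supply `reward_fn(d, picks)`, for example a positive score that shrinks as the chosen neighbors become more heavily loaded. Because both levels stay stochastic, repeated allocations of the same task can be spread over several neighbors instead of always going to the single best-looking one.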

Victor Lesser | Sherief Abdallah
