Decentralized coordination via task decomposition and reward shaping
In this work, we introduce a method for decentralized coordination in cooperative multiagent multi-task problems in which the subtasks and the agents are homogeneous. With the proposed method, agents coordinate at the level of high-level task selection, using the knowledge they gather while learning the subtasks. We first introduce a subtask selection method for single-agent multi-task MDPs, and then extend it to multiagent multi-task MDPs by applying reward shaping at the subtask level to coordinate the agents. Our results on a multi-rover problem show that agents combining task decomposition with subtask-based difference rewards improve significantly in both learning speed and the quality of the converged policies.
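To make the notion of a difference reward concrete, the following is a minimal sketch of its standard form, D_i = G(z) - G(z_-i), where G is the global reward and z_-i is the joint choice with agent i removed. It is not the paper's subtask-level formulation: the toy setting (each agent picks one subtask, and a subtask pays its value once if at least one agent picks it) and the names `global_reward`, `difference_reward`, `values`, and `choices` are assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def global_reward(choices: List[Optional[int]], values: Dict[int, float]) -> float:
    """Assumed toy global reward: each subtask pays its value once
    if at least one agent selected it (None means the agent is absent)."""
    covered = {c for c in choices if c is not None}
    return sum(values[c] for c in covered)


def difference_reward(i: int, choices: List[Optional[int]],
                      values: Dict[int, float]) -> float:
    """Standard difference reward: global reward minus the global reward
    of the counterfactual in which agent i is removed."""
    counterfactual = list(choices)
    counterfactual[i] = None  # remove agent i from the joint choice
    return global_reward(choices, values) - global_reward(counterfactual, values)


# Hypothetical example: two subtasks, three agents.
values = {0: 10.0, 1: 4.0}
choices = [0, 0, 1]  # agents 0 and 1 crowd subtask 0, agent 2 takes subtask 1
rewards = [difference_reward(i, choices, values) for i in range(len(choices))]
print(rewards)
```

In this toy case the two agents that crowd the same subtask each receive a difference reward of zero, since removing either one leaves the global reward unchanged, while the agent that alone covers the second subtask is credited with its full value. This is the kind of signal that can push agents toward distinct subtasks without central control.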