Learning to Cooperate using Hierarchical Reinforcement Learning

In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-agent RL framework and present a hierarchical multi-agent RL algorithm called Cooperative HRL. The fundamental property of our approach is that the use of hierarchy allows agents to learn coordination faster by sharing information at the level of subtasks, rather than attempting to learn coordination at the level of primitive actions. We study the performance of the Cooperative HRL algorithm using a four-agent automated guided vehicle (AGV) scheduling problem. We also address the issue of rational communication behavior among autonomous agents: the goal is for agents to learn both action and communication policies that together optimize the task given a communication cost. We extend our multi-agent HRL framework to include communication decisions and present a cooperative multi-agent HRL algorithm called COM-Cooperative HRL. We demonstrate the efficiency of this algorithm, as well as the relation between the communication cost and the learned communication policy, using a multi-agent taxi problem.
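To make the subtask-level coordination idea concrete, the following is a minimal tabular sketch, not the paper's exact algorithm: each agent's action values at a cooperation level are conditioned on the subtasks the other agents are currently executing, and are updated with an SMDP-style rule when the chosen subtask terminates. All identifiers (SUBTASKS, choose_subtask, update) and the toy subtask set are illustrative assumptions, not taken from the paper.

import random
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95
SUBTASKS = ["deliver", "fetch", "idle"]  # illustrative high-level subtasks

# Q[agent][(state, others_subtasks)][subtask] -> value.
# Conditioning on others' subtasks (not their primitive actions) is the
# coordination information shared between agents.
Q = [defaultdict(lambda: {u: 0.0 for u in SUBTASKS}) for _ in range(2)]

def choose_subtask(agent, state, others, eps=0.1):
    """Epsilon-greedy over the agent's own subtask values, conditioned on
    the subtasks the other agents are executing."""
    q = Q[agent][(state, others)]
    if random.random() < eps:
        return random.choice(SUBTASKS)
    return max(q, key=q.get)

def update(agent, state, others, subtask, reward, steps, next_state, next_others):
    """SMDP-style Q-update at the subtask level: `reward` is the discounted
    return accumulated over the `steps` primitive steps the subtask took to
    complete, so the bootstrap term is discounted by GAMMA ** steps."""
    best_next = max(Q[agent][(next_state, next_others)].values())
    q = Q[agent][(state, others)]
    q[subtask] += ALPHA * (reward + (GAMMA ** steps) * best_next - q[subtask])

Because coordination is learned over a handful of subtasks rather than over joint primitive actions, the conditioned state space each agent must explore is far smaller, which is the source of the speedup the paper claims.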