ALMA: Hierarchical Learning for Composite Multi-Agent Tasks

Despite significant progress on multi-agent reinforcement learning (MARL) in recent years, coordination in complex domains remains a challenge. Work in MARL often focuses on solving tasks where agents interact with all other agents and entities in the environment; however, we observe that real-world tasks are often composed of several isolated instances of local agent interaction (subtasks), and each agent can meaningfully focus on one subtask to the exclusion of all else in the environment. In these composite tasks, successful policies can often be decomposed into two levels of decision-making: agents are allocated to specific subtasks, and each agent acts productively towards its assigned subtask alone. This decomposed decision-making provides a strong structural inductive bias, significantly reduces agent observation spaces, and encourages subtask-specific policies to be reused and composed during training, rather than treating each new composition of subtasks as unique. We introduce ALMA, a general learning method for taking advantage of these structured tasks. ALMA simultaneously learns a high-level subtask allocation policy and low-level agent policies. We demonstrate that ALMA learns sophisticated coordination behavior in a number of challenging environments, outperforming strong baselines. ALMA's modularity also enables it to better generalize to new environment configurations. Finally, we find that while ALMA can integrate separately trained allocation and action policies, the best performance is obtained only by training all components jointly.
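
As a rough illustration of the two-level structure described above, the sketch below separates decision-making into a high-level allocation step and low-level subtask-conditioned actions. It is a minimal sketch only: the class names, the linear scoring, and the greedy per-agent assignment are assumptions for illustration, not ALMA's actual learned architecture (which trains both levels jointly with reinforcement learning).

```python
# Minimal sketch of the two-level decomposition: a high-level policy
# allocates agents to subtasks, and each agent then acts on the local
# observation of its assigned subtask only. Illustrative assumption,
# not ALMA's actual implementation.
import numpy as np

rng = np.random.default_rng(0)

class AllocationPolicy:
    """High level: scores every (agent, subtask) pair from the global state."""
    def __init__(self, n_agents, n_subtasks, state_dim):
        self.n_agents, self.n_subtasks = n_agents, n_subtasks
        self.W = rng.normal(size=(state_dim, n_agents * n_subtasks))

    def allocate(self, global_state):
        scores = (global_state @ self.W).reshape(self.n_agents, self.n_subtasks)
        return scores.argmax(axis=1)  # greedy: one subtask index per agent

class AgentPolicy:
    """Low level: acts from the assigned subtask's local observation alone."""
    def __init__(self, obs_dim, n_actions):
        self.W = rng.normal(size=(obs_dim, n_actions))

    def act(self, subtask_obs):
        return int((subtask_obs @ self.W).argmax())

# Toy decision step: 3 agents, 2 subtasks.
n_agents, n_subtasks, state_dim, obs_dim, n_actions = 3, 2, 8, 4, 5
alloc = AllocationPolicy(n_agents, n_subtasks, state_dim)
agents = [AgentPolicy(obs_dim, n_actions) for _ in range(n_agents)]

global_state = rng.normal(size=state_dim)
subtask_obs = rng.normal(size=(n_subtasks, obs_dim))  # one local view per subtask

assignment = alloc.allocate(global_state)
actions = [agents[i].act(subtask_obs[assignment[i]]) for i in range(n_agents)]
print("assignment:", assignment, "actions:", actions)
```

The point of the structure is that each low-level policy conditions only on its assigned subtask's local observation, which is what shrinks observation spaces and lets subtask-specific behavior be reused across new compositions of subtasks.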
