Paper information - ROMA: Multi-Agent Reinforcement Learning with Emergent Roles

The role concept provides a useful tool to design and understand complex multi-agent systems, which allows agents with a similar role to share similar behaviors. However, existing role-based methods use prior domain knowledge and predefine role structures and behaviors. In contrast, multi-agent reinforcement learning (MARL) provides flexibility and adaptability, but less efficiency in complex tasks. In this paper, we synergize these two paradigms and propose a role-oriented MARL framework (ROMA). In this framework, roles are emergent, and agents with similar roles tend to share their learning and to be specialized on certain sub-tasks. To this end, we construct a stochastic role embedding space by introducing two novel regularizers and conditioning individual policies on roles. Experiments show that our method can learn specialized, dynamic, and identifiable roles, which help our method push forward the state of the art on the StarCraft II micromanagement benchmark. Demonstrative videos are available online.
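The idea of a stochastic role embedding with role-conditioned policies can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal Gaussian parameterization, the linear maps, the hypernetwork-style weight generation, and every name below (`sample_role`, `role_conditioned_scores`, `w_mu`, `w_logstd`, `w_hyper`) are assumptions chosen for illustration, and the two regularizers mentioned in the abstract are omitted because the abstract does not specify them.

```python
import math
import random


def matvec(vec, mat):
    """Multiply a row vector (list) by a matrix (list of rows)."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; stands in for learned parameters (assumption)."""
    return [[scale * rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def sample_role(obs, w_mu, w_logstd, rng):
    """Draw a role from a diagonal Gaussian whose parameters depend on obs."""
    mu = matvec(obs, w_mu)
    log_std = matvec(obs, w_logstd)
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_std)]


def role_conditioned_scores(obs, role, w_hyper, n_actions):
    """Generate linear policy weights from the role, then score each action."""
    flat = matvec(role, w_hyper)  # length: obs_dim * n_actions
    return [
        sum(o * flat[i * n_actions + k] for i, o in enumerate(obs))
        for k in range(n_actions)
    ]


rng = random.Random(0)
n_agents, obs_dim, role_dim, n_actions = 3, 4, 2, 5

w_mu = random_matrix(obs_dim, role_dim, rng)
w_logstd = random_matrix(obs_dim, role_dim, rng)
w_hyper = random_matrix(role_dim, obs_dim * n_actions, rng)

observations = [[rng.gauss(0.0, 1.0) for _ in range(obs_dim)] for _ in range(n_agents)]
roles = [sample_role(o, w_mu, w_logstd, rng) for o in observations]
scores = [
    role_conditioned_scores(o, r, w_hyper, n_actions)
    for o, r in zip(observations, roles)
]
greedy_actions = [max(range(n_actions), key=s.__getitem__) for s in scores]
```

In this sketch, two agents that draw nearby role vectors receive nearby policy weights from the shared `w_hyper`, which is one simple way agents with similar roles could end up sharing behavior.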

Chongjie Zhang | Heng Dong | Tonghan Wang | Victor Lesser
