ROMA: Multi-Agent Reinforcement Learning with Emergent Roles

The role concept provides a useful tool for designing and understanding complex multi-agent systems, as it allows agents with similar roles to share similar behaviors. However, existing role-based methods rely on prior domain knowledge to predefine role structures and behaviors. In contrast, multi-agent reinforcement learning (MARL) provides flexibility and adaptability but is less efficient in complex tasks. In this paper, we synergize these two paradigms and propose a role-oriented MARL framework (ROMA). In this framework, roles are emergent, and agents with similar roles tend to share their learning and to specialize in certain sub-tasks. To this end, we construct a stochastic role embedding space by introducing two novel regularizers and conditioning individual policies on roles. Experiments show that our method can learn specialized, dynamic, and identifiable roles, which help it advance the state of the art on the StarCraft II micromanagement benchmark. Demonstrative videos are available online.
