Evaluating Emergent Coordination in Multi-Agent Task Allocation Through Causal Inference and Sub-Team Identification

Coordination in multi-agent systems is a vital component of teaming effectiveness. In dynamically changing situations, agent decisions reflect emergent coordination strategies that range from following pre-defined rules to exploiting incentive-driven policies. While multi-agent reinforcement learning shapes team behaviors from experience, interpreting the learned coordination strategies helps in understanding complex agent dynamics and in developing adaptive strategies for evolving and unexpected situations. In this work, we develop an approach to quantitatively measure team coordination by collecting decision time series data, detecting causality between agents, and identifying sub-teams with statistically high coordination in decentralized multi-agent task allocation operations. We focus on multi-agent systems with homogeneous agents and homogeneous tasks, because strategy formation there is more ambiguous and challenging than in heterogeneous teams with specialized capabilities. Emergent team coordination is then analyzed using rule-based and reinforcement learning-based strategies for task allocation in operations at different demand (stress) levels. We also investigate correlation versus causation and cases where agents over- or under-estimate demand levels.
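The three-step pipeline named in the abstract (collect decision time series, estimate directed influence between agents, group strongly coupled agents) can be illustrated with a minimal sketch. This is not the method of the paper: lagged Pearson correlation is used here only as a simple stand-in for a causal-inference measure, and connected components stand in for a statistical sub-team test. All function names, the synthetic data, the lag of 1, and the 0.5 threshold are assumptions made for illustration.

```python
import math
import random


def lagged_corr(x, y, lag=1):
    """Pearson correlation between x[t] and y[t + lag] (lag must be >= 1)."""
    a, b = x[:-lag], y[lag:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0  # a constant series carries no usable signal
    return cov / math.sqrt(va * vb)


def influence_matrix(series, lag=1):
    """m[i][j] = |lagged correlation| of agent i's past with agent j's future."""
    n = len(series)
    return [
        [abs(lagged_corr(series[i], series[j], lag)) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def sub_teams(m, threshold=0.5):
    """Group agents linked by an influence score >= threshold in either direction."""
    n = len(m)
    seen, teams = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, team = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            team.append(i)
            for j in range(n):
                if j not in seen and max(m[i][j], m[j][i]) >= threshold:
                    seen.add(j)
                    stack.append(j)
        teams.append(sorted(team))
    return teams


def demo_series(length=200, seed=0):
    """Hypothetical binary decision logs: agents 1 and 3 copy agents 0 and 2 one step late."""
    rng = random.Random(seed)
    leader_a = [rng.randint(0, 1) for _ in range(length)]
    leader_b = [rng.randint(0, 1) for _ in range(length)]
    follower_a = [0] + leader_a[:-1]
    follower_b = [0] + leader_b[:-1]
    return [leader_a, follower_a, leader_b, follower_b]


series = demo_series()
m = influence_matrix(series)
print(sub_teams(m))
```

On this synthetic data each follower should be grouped with its leader, and the score is asymmetric: the leader's past predicts the follower's future but not the reverse. A lagged correlation still cannot separate causation from a shared hidden driver, which is the correlation-versus-causation concern the abstract raises.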
