Argumentation Accelerated Reinforcement Learning for Cooperative Multi-Agent Systems

Multi-agent learning is a complex problem, especially in real-time systems. We address it by introducing Argumentation Accelerated Reinforcement Learning (AARL), a methodology for defining heuristics as arguments and incorporating them into Reinforcement Learning (RL) through reward shaping. We define AARL via argumentation and prove that it can coordinate independent cooperative agents that share a goal but must perform different actions. We evaluate AARL empirically in a popular RL testbed, RoboCup Takeaway, and show that it significantly improves upon standard RL.
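To make the reward-shaping idea concrete, the following is a minimal sketch, not the AARL implementation. It assumes the common potential-based form over state-action pairs, F = gamma * phi(s', a') - phi(s, a), which the abstract does not specify. The state labels, action names, the RECOMMENDED table, and the function names are all hypothetical stand-ins for the actions that an argumentation step might recommend.

```python
from typing import Callable, Hashable

State = Hashable
Action = Hashable

# Hypothetical heuristic table: the action that the accepted arguments
# would recommend in each (coarsely labeled) state. Not from the paper.
RECOMMENDED = {"near_ball": "tackle", "far_from_ball": "mark"}


def phi(state: State, action: Action) -> float:
    """Potential: 1.0 if the action matches the recommended one, else 0.0."""
    return 1.0 if RECOMMENDED.get(state) == action else 0.0


def shaped_reward(
    reward: float,
    potential: Callable[[State, Action], float],
    s: State,
    a: Action,
    s_next: State,
    a_next: Action,
    gamma: float = 0.99,
) -> float:
    """Environment reward plus the assumed potential-based shaping term."""
    return reward + gamma * potential(s_next, a_next) - potential(s, a)


# Moving from a non-recommended action to a recommended one earns a bonus.
bonus = shaped_reward(0.0, phi, "far_from_ball", "tackle",
                      "near_ball", "tackle", gamma=0.9)

# Staying on recommended actions costs only the discount gap.
steady = shaped_reward(0.0, phi, "far_from_ball", "mark",
                       "near_ball", "tackle", gamma=0.9)
```

In this sketch the learner would update its value estimates with `shaped_reward(...)` in place of the raw environment reward, so the heuristic nudges early exploration while the environment reward still defines the task.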
