Multi-Agent Relational Reinforcement Learning: Explorations in Multi-State Coordination Tasks

In this paper we report on using a relational state space in multi-agent reinforcement learning. There is growing evidence in the reinforcement learning research community that a relational representation of the state space has many benefits over a propositional one. Complex tasks such as planning or information retrieval on the web can be represented more naturally in relational form. Yet this relational structure has so far only been studied in a single-agent context and has not been exploited for multi-agent reinforcement learning tasks. In this paper we explore the possibilities of using Relational Reinforcement Learning (RRL) in complex multi-agent coordination tasks. More precisely, we consider an abstract multi-state coordination problem that can be seen as a variation and extension of repeated stateless Dispersion Games. Our approach shows that RRL makes it possible to represent a complex multi-agent state space compactly and enables fast convergence of the learning agents. Moreover, with this technique agents can build rich models of other agents' behaviour (in the sense of learning from an expert), predict what other agents will do, and generalize over these models. This provides more powerful tools for solving complex multi-agent planning tasks in which agents need to be adaptive and learn.
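To make the setting concrete, below is a minimal Python sketch of a repeated Dispersion Game with independent Q-learning agents. It is not the authors' implementation: the paper's agents use a relational (first-order) representation learned with an incremental first-order decision tree learner, whereas this sketch stands in a hand-coded "crowding-level" abstraction, under which states that are equivalent up to how crowded each location is share Q-values. All names and constants here (abstract_state, N_AGENTS, the learning parameters) are illustrative assumptions.

```python
"""Sketch of a repeated Dispersion Game with independent Q-learners.
Assumed setup, not the paper's experimental configuration: the
abstract_state() function plays the role of the relational
abstraction by collapsing states that agree on per-location crowding."""

import random
from collections import defaultdict

N_AGENTS = 4          # agents == locations, so full dispersion is achievable
N_LOCATIONS = 4
EPISODES = 5000
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # assumed learning parameters

def abstract_state(counts):
    # Bucket each location's occupancy: 0 = empty, 1 = alone, 2 = crowded.
    # Many concrete joint states map to one abstract state, keeping
    # each agent's Q-table small.
    return tuple(min(c, 2) for c in counts)

# One Q-table per agent: abstract state -> value per location choice.
q_tables = [defaultdict(lambda: [0.0] * N_LOCATIONS) for _ in range(N_AGENTS)]
prev_counts = [0] * N_LOCATIONS

for episode in range(EPISODES):
    state = abstract_state(prev_counts)

    # Epsilon-greedy action selection, independently for each agent.
    actions = []
    for q in q_tables:
        if random.random() < EPSILON:
            actions.append(random.randrange(N_LOCATIONS))
        else:
            values = q[state]
            actions.append(values.index(max(values)))

    counts = [actions.count(loc) for loc in range(N_LOCATIONS)]
    # Dispersion reward: an agent scores only if it is alone at its location.
    rewards = [1.0 if counts[a] == 1 else 0.0 for a in actions]

    # Standard one-step Q-learning update over the abstract states.
    next_state = abstract_state(counts)
    for q, a, r in zip(q_tables, actions, rewards):
        target = r + GAMMA * max(q[next_state])
        q[state][a] += ALPHA * (target - q[state][a])
    prev_counts = counts

print("final joint action:", actions, "rewards:", rewards)
```

With this abstraction the agents typically settle into a dispersed joint action, although nothing in the sketch guarantees convergence; the point is only to show how an abstraction that generalizes over equivalent states shrinks the learning problem, which is what the paper's relational representation achieves in a principled way.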
