Paper Information - Learning with Whom to Communicate Using Relational Reinforcement Learning

Relational reinforcement learning is a promising new direction within reinforcement learning research. It upgrades reinforcement learning techniques by using relational representations for states, actions, and learned value functions or policies, allowing more natural representations and abstractions of complex tasks. Multi-agent systems are a good example of such a complex task and are often characterized by their relational structure. In this paper, we show how relational reinforcement learning can be a useful tool for learning in multi-agent systems, and we study the approach in more detail on one aspect of such systems: learning a communication policy for cooperative systems (e.g., resource distribution). We perform a number of exploratory experiments that highlight the conditions under which relational representations are beneficial.
