Paper information: Learning with Whom to Communicate Using Relational Reinforcement Learning

Relational reinforcement learning (RRL) has emerged in the machine learning community as a promising new subfield of reinforcement learning (RL) (e.g. [1]). It upgrades RL techniques by using relational representations for states, actions, and learned value functions or policies, which allows more natural representations and abstractions of complex tasks. This substantially reduces the state space, making it easier to generalize and to infer new knowledge.
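As a rough illustration of the state space reduction described above, the following Python sketch compares a propositional enumeration of states with a single relational abstraction. The agents, skills, and the `abstract` function are hypothetical and are not taken from the paper; they only show how a relational feature can collapse many ground states into a few abstract ones.

```python
from itertools import product

# Hypothetical toy domain: three named agents, each with exactly one skill,
# and one task whose type matches one of the skills.
AGENTS = ["a1", "a2", "a3"]
SKILLS = ["paint", "weld", "drill"]


def ground_states():
    """Enumerate propositional states: a task type plus one skill per named agent."""
    for task in SKILLS:
        for skills in product(SKILLS, repeat=len(AGENTS)):
            yield task, dict(zip(AGENTS, skills))


def abstract(state):
    """Relational feature: is there some agent X with skill(X, T) for task type T?"""
    task, skill_of = state
    return any(skill == task for skill in skill_of.values())


ground = list(ground_states())
abstract_states = {abstract(s) for s in ground}

# Number of ground states versus number of abstract states.
print(len(ground), len(abstract_states))
```

In this toy setting the abstraction ignores which named agent holds which skill and keeps only whether a suitable agent exists, so a value function defined over the abstract states would need far fewer entries than one defined over the ground states.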

[1] Peter Stone, et al. Layered Learning in Multiagent Systems, 1997, AAAI/IAAI.

[2] Martijn van Otterlo, et al. The logic of adaptive behavior: knowledge representation and algorithms for the Markov decision process framework in first-order domains, 2008.

[3] Ioan Alfred Letia, et al. Developing Collaborative Golog Agents by Reinforcement Learning, 2002, Int. J. Artif. Intell. Tools.

[4] Mehdi Dastani, et al. A characterization of sapient agents, 2003, IEMC '03 Proceedings. Managing Technologically Driven Organizations: The Human Side of Innovation and Change (IEEE Cat. No.03CH37502).

[5] Thomas Lukasiewicz, et al. Game theoretic Golog under partial observability, 2005, AAMAS '05.

[6] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[7] Karl Tuyls, et al. Analyzing Multi-agent Reinforcement Learning Using Evolutionary Dynamics, 2004, ECML.

[8] Eduardo F. Morales, et al. Learning to fly by combining reinforcement learning with behavioural cloning, 2004, ICML.

[9] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[10] Sean Luke, et al. Cooperative Multi-Agent Learning: The State of the Art, 2005, Autonomous Agents and Multi-Agent Systems.

[11] Sandip Sen, et al. Towards a pareto-optimal solution in general-sum games, 2003, AAMAS '03.

[12] Robert Givan, et al. Approximate Policy Iteration with a Policy Language Bias, 2003, NIPS.

[13] Carlos Guestrin, et al. Generalizing plans to new environments in relational MDPs, 2003, IJCAI 2003.

[14] Maurice Bruynooghe, et al. Multi-agent Relational Reinforcement Learning, 2005, LAMAS.

[15] Sean R Eddy, et al. What is dynamic programming?, 2004, Nature Biotechnology.

[16] De, et al. Relational Reinforcement Learning, 2001, Encyclopedia of Machine Learning and Data Mining.

[17] Karl Tuyls, et al. Theoretical Advantages of Lenient Learners: An Evolutionary Game Theoretic Perspective, 2008, J. Mach. Learn. Res.

[18] Amal El Fallah Seghrouchni, et al. Learning in BDI Multi-agent Systems, 2004, CLIMA.

[19] Luc De Raedt, et al. Bellman goes relational, 2004, ICML.

[20] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.

[21] Robert Givan, et al. Relational Reinforcement Learning: An Overview, 2004, ICML 2004.

[22] Craig Boutilier, et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.

[23] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.

[24] K. Tuyls, et al. Multi-Agent Relational Reinforcement Learning Explorations in Multi-State Coordination Tasks, 2006.

[25] Kurt Driessens, et al. Speeding Up Relational Reinforcement Learning through the Use of an Incremental First Order Decision Tree Learner, 2001, ECML.

[26] Craig Boutilier, et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, 1998, AAAI/IAAI.

[27] Ann Nowé, et al. Social Agents Playing a Periodical Policy, 2001, ECML.

[28] Michael P. Wellman, et al. Experimental Results on Q-Learning for General-Sum Stochastic Games, 2000, ICML.

[29] Kagan Tumer, et al. Collective Intelligence and Braess' Paradox, 2000, AAAI/IAAI.

[30] Ioan Alfred Letia, et al. Developing collaborative Golog agents by reinforcement learning, 2001, Proceedings 13th IEEE International Conference on Tools with Artificial Intelligence. ICTAI 2001.

[31] Marco Wiering, et al. Explorations in efficient reinforcement learning, 1999.

[32] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[33] Luc De Raedt, et al. Inductive Logic Programming: Theory and Methods, 1994, J. Log. Program.

[34] Tom Lenaerts, et al. A selection-mutation model for q-learning in multi-agent systems, 2003, AAMAS '03.

[35] Saso Dzeroski, et al. Integrating Guidance into Relational Reinforcement Learning, 2004, Machine Learning.

[36] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.

[37] Sylvie Thiébaux, et al. Exploiting First-Order Regression in Inductive Policy Selection, 2004, UAI.