Advice-Exchange Between Evolutionary Algorithms and Reinforcement Learning Agents: Experiments in the Pursuit Domain

This research studies the effects of exchanging information during the learning process in multiagent systems. The concept of advice-exchange, introduced in previous contributions, consists of enabling an agent to request extra feedback, in the form of episodic advice, from other agents that are solving similar problems. Work that previously focused on the exchange of information between agents solving separate, detached problems is now concerned with groups of learning agents that share the same environment, a change that added new difficulties to the task. The experiments reported below were conducted to detect the causes of, and correct, the shortcomings that emerged when moving from environments where agents worked on detached problems to environments where agents interact in the same one. New concepts, such as self-confidence, trust, and advisor preference, are introduced in this text.
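To make the advice-exchange idea concrete, the sketch below shows one possible way an agent could combine self-confidence, trust, and advisor preference to decide when to request episodic advice from a peer. It is a minimal illustration under assumed definitions (the class name, the confidence and preference formulas, and the trust update are all assumptions of this sketch, not the implementation evaluated in the paper).

```python
import random


class AdviceExchangeAgent:
    """Sketch of a learner that may ask better-performing peers for advice.

    All names, formulas, and thresholds here are illustrative assumptions,
    not the mechanism described in the paper.
    """

    def __init__(self, name, peers=None):
        self.name = name
        self.peers = peers or []   # other agents learning in the same environment
        self.avg_reward = 0.0      # running estimate of own performance
        self.best_reward = 1e-9    # best performance observed so far (avoids /0)
        self.trust = {}            # advisor name -> trust level in [0, 1]

    def self_confidence(self):
        # Confidence: current performance relative to the agent's own best.
        return self.avg_reward / self.best_reward

    def advisor_preference(self, peer):
        # Preference combines a peer's advertised performance with trust in it.
        return peer.avg_reward * self.trust.get(peer.name, 0.5)

    def choose_action(self, state, own_policy_action):
        """Follow own policy unless a preferred advisor looks clearly better."""
        if not self.peers:
            return own_policy_action
        advisor = max(self.peers, key=self.advisor_preference)
        # Request episodic advice only when self-confidence is low compared to
        # the advisor's trust-weighted relative performance.
        if self.self_confidence() < self.advisor_preference(advisor) / self.best_reward:
            return advisor.advise(state)
        return own_policy_action

    def advise(self, state):
        # Placeholder: a real advisor would return its own policy's action.
        return random.choice(["up", "down", "left", "right"])

    def update_trust(self, advisor_name, advice_was_useful):
        # Nudge trust toward 1 or 0 depending on whether the advice paid off.
        t = self.trust.get(advisor_name, 0.5)
        target = 1.0 if advice_was_useful else 0.0
        self.trust[advisor_name] = t + 0.1 * (target - t)
```

Under these assumptions, an agent with low self-confidence defers to the peer it currently prefers most, and trust in that peer is adjusted afterwards according to whether the advice improved the agent's reward.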

[1] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.

[2] Sandip Sen, et al. Sharing a concept, 2002.

[3] Craig Boutilier, et al. Implicit Imitation in Multiagent Reinforcement Learning, 1999, ICML.

[4] Sandip Sen, et al. Strongly Typed Genetic Programming in Evolving Cooperation Strategies, 1995, ICGA.

[5] X. Yao. Evolving Artificial Neural Networks, 1999.

[6] Craig Boutilier, et al. Imitation and Reinforcement Learning in Agents with Heterogeneous Actions, 2001, Canadian Conference on AI.

[7] Luís Nunes, et al. On Learning by Exchanging Advice, 2002, ArXiv.

[8] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.

[9] John H. Holland, et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, 1992.

[10] Sandip Sen, et al. Individual learning of coordination knowledge, 1998, J. Exp. Theor. Artif. Intell.

[11] John R. Koza, et al. Genetic Programming: On the Programming of Computers by Means of Natural Selection, 1993, Complex Adaptive Systems.

[12] R. P. Salustowicz, et al. A Genetic Algorithm for the Topological Optimization of Neural Networks, 1995.

[13] Steven D. Whitehead, et al. A Complexity Analysis of Cooperative Mechanisms in Reinforcement Learning, 1991, AAAI.

[14] J. Clouse. Chapter 22 – The Role of Training in Reinforcement Learning, 1997.

[15] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[16] Paul E. Utgoff, et al. On integrating apprentice learning and reinforcement learning, 1996.

[17] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.

[18] M. Benda, et al. On Optimal Cooperation of Knowledge Sources, 1985.

[19] Paul E. Utgoff, et al. Two Kinds of Training Information for Evaluation Function Learning, 1991, AAAI.

[20] Paul E. Utgoff, et al. A Teaching Method for Reinforcement Learning, 1992, ML.

[21] Eugénio C. Oliveira, et al. Advice-exchange in heterogeneous groups of learning agents, 2003, AAMAS '03.

[22] Sandip Sen, et al. Learning to Coordinate without Sharing Information, 1994, AAAI.

[23] Katia P. Sycara, et al. Evolution of Goal-Directed Behavior from Limited Information in a Complex Environment, 1999, GECCO.