Exchanging Advice and Learning to Trust

One of the most important features of “intelligent behaviour” is the ability to learn from experience. The introduction of Multiagent Systems brings new challenges to research in Machine Learning. New difficulties, but also new advantages, arise when learning takes place in an environment in which agents can communicate and cooperate. The main question that drives this work is: “How can agents benefit from communicating with their peers during the learning process to improve their individual and global performance?” We are particularly interested in environments where speed and bandwidth limitations do not allow highly structured communication, and where learning agents may use different algorithms. The concept of advice-exchange, which started out as a mixture of reinforcement and supervised learning procedures, is developing into a meta-learning architecture that allows learning agents to improve their learning skills by exchanging information with their peers. This paper reports the latest experiments and results on this subject.
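To make the idea concrete, below is a minimal sketch of an advice-exchanging learner, assuming a tabular Q-learning agent that occasionally requests the greedy action of its most trusted peer and adjusts a per-advisor trust value from the outcome of following that advice. All class names, parameters, and the trust-update rule are hypothetical illustrations, not the architecture evaluated in the paper.

```python
import random
from collections import defaultdict

class AdviceExchangeAgent:
    """Hypothetical sketch of an advice-exchanging Q-learner.

    Assumptions (not from the paper): tabular Q-learning, a scalar
    trust value per advisor, and advice accepted with a probability
    proportional to that trust.
    """

    def __init__(self, name, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.name = name
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)             # (state, action) -> value
        self.trust = defaultdict(lambda: 0.5)   # advisor name -> trust in [0, 1]

    def advise(self, state):
        # Peers answer advice requests with their own greedy action.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def act(self, state, advisors=None):
        # On exploration steps, ask the most trusted peer instead of
        # acting randomly (hypothetical advice-request policy).
        if advisors and random.random() < self.epsilon:
            advisor = max(advisors, key=lambda a: self.trust[a.name])
            advised = advisor.advise(state)
            if random.random() < self.trust[advisor.name]:
                return advised, advisor.name
        if random.random() < self.epsilon:
            return random.choice(self.actions), None
        return max(self.actions, key=lambda a: self.q[(state, a)]), None

    def learn(self, state, action, reward, next_state, advisor=None):
        # Standard Q-learning update on the agent's own experience.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
        # Illustrative trust update: advice followed by a positive TD
        # error raises trust in that advisor; a negative one lowers it.
        if advisor is not None:
            self.trust[advisor] += 0.05 * (1 if td_error > 0 else -1)
            self.trust[advisor] = min(1.0, max(0.0, self.trust[advisor]))
```

Used this way, advice acts as a guided form of exploration: the agent still learns from its own reward signal, while the trust values let it discount peers whose advice has not paid off, which matches the intuition of “learning to trust” in the title.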
