Coaching Advice and Adaptation

Our research on coaching concerns one autonomous agent providing advice to another autonomous agent about how to act. In past work, we dealt with advice-receiving agents that had fixed strategies; we now consider agents that are learning. Further, we consider agents that have various limitations, with the hypothesis that if the coach adapts its advice to those limitations, more effective learning will result. In this work, we systematically explore the effect of such limitations on the effectiveness of the coach's advice. We state the two learning problems faced by the coach and the coached agent, and study these problems empirically in a predator-prey environment. The coach has access to optimal policies for the environment and advises the predator on which actions to take. We experiment with limitations on the predator agent's actions, on the bandwidth between the coach and the agent, and on the agent's memory size. Our analysis of the results shows that coaching can improve agent performance in the face of all of these limitations.
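
To make the setup concrete, here is a minimal toy sketch, not the paper's implementation: a Q-learning predator chases a randomly moving prey on a small toroidal grid, while a coach suggests actions. Every specific below is an assumption for illustration only: the grid size, the reward values, a greedy pursuit rule standing in for the coach's access to optimal policies, and a per-episode "advice budget" standing in for the limited coach-agent bandwidth.

```python
# Toy sketch of bandwidth-limited coaching for a Q-learning predator.
# All specifics (grid size, rewards, greedy coach rule, advice budget)
# are illustrative assumptions, not the paper's actual environment.
import random
from collections import defaultdict

SIZE = 5                                      # toroidal SIZE x SIZE grid
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # N, S, E, W

def step(pos, move):
    """Move one cell, wrapping around the grid edges."""
    return ((pos[0] + move[0]) % SIZE, (pos[1] + move[1]) % SIZE)

def torus_dist(a, b):
    """Manhattan distance on the torus."""
    dx = min(abs(a[0] - b[0]), SIZE - abs(a[0] - b[0]))
    dy = min(abs(a[1] - b[1]), SIZE - abs(a[1] - b[1]))
    return dx + dy

def coach_advice(pred, prey):
    """Hypothetical coach policy: the action that most shrinks the distance."""
    return min(range(len(ACTIONS)),
               key=lambda a: torus_dist(step(pred, ACTIONS[a]), prey))

def run(episodes=2000, advice_budget=5, alpha=0.3, gamma=0.9, eps=0.1):
    Q = defaultdict(float)                    # Q[(state, action)] -> value
    captures = 0
    for _ in range(episodes):
        pred, prey = (0, 0), (SIZE // 2, SIZE // 2)
        budget = advice_budget                # advice messages left this episode
        for _ in range(50):                   # cap on episode length
            s = (pred, prey)
            if budget > 0:                    # follow the coach while bandwidth lasts
                a, budget = coach_advice(pred, prey), budget - 1
            elif random.random() < eps:       # otherwise act epsilon-greedily on Q
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
            pred = step(pred, ACTIONS[a])
            caught = pred == prey
            if not caught:
                prey = step(prey, random.choice(ACTIONS))
            r = 1.0 if caught else -0.01
            s2 = (pred, prey)
            best = 0.0 if caught else max(Q[(s2, x)] for x in range(len(ACTIONS)))
            Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])  # Q-learning update
            if caught:
                captures += 1
                break
    return captures

if __name__ == "__main__":
    random.seed(0)
    print("captures with 5 advice messages/episode:", run(advice_budget=5))
    print("captures with no advice:                ", run(advice_budget=0))
```

Raising or lowering advice_budget models looser or tighter bandwidth between coach and agent; the paper's action and memory limitations could be sketched analogously, for example by restricting ACTIONS or by coarsening the state the agent stores.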
