Transfer of Learned Knowledge in Life-Long Learning Agents

Previous work has demonstrated that the performance of machine learning algorithms can be improved by exploiting various forms of knowledge, such as domain theories. More recently, it has been recognized that some forms of knowledge can in turn be learned, in particular action models and task-specific internal representations. Using learned knowledge as a source of learning improvement can be particularly appropriate for agents that face many tasks. Over a long lifetime, an agent can amortize the effort expended in learning knowledge by reducing the number of examples required to learn further tasks. In developing such a "life-long learning" agent, a number of research issues arise, including: will an agent benefit from learned knowledge, can an agent exploit multiple sources of learned knowledge, how should the agent adapt as a new task arrives, how might the order of task arrival impact learning, and how can such an agent be built?

I propose that an agent can be constructed which learns knowledge and exploits that knowledge to effectively improve further learning by reducing the number of examples required to learn. I intend to study the transfer of learned knowledge by life-long learning agents within a neural-network-based architecture capable of increasing capacity with the number of tasks faced. This proposal describes an appropriate architecture, based on preliminary work in controlled settings. This work has shown that learned knowledge can reduce the number of examples required to learn novel tasks, and that combining previously separate mechanisms can yield a synergistic improvement in learning ability. It has also explored how capacity can be expanded as new tasks arise over time and how the order in which tasks arise can be exploited with a graded curriculum.
This preliminary work will be applied to a life-long learning agent and extended by carrying out experimental studies of a simulated robot agent in a controlled environment and of a real-world mobile robot agent in Wean Hall.

Thesis Committee: Tom Mitchell (Chair), Sebastian Thrun, Manuela Veloso, Jude Shavlik (University of Wisconsin)
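The proposal does not spell out the architecture here, but the core idea, a network whose capacity grows with the number of tasks while earlier learning is reused, can be illustrated with a minimal sketch. The sketch below, in Python with NumPy, keeps a fixed shared hidden layer (standing in for representations learned on earlier tasks) and adds one small output head per new task, training only that head. All names (`LifelongNet`, `add_task`, `train_head`) and the specific training scheme are hypothetical illustrations, not the system the proposal describes.

```python
import numpy as np

rng = np.random.default_rng(0)

class LifelongNet:
    """A shared hidden layer with one output head per task; the set of
    heads grows as new tasks arrive, so capacity scales with task count."""

    def __init__(self, n_in, n_hidden):
        # Shared representation, standing in for features learned on prior tasks.
        self.W = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.heads = {}  # task name -> output weight vector

    def add_task(self, name):
        # Expanding capacity for a new task costs only one small head.
        self.heads[name] = rng.normal(0.0, 0.1, self.W.shape[0])

    def hidden(self, x):
        return np.tanh(self.W @ x)

    def predict(self, name, x):
        return float(self.heads[name] @ self.hidden(x))

    def mse(self, name, X, y):
        return float(np.mean([(self.predict(name, x) - t) ** 2
                              for x, t in zip(X, y)]))

    def train_head(self, name, X, y, lr=0.1, epochs=100):
        # Only the task head is trained; the shared features are reused,
        # which is one sense in which prior learning can reduce the
        # effort needed for a new task.
        for _ in range(epochs):
            for x, t in zip(X, y):
                h = self.hidden(x)
                self.heads[name] -= lr * (self.heads[name] @ h - t) * h

# Toy usage: one new task arrives and only its head is fit.
net = LifelongNet(n_in=3, n_hidden=8)
X = [rng.normal(size=3) for _ in range(30)]
y = [1.0 if x[0] > 0 else 0.0 for x in X]  # an arbitrary toy target

net.add_task("task_a")
mse_before = net.mse("task_a", X, y)
net.train_head("task_a", X, y)
mse_after = net.mse("task_a", X, y)
```

In a real life-long learning setting the shared weights would themselves be learned and refined across tasks; here they are frozen random features purely to keep the capacity-expansion mechanism visible.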
