Transfer of Learned Knowledge in Life-Long Learning Agents

Previous work has demonstrated that the performance of machine learning algorithms can be improved by exploiting various forms of knowledge, such as domain theories. More recently, it has been recognized that some forms of knowledge can in turn be learned, in particular action models and task-specific internal representations. Using learned knowledge as a source of learning improvement can be particularly appropriate for agents that face many tasks. Over a long lifetime, an agent can amortize the effort expended in learning knowledge by reducing the number of examples required to learn further tasks. In developing such a "life-long learning" agent, a number of research issues arise, including: will an agent benefit from learned knowledge, can an agent exploit multiple sources of learned knowledge, how should the agent adapt as a new task arrives, how might the order of task arrival impact learning, and how can such an agent be built?

I propose that an agent can be constructed which learns knowledge and exploits that knowledge to effectively improve further learning by reducing the number of examples required to learn. I intend to study the transfer of learned knowledge by life-long learning agents within a neural-network-based architecture capable of increasing capacity with the number of tasks faced. This proposal describes an appropriate architecture, based on preliminary work in controlled settings. This work has shown that learned knowledge can reduce the number of examples required to learn novel tasks, and that combining previously separate mechanisms can yield a synergistic improvement in learning ability. It has also explored how capacity can be expanded as new tasks arise over time and how the order in which tasks arise can be exploited with a graded curriculum.
This preliminary work will be applied to a life-long learning agent and extended by carrying out experimental studies of a simulated robot agent in a controlled environment and of a real-world mobile robot agent in Wean Hall.

Thesis Committee: Tom Mitchell (Chair), Sebastian Thrun, Manuela Veloso, Jude Shavlik (University of Wisconsin)
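The proposal does not spell out the architecture here, but the core idea, a network whose capacity grows with the number of tasks while earlier learning is reused, can be illustrated with a minimal sketch. The sketch below, in Python with NumPy, keeps a fixed shared hidden layer (standing in for representations learned on earlier tasks) and adds one small output head per new task, training only that head. All names (`LifelongNet`, `add_task`, `train_head`) and the specific training scheme are hypothetical illustrations, not the system the proposal describes.

```python
import numpy as np

rng = np.random.default_rng(0)

class LifelongNet:
    """A shared hidden layer with one output head per task; the set of
    heads grows as new tasks arrive, so capacity scales with task count."""

    def __init__(self, n_in, n_hidden):
        # Shared representation, standing in for features learned on prior tasks.
        self.W = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.heads = {}  # task name -> output weight vector

    def add_task(self, name):
        # Expanding capacity for a new task costs only one small head.
        self.heads[name] = rng.normal(0.0, 0.1, self.W.shape[0])

    def hidden(self, x):
        return np.tanh(self.W @ x)

    def predict(self, name, x):
        return float(self.heads[name] @ self.hidden(x))

    def mse(self, name, X, y):
        return float(np.mean([(self.predict(name, x) - t) ** 2
                              for x, t in zip(X, y)]))

    def train_head(self, name, X, y, lr=0.1, epochs=100):
        # Only the task head is trained; the shared features are reused,
        # which is one sense in which prior learning can reduce the
        # effort needed for a new task.
        for _ in range(epochs):
            for x, t in zip(X, y):
                h = self.hidden(x)
                self.heads[name] -= lr * (self.heads[name] @ h - t) * h

# Toy usage: one new task arrives and only its head is fit.
net = LifelongNet(n_in=3, n_hidden=8)
X = [rng.normal(size=3) for _ in range(30)]
y = [1.0 if x[0] > 0 else 0.0 for x in X]  # an arbitrary toy target

net.add_task("task_a")
mse_before = net.mse("task_a", X, y)
net.train_head("task_a", X, y)
mse_after = net.mse("task_a", X, y)
```

In a real life-long learning setting the shared weights would themselves be learned and refined across tasks; here they are frozen random features purely to keep the capacity-expansion mechanism visible.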
