Knowledge Transfer Using Local Features

We present a method for reducing the effort required to compute policies for new tasks by reusing solutions to previously solved tasks. The key idea is to use a learned intermediate policy, defined over local features, to create an initial policy for the new task. To further improve this initial policy, we develop a form of generalized policy iteration. When previous experience is available, we achieve a substantial reduction in the computation needed to find policies.
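
The following minimal Python sketch illustrates one way such a transfer could work: a policy learned on a solved task is tabulated against local features, used to seed the policy for a new task, and then refined with ordinary policy iteration. The toy gridworld, the feature definition, and all function names here are our own assumptions for illustration, not the authors' implementation.

```python
import numpy as np

# Hypothetical illustration of feature-based policy transfer.
# States are (row, col) cells; grid[r, c] == 1 marks a wall.

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def local_features(grid, s):
    """4-neighbourhood occupancy around state s (1 = wall or out of bounds)."""
    r, c = s
    feats = []
    for dr, dc in ACTIONS:
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < grid.shape[0] and 0 <= cc < grid.shape[1]
        feats.append(0.0 if inside and grid[rr, cc] == 0 else 1.0)
    return tuple(feats)

def learn_feature_policy(grid, policy):
    """Tabulate the action chosen for each local-feature vector seen on a solved task."""
    return {local_features(grid, s): a for s, a in policy.items()}

def initial_policy_from_features(grid, states, table, default=0):
    """Seed a new task's policy by looking up each state's local features."""
    return {s: table.get(local_features(grid, s), default) for s in states}

def policy_iteration(grid, goal, policy, gamma=0.9, iters=50):
    """Plain policy iteration, warm-started from the transferred policy."""
    V = {s: 0.0 for s in policy}

    def step(s, a):
        if s == goal:                 # goal is absorbing with zero reward
            return s, 0.0
        s2 = (s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1])
        if s2 not in V:               # blocked moves leave the state unchanged
            s2 = s
        return s2, (0.0 if s2 == goal else -1.0)

    for _ in range(iters):
        for _ in range(20):           # approximate policy evaluation sweeps
            for s in V:
                s2, rwd = step(s, policy[s])
                V[s] = rwd + gamma * V[s2]
        stable = True                 # greedy policy improvement
        for s in V:
            q = [rwd + gamma * V[s2] for s2, rwd in (step(s, a) for a in range(len(ACTIONS)))]
            best = int(np.argmax(q))
            if best != policy[s]:
                policy[s], stable = best, False
        if stable:
            break
    return policy
```

Warm-starting policy iteration from the feature-based policy should need fewer improvement sweeps than starting from an arbitrary policy, which mirrors the computational savings the abstract claims.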
