Active Learning in Motor Control

In this dissertation we consider how the performance of a learning control system can be improved with an efficient exploration strategy. We apply principled approaches from active learning to realise task-specific exploration with the Locally Weighted Projection Regression (LWPR) on-line learning scheme. Our algorithm uses the confidence in the model's predictions to direct exploration towards areas of high uncertainty. To the best of our knowledge, this is the first clear strategy for active learning in low-level motor control scenarios. Using two simulations, we show that learning the inverse dynamics of a movement system can benefit from an active data selection strategy. Both simulations (a Newtonian particle and a compliant two-joint robot arm) also provide an intuitive real-time visualisation of the LWPR confidence bounds and of the explored space. The proposed algorithm is shown to be superior to simpler exploration schemes such as random flailing of the robot arm.
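The core loop described above can be summarised as: predict with confidence bounds, then query the state where confidence is lowest. The Python sketch below is a minimal illustration of that idea, not the dissertation's implementation: the toy `true_dynamics` function, the candidate-sampling scheme, and the nearest-neighbour distance used as a stand-in for the LWPR confidence bounds are all assumptions introduced here for illustration.

```python
# Minimal sketch of uncertainty-directed exploration. A crude distance-based
# score stands in for the LWPR confidence bounds: sparsely sampled regions of
# the state space score high and are queried first.
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(x):
    """Hypothetical stand-in for the unknown system the learner models."""
    return np.sin(3.0 * x) + 0.1 * rng.standard_normal()

def uncertainty(x, visited):
    """Confidence proxy: distance to the closest visited state.
    (The actual approach would use the LWPR confidence bounds here.)"""
    return min(abs(x - v) for v in visited)

# Training set so far: visited states and observed outputs.
X = [rng.uniform(-1.0, 1.0)]
Y = [true_dynamics(X[0])]

for step in range(20):
    # Propose candidate states, then query where the model is least certain.
    candidates = rng.uniform(-1.0, 1.0, size=50)
    scores = [uncertainty(c, X) for c in candidates]
    query = float(candidates[int(np.argmax(scores))])
    X.append(query)
    Y.append(true_dynamics(query))

gaps = np.diff(np.sort(X))
print(f"{len(X)} states visited; largest unexplored gap: {gaps.max():.3f}")
```

Replacing the distance heuristic with a model's own predictive confidence leaves the loop unchanged; only the query-selection score differs.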
