Explorative learning of inverse models: A theoretical perspective

We investigate the role of redundancy in exploratory learning of inverse functions, where an agent learns to achieve goals by performing actions and observing outcomes. We analyze linear redundancy and investigate goal-directed exploration approaches, which are empirically successful but have received little theoretical treatment beyond negative results for special cases, and we prove convergence to the optimal solution. We show that the learning curves of such processes are intrinsically low-dimensional and S-shaped, which explains previous empirical findings. Finally, we compare our results to non-linear domains.
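As a rough illustration of the setting described above, the following toy sketch runs goal-directed exploration on a linear, redundant forward function (several action dimensions, one outcome dimension). Everything in it is an assumption made for illustration and is not taken from the paper: the function name `goal_babbling_linear`, the noise-free gradient-style update, the alternating goals, the learning rate, and the initialization along `w`.

```python
import numpy as np

def goal_babbling_linear(w, theta0, eta=0.5, steps=2000):
    """Toy goal-directed exploration on the forward function x = w . q.

    The inverse estimate is linear: q = theta * goal. All choices here
    (update rule, goals, eta) are illustrative assumptions.
    """
    w = np.asarray(w, dtype=float)
    theta = np.asarray(theta0, dtype=float).copy()
    gains = []
    for t in range(steps):
        goal = 1.0 if t % 2 == 0 else -1.0   # alternate between two goals
        action = theta * goal                # try to reach the goal
        outcome = w @ action                 # observe what actually happened
        # Fit the inverse estimate to the observed (outcome, action) pair.
        theta += eta * (action - theta * outcome) * outcome
        gains.append(w @ theta)              # equals 1.0 when goals are reached
    return theta, np.array(gains)

w = np.array([1.0, 2.0, 0.5])                # redundant: 3 actions, 1 outcome
theta0 = 0.01 * w / (w @ w)                  # weak initial estimate along w
theta, gains = goal_babbling_linear(w, theta0)
```

In this sketch the scalar `w @ theta` fully describes the learning progress, and it rises slowly at first, then quickly, then saturates near 1. Because the assumed initialization lies along `w`, the estimate ends at the minimum-norm inverse `w / (w @ w)`; a different initialization would not do so under this simplified, noise-free update.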
