Implementation of Imitation Learning using Natural Learner Central Pattern Generator Neural Networks

This paper introduces a new neural network design that is able to generate oscillatory patterns. The fundamental building block of the network is the O-neuron, which generates an oscillation through its transfer function. Because the network follows the central pattern generator (CPG) paradigm and is trained with natural policy gradient learning, it is called the Natural Learner CPG Neural Network (NLCPGNN). O-neurons are connected and coupled to one another to form a network, and their unknown parameters are found by a natural policy gradient learning algorithm. The main contribution of this paper is the design of this learning algorithm, which searches for the weights and the topology of the network simultaneously. The system is capable of acquiring any complex motion and rhythmic trajectory via its first layer, learning rhythmic trajectories in its second layer, and converging towards all of these movements. Moreover, this two-layer system provides various features of a learner model, for instance resistance against perturbations and modulation of trajectory amplitude and frequency. The learning system was simulated in the Webots robot simulator linked with MATLAB. Implementation on a real NAO robot demonstrates that the robot learned the desired motion with high accuracy. These results show that the proposed system achieves a high convergence rate and low test errors.
