A reservoir computing approach for learning forward dynamics of industrial manipulators

Many robot learning algorithms depend on a model of the robot's forward dynamics for simulating potential trajectories and, ultimately, learning a required task. In this paper, we present a data-driven reservoir computing approach and apply it to learning forward dynamics models. Our proposed machine learning algorithm exploits the concepts of a dynamic reservoir, self-organized learning, and Bayesian inference. We evaluated our approach on datasets gathered from two industrial robotic manipulators and compared it with state-of-the-art algorithms in both step-by-step and multi-step trajectory prediction scenarios. The evaluation considers the algorithms' convergence, prediction performance in joint and operational space over varying prediction horizons, and computational time. Results show that the proposed algorithm outperforms the state of the art, converges quickly, and achieves accurate predictions over longer horizons, which makes it a reliable, data-efficient approach for learning forward models.
