Learning to Control in Operational Space

One of the most general frameworks for phrasing control problems for complex, redundant robots is operational-space control. However, while this framework is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots (e.g., humanoid robots). In this paper, we propose treating operational-space control as a direct inverse model learning problem. A first important insight of this paper is that a physically correct solution to the inverse problem with redundant degrees of freedom does exist when the inverse map is learned in a suitable piecewise linear way. The second crucial component of our work is the insight that many operational-space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational-space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm to solve this problem. Evaluations on a three-degree-of-freedom robot arm illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex, high-degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm.
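
To make the idea of immediate-reward policy search concrete, the following is a minimal sketch of an EM-style, reward-weighted regression update on a toy problem. It is not the implementation used in the paper. Everything in it is an assumption chosen for illustration: the static linear "plant" `x = a . u` with two controls and one task variable (so it is redundant), the linear policy `u = theta^T s`, the exponentiated quadratic reward, and the parameter names `sigma`, `beta`, and `c`.

```python
import numpy as np

def reward_weighted_regression(a, n_iters=30, n_samples=2000,
                               sigma=0.3, beta=5.0, c=0.01, seed=0):
    """Toy EM-style policy search with an immediate reward (illustrative only).

    a     : task map of a hypothetical redundant plant, x = a . u
    sigma : exploration noise of the stochastic policy
    beta  : reward temperature; c : weight of the control-effort penalty
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros((1, a.size))  # linear policy: u = s * theta + noise
    for _ in range(n_iters):
        # Sample desired task values and explore around the current policy.
        s = rng.uniform(-1.0, 1.0, size=(n_samples, 1))
        u = s @ theta + sigma * rng.standard_normal((n_samples, a.size))
        x = u @ a  # task value actually achieved by each sampled control
        # Immediate cost: task error plus a small effort term that
        # selects one resolution of the redundancy.
        cost = (s[:, 0] - x) ** 2 + c * np.sum(u ** 2, axis=1)
        w = np.exp(-beta * cost)  # reward used as a regression weight
        # Reward-weighted least squares refit of the policy parameters.
        lhs = s.T @ (w[:, None] * s) + 1e-8 * np.eye(1)
        rhs = s.T @ (w[:, None] * u)
        theta = np.linalg.solve(lhs, rhs)
    return theta

a = np.array([1.0, 0.5])
theta = reward_weighted_regression(a)
task_gain = float(theta[0] @ a)  # close to 1 when the task is tracked
```

In this sketch the effort penalty is what singles out one of the infinitely many controls that achieve the same task value; without it, the component of `theta` orthogonal to `a` would be left undetermined by the reward.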
