Learning control Lyapunov function to ensure stability of dynamical system-based robot reaching motions

We consider an imitation learning approach to model robot point-to-point (also known as discrete or reaching) movements with a set of autonomous Dynamical Systems (DS). Each DS model encodes a behavior (such as reaching for a cup or swinging a golf club) at the kinematic level. An estimate of these DS models is usually obtained from a set of demonstrations of the task. When modeling robot discrete motions with DS, ensuring stability of the learned DS is a key requirement for obtaining a useful policy. In this paper we propose an imitation learning approach that uses a Control Lyapunov Function (CLF) control scheme to ensure global asymptotic stability of nonlinear DS. Given a set of demonstrations of a task, our approach proceeds in three steps: (1) learning a valid Lyapunov function from the demonstrations by solving a constrained optimization problem, (2) using a state-of-the-art regression technique to build a (possibly unstable) estimate of the motion from the demonstrations, and (3) using the Lyapunov function from (1) to ensure stability of the estimate from (2) during task execution by solving a constrained convex optimization problem. The proposed approach allows learning a larger set of robot motions than existing methods that are based on quadratic Lyapunov functions. Additionally, with the CLF formalism, the problem of ensuring stability of DS motions becomes independent of the choice of regression method. The user can therefore adopt the most appropriate technique for the requirements of the task at hand without compromising stability. We evaluate our approach both in simulation and on the 7-degree-of-freedom Barrett WAM arm.

Highlights: Proposing a new parameterization to model complex Lyapunov functions. Estimating task-oriented Lyapunov functions from demonstrations. Ensuring stability of nonlinear autonomous dynamical systems. Applicability to any smooth regression method.
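Step (3) of the approach can be illustrated with a minimal sketch, given here under stated assumptions rather than as the authors' implementation. For readability it uses a plain quadratic Lyapunov function V(x) = x^T P x with the target at the origin, whereas the paper learns a more general function from demonstrations. The names `grad_V`, `stabilize`, `simulate`, and the decay bound `rho0 * ||x||^2` are hypothetical choices made for this illustration. The correction u is the minimum-norm solution of min ||u||^2 subject to grad V(x)^T (f_hat(x) + u) <= -rho(x). With a single linear constraint this convex problem has a closed-form solution, so no numerical solver is needed here.

```python
import numpy as np


def grad_V(x, P):
    """Gradient of the illustrative Lyapunov function V(x) = x^T P x."""
    return (P + P.T) @ x


def stabilize(x, f_hat, P, rho0=1.0):
    """Return the corrected velocity f_hat(x) + u(x).

    u is the minimum-norm correction satisfying
        grad_V(x)^T (f_hat(x) + u) <= -rho(x).
    rho(x) = rho0 * ||x||^2 is an assumed positive-definite decay bound.
    """
    g = grad_V(x, P)
    v = f_hat(x)
    gg = g @ g
    if gg == 0.0:
        # The gradient vanishes only at the target (origin): stop there.
        return np.zeros_like(x)
    rho = rho0 * (x @ x)
    violation = g @ v + rho
    if violation <= 0.0:
        # The estimate already decreases V fast enough: leave it untouched.
        return v
    # Project out just enough of the velocity along grad V.
    return v - (violation / gg) * g


def simulate(velocity, x0, dt=0.01, steps=2000):
    """Forward-Euler integration of x_dot = velocity(x)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * velocity(x)
    return x


if __name__ == "__main__":
    # An unstable spiral stands in for the regression estimate of step (2).
    A = np.array([[0.5, 1.0], [-1.0, 0.5]])
    f_hat = lambda x: A @ x
    P = np.eye(2)
    print("uncorrected:", simulate(f_hat, [1.0, 1.0]))
    print("corrected:  ", simulate(lambda x: stabilize(x, f_hat, P), [1.0, 1.0]))
```

Because the correction depends on the estimate only through its output `f_hat(x)`, any regression method can be substituted for the linear stand-in, which is the independence property the abstract describes.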
