A General Framework for Structured Learning of Mechanical Systems

Learning accurate dynamics models is necessary for optimal, compliant control of robotic systems. Current approaches, either white-box modeling with analytic parameterizations or black-box modeling with neural networks, can suffer from high bias or high variance, respectively. We address the need for a flexible, gray-box model of mechanical systems that can seamlessly incorporate prior knowledge where it is available and train expressive function approximators where it is not. We propose to parameterize a mechanical system using neural networks that model its Lagrangian and the generalized forces acting on it. We test our method on a simulated, actuated double pendulum. We show that our method outperforms a naive, black-box model in both data efficiency and performance in model-based reinforcement learning. We also conduct a systematic study of how well our method incorporates available prior knowledge about the system to improve data efficiency.
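
To make the parameterization concrete, the sketch below shows how accelerations might be recovered from a Lagrangian model and a generalized-force model through the Euler-Lagrange equation, d/dt(dL/dq') - dL/dq = F. It is a minimal illustration, not the method's implementation. It assumes a single-degree-of-freedom pendulum rather than the double pendulum. It uses closed-form stand-ins (`lagrangian`, `generalized_force`, both hypothetical names) where neural networks would go, and central finite differences where automatic differentiation would normally be used.

```python
import math


def lagrangian(q, qd, m=1.0, l=1.0, g=9.81):
    """Stand-in for a learned Lagrangian (assumption): kinetic minus
    potential energy of a simple pendulum with mass m and length l."""
    return 0.5 * m * l**2 * qd**2 + m * g * l * math.cos(q)


def generalized_force(q, qd, u, damping=0.1):
    """Stand-in for a learned generalized-force model (assumption):
    actuator torque u minus viscous damping."""
    return u - damping * qd


def acceleration(L, F, q, qd, u, h=1e-3):
    """Solve the Euler-Lagrange equation for the acceleration qdd.

    Expanding d/dt(dL/dqd) - dL/dq = F for one degree of freedom gives
        L_qdqd * qdd + L_qqd * qd - L_q = F,
    so qdd = (F + L_q - L_qqd * qd) / L_qdqd.
    Derivatives are approximated with central finite differences.
    """
    # First derivative of L with respect to position.
    L_q = (L(q + h, qd) - L(q - h, qd)) / (2 * h)
    # Second derivative with respect to velocity (the "mass" term).
    L_qdqd = (L(q, qd + h) - 2 * L(q, qd) + L(q, qd - h)) / h**2
    # Mixed position-velocity derivative (zero for this pendulum).
    L_qqd = (
        L(q + h, qd + h) - L(q + h, qd - h)
        - L(q - h, qd + h) + L(q - h, qd - h)
    ) / (4 * h**2)
    return (F(q, qd, u) + L_q - L_qqd * qd) / L_qdqd


# Usage: acceleration at q = 0.5 rad, qd = 1.0 rad/s, torque u = 2.0.
qdd = acceleration(lagrangian, generalized_force, 0.5, 1.0, 2.0)
```

For this stand-in system the result should agree with the analytic pendulum dynamics, qdd = (u - damping * qd - m g l sin q) / (m l^2). Replacing either stand-in with a trained function approximator leaves `acceleration` unchanged, which is where prior knowledge about one component could be supplied while the other is learned.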
