Data-Efficient Model Learning and Prediction for Contact-Rich Manipulation Tasks

In this letter, we investigate learning forward dynamics models and multi-step prediction of state variables (long-term prediction) for contact-rich manipulation. The problems are formulated in the context of model-based reinforcement learning (MBRL). We focus on two aspects, discontinuous dynamics and data efficiency, both of which are important in the identified scope and pose significant challenges to state-of-the-art methods. We address these challenges by proposing a method that explicitly adopts a specific hybrid structure for the model while leveraging the uncertainty representation and data efficiency of Gaussian processes. Our experiments on an illustrative moving block task and a 7-DOF robot demonstrate a clear advantage over popular baselines in low-data regimes.
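To make the terms "discontinuous dynamics" and "long-term prediction" concrete, the following minimal sketch rolls out a hand-written one-step forward model of a block sliding into a wall. It is not the method proposed in the letter: the model is hard-coded rather than learned, it carries no uncertainty, and every name and constant (`step`, `rollout`, `WALL`, `RESTITUTION`, `DT`) is a hypothetical choice made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]  # (position, velocity) of a unit-mass block

# Hypothetical constants chosen only for this illustration.
WALL = 1.0         # wall position
RESTITUTION = 0.5  # fraction of speed kept after impact
DT = 0.1           # integration time step


def step(state: State, force: float) -> State:
    """One-step forward model with two modes: free motion and contact."""
    pos, vel = state
    vel = vel + force * DT  # unit mass, so acceleration equals force
    pos = pos + vel * DT
    if pos >= WALL and vel > 0.0:
        # Contact mode: the velocity jumps discontinuously at impact.
        pos = WALL
        vel = -RESTITUTION * vel
    return (pos, vel)


def rollout(
    model: Callable[[State, float], State],
    state: State,
    forces: Sequence[float],
) -> List[State]:
    """Multi-step prediction: feed each predicted state back into the model."""
    trajectory = [state]
    for force in forces:
        state = model(state, force)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # The block starts at rest position 0 moving toward the wall, with no applied force.
    for pos, vel in rollout(step, (0.0, 2.0), [0.0] * 12):
        print(f"pos={pos:.2f} vel={vel:.2f}")
```

Because each prediction is fed back as the next input, a one-step model that smooths over the velocity jump at impact accumulates error over the horizon, which is why the mode switch matters for long-term prediction.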
