Probabilistic Model Learning and Long-term Prediction for Contact-rich Manipulation Tasks

Learning dynamics models is an essential component of model-based reinforcement learning. A learned model can be used for multi-step-ahead prediction of the state, a process referred to as long-term prediction. Because these predictions are made recursively, with each predicted state fed back into the model, small one-step errors can compound; the model must therefore be accurate enough to prevent significant error buildup. Accurate model learning in contact-rich manipulation is challenging because the dynamics switch between distinct regimes and are discontinuous at the regime boundaries. A further challenge is the discontinuity in the state evolution itself caused by impact conditions. Building on the approach of representing contact dynamics as a system of switching models, we present a solution that also supports discontinuous state evolution. We evaluate our method on a contact-rich motion task involving a 7-DOF industrial robot, using a trajectory-centric policy, and show that it can effectively propagate state distributions through discontinuities.
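The recursive long-term prediction described above can be illustrated with a minimal Monte Carlo sketch. This is not the paper's method, only a generic particle-based rollout through a hypothetical probabilistic one-step model: each particle is pushed through the model, sampled from the predicted Gaussian, and fed back in, so prediction uncertainty accumulates over the horizon. The `toy_model` below, with its two hand-picked regimes, is purely illustrative of a switching (contact-like) dynamics.

```python
import numpy as np

def rollout_particles(step_model, x0, horizon, n_particles=100, rng=None):
    """Monte Carlo long-term prediction: recursively propagate particles
    through a probabilistic one-step model for `horizon` steps."""
    rng = np.random.default_rng(rng)
    # All particles start at the same initial state.
    particles = np.tile(np.asarray(x0, dtype=float), (n_particles, 1))
    trajectory = [particles.copy()]
    for _ in range(horizon):
        # The model returns a per-particle Gaussian (mean, variance).
        mean, var = step_model(particles)
        # Sample the next state and feed it back in (the recursion
        # responsible for error buildup over long horizons).
        particles = mean + np.sqrt(var) * rng.standard_normal(particles.shape)
        trajectory.append(particles.copy())
    return np.stack(trajectory)  # shape: (horizon + 1, n_particles, state_dim)

def toy_model(x):
    """Hypothetical switching dynamics with two regimes and small noise."""
    free = x + 0.1       # "free-space" regime drifts forward
    contact = 0.5 * x    # "contact" regime pulls the state back
    mean = np.where(x < 1.0, free, contact)
    var = np.full_like(x, 1e-4)
    return mean, var
```

With a sampling-based rollout like this, the predicted state distribution is represented nonparametrically, so it can in principle straddle a discontinuity (particles on either side of the switching boundary follow different regimes), which is the kind of behavior moment-matching rollouts struggle with.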
