Paper Information - Data-Efficient Model Learning and Prediction for Contact-Rich Manipulation Tasks

In this letter, we investigate learning forward dynamics models and multi-step prediction of state variables (long-term prediction) for contact-rich manipulation. The problems are formulated in the context of model-based reinforcement learning (MBRL). We focus on two aspects, discontinuous dynamics and data efficiency, both of which are important in the identified scope and pose significant challenges to state-of-the-art methods. We contribute to closing this gap by proposing a method that explicitly adopts a specific hybrid structure for the model while leveraging the uncertainty representation and data efficiency of Gaussian processes. Our experiments on an illustrative moving-block task and a 7-DOF robot demonstrate a clear advantage over popular baselines in low-data regimes.
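To make the idea of a hybrid forward model concrete, the following is a minimal sketch, not the method from the letter. It assumes a hypothetical 1-D block whose response to a push changes abruptly at a contact location (`WALL`), mode labels that are known for the training data, a guard learned as a simple threshold, and a small hand-written Gaussian process (`SimpleGP`) with fixed hyperparameters. All names, gains, and values are illustrative assumptions. The rollout propagates only the predictive mean; the per-step variances are reported but input uncertainty is not propagated.

```python
import numpy as np

WALL = 1.0  # hypothetical contact location (assumption, not from the letter)

def true_step(x, u):
    """Toy discontinuous dynamics: pushes are damped once the block is in contact."""
    gain = 1.0 if x < WALL else 0.2
    return x + gain * u

def rbf(A, B, length=0.3):
    """Squared-exponential kernel with unit prior variance."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length**2)

class SimpleGP:
    """Zero-mean GP regression with fixed hyperparameters (illustrative only)."""
    def __init__(self, X, y, noise=1e-4):
        self.X = X
        self.L = np.linalg.cholesky(rbf(X, X) + noise * np.eye(len(X)))
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, Xs):
        Ks = rbf(Xs, self.X)
        v = np.linalg.solve(self.L, Ks.T)
        var = 1.0 - np.sum(v**2, axis=0)  # prior variance minus explained part
        return Ks @ self.alpha, np.maximum(var, 0.0)

# Training set: a small grid of (state, action) pairs with one-step state changes.
xs, us = np.meshgrid(np.linspace(0.0, 2.0, 21), np.linspace(-0.2, 0.2, 5))
X = np.column_stack([xs.ravel(), us.ravel()])
y = np.array([true_step(x, u) - x for x, u in X])

# Mode labels are assumed known here; in practice they would have to be inferred.
in_contact = X[:, 0] >= WALL
# Guard: midpoint between the largest free state and the smallest contact state.
guard = 0.5 * (X[~in_contact, 0].max() + X[in_contact, 0].min())
# One GP expert per mode, so no expert has to fit the discontinuity.
experts = {
    False: SimpleGP(X[~in_contact], y[~in_contact]),
    True: SimpleGP(X[in_contact], y[in_contact]),
}

def rollout(x0, actions):
    """Multi-step prediction: pick the active mode, then apply that mode's GP."""
    x, means, variances = x0, [], []
    for u in actions:
        gp = experts[bool(x >= guard)]
        delta, var = gp.predict(np.array([[x, u]]))
        x = x + delta[0]
        means.append(x)
        variances.append(var[0])
    return np.array(means), np.array(variances)

actions = [0.15] * 8
pred, pred_var = rollout(0.3, actions)

# Ground-truth trajectory for comparison.
truth, x = [], 0.3
for u in actions:
    x = true_step(x, u)
    truth.append(x)
truth = np.array(truth)
print("max abs rollout error:", np.max(np.abs(pred - truth)))
```

Splitting the data by mode means each expert only has to fit a smooth function, which is why a handful of samples per mode can suffice in this toy setting. A single smooth model would instead have to approximate the jump at the contact location.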
