ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions

Sample-based learning model predictive control (LMPC) strategies have recently attracted attention due to their desirable theoretical properties and their strong empirical performance on robotic tasks. However, prior analysis of LMPC controllers for stochastic systems has mainly focused on linear systems in the iterative learning control setting. We present a novel LMPC algorithm, Adjustable Boundary Condition LMPC (ABC-LMPC), which enables rapid adaptation to novel start and goal configurations, and we show theoretically that the resulting controller guarantees iterative improvement in expectation for stochastic nonlinear systems. We then present a practical instantiation of this algorithm and experimentally demonstrate that the resulting controller adapts to a variety of initial and terminal conditions on three stochastic continuous control tasks.
