Deep Learning Tubes for Tube MPC

Learning-based control aims to construct models of a system for use in planning or trajectory optimization, e.g., in model-based reinforcement learning. To obtain safety guarantees in this context, uncertainty must be accurately quantified. This uncertainty may come from errors in learning (due to a lack of data, for example) or may be inherent to the system. Propagating uncertainty forward through learned dynamics models is a difficult problem. In this work, we use deep learning to obtain expressive and flexible models of how distributions of trajectories behave, which we then use for nonlinear Model Predictive Control (MPC). We introduce a deep quantile regression framework for control that enforces probabilistic quantile bounds and quantifies epistemic uncertainty. Using our method, we explore three different approaches for learning tubes that contain the possible trajectories of the system, and demonstrate how to use each of them in a Tube MPC scheme. We prove that these schemes are recursively feasible and satisfy constraints with a desired margin of probability. We present experiments in simulation on a nonlinear quadrotor system, demonstrating the practical efficacy of these ideas.
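
To make the idea of a quantile bound concrete, the sketch below shows the standard quantile (pinball) loss that quantile regression minimizes, and uses it to fit a single constant bound on a set of synthetic tracking errors. It is a minimal illustration, not the framework described in the abstract: the function names `pinball_loss` and `fit_constant_quantile`, the Gaussian error samples, and the step-size settings are all assumptions made for the example, and a deep model would replace the constant `q` with a network output.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau):
    """Mean quantile (pinball) loss at level tau in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def fit_constant_quantile(y, tau, lr=0.05, steps=2000):
    """Fit one constant q that minimizes the pinball loss by subgradient descent.

    Hypothetical helper for illustration only.
    """
    q = float(np.mean(y))
    for _ in range(steps):
        # Subgradient w.r.t. q: -tau where y > q, (1 - tau) where y <= q.
        grad = float(np.mean(np.where(y > q, -tau, 1.0 - tau)))
        q -= lr * grad
    return q


# Synthetic one-dimensional "tracking errors" (an assumption, not data from the paper).
rng = np.random.default_rng(0)
errors = rng.normal(0.0, 1.0, size=5000)

q90 = fit_constant_quantile(errors, tau=0.9)
coverage = float(np.mean(errors <= q90))  # fraction of samples under the bound
print(f"learned bound: {q90:.3f}, empirical coverage: {coverage:.3f}")
```

The minimizer of this loss is the `tau`-quantile of the data, so roughly a fraction `tau` of the samples fall below the learned bound. That is the sense in which a quantile model can serve as a probabilistic tube boundary.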
