Constrained Physics-Informed Deep Learning for Stable System Identification and Control of Unknown Linear Systems

This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

This paper presents a novel data-driven method for learning deep constrained continuous control policies and dynamical models of linear systems. By leveraging partial knowledge of the system dynamics and constraint-enforcing multi-objective loss functions, the method can learn from small, static datasets, handle time-varying state and input constraints, and enforce stability properties of the controlled system. We use a continuous control design example to demonstrate the performance of the method on three distinct tasks: system identification, control policy learning, and simultaneous system identification and policy learning. We assess system identification performance by comparing open-loop simulations of the true system and the learned models. We demonstrate the performance of the policy learning methodology in closed-loop simulations in which the system model is affected by varying levels of parametric and additive uncertainty. We report superior performance in terms of reference tracking, robustness, and online computational and memory footprints compared with classical control approaches, namely LQR and LQI controllers, and with three variants of model predictive control (MPC) formulations and two traditional MPC solution approaches. We then evaluate the potential of simultaneously learning the system model and the control policy. Our empirical results demonstrate that our unifying framework for constrained optimal control of linear systems provides stability guarantees for the learned dynamics, robustness to uncertainty, and high sampling efficiency.
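As a rough illustration of what a constraint-enforcing multi-objective loss can look like, the sketch below combines a reference-tracking term with penalty terms for violations of time-varying state and input bounds. It is a minimal sketch for a scalar system, not the loss used in the paper: the function names, the hinge-style penalty, and the weights `w_track`, `w_state`, and `w_input` are all assumptions made for illustration.

```python
from typing import Sequence


def hinge(violation: float) -> float:
    """Zero when a constraint holds, linear in the amount of violation otherwise."""
    return max(0.0, violation)


def constrained_tracking_loss(
    x: Sequence[float],      # state trajectory (scalar state, hypothetical setup)
    u: Sequence[float],      # input trajectory
    r: Sequence[float],      # reference trajectory
    x_lo: Sequence[float],   # time-varying lower state bounds
    x_hi: Sequence[float],   # time-varying upper state bounds
    u_lo: Sequence[float],   # time-varying lower input bounds
    u_hi: Sequence[float],   # time-varying upper input bounds
    w_track: float = 1.0,    # assumed weights, chosen only for illustration
    w_state: float = 10.0,
    w_input: float = 10.0,
) -> float:
    """Weighted sum of mean tracking error and mean bound-violation penalties."""
    n = len(x)
    track = sum((xi - ri) ** 2 for xi, ri in zip(x, r)) / n
    state_pen = sum(
        hinge(lo - xi) + hinge(xi - hi) for xi, lo, hi in zip(x, x_lo, x_hi)
    ) / n
    input_pen = sum(
        hinge(lo - ui) + hinge(ui - hi) for ui, lo, hi in zip(u, u_lo, u_hi)
    ) / n
    return w_track * track + w_state * state_pen + w_input * input_pen
```

Because each penalty is zero whenever its bound is satisfied, the weights only shape the trade-off when a trajectory leaves the admissible region. In a deep learning framework the same structure would typically be written with differentiable tensor operations so that gradients can flow back to the policy and model parameters.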
