Paper information - Safe Optimal Control Using Stochastic Barrier Functions and Deep Forward-Backward SDEs

This paper introduces a new formulation for stochastic optimal control and stochastic dynamic optimization that ensures safety with respect to state and control constraints. The proposed methodology brings together concepts such as Forward-Backward Stochastic Differential Equations, Stochastic Barrier Functions, Differentiable Convex Optimization and Deep Learning. Using the aforementioned concepts, a Neural Network architecture is designed for safe trajectory optimization in which learning can be performed in an end-to-end fashion. Simulations are performed on three systems to show the efficacy of the proposed methodology.

Evangelos A. Theodorou | Ziyi Wang | Ioannis Exarchos | Marcus Aloysius Pereira
