Deep Forward-Backward SDEs for Min-max Control

This paper presents a novel approach to numerically solving stochastic differential games for nonlinear systems. The approach relies on the nonlinear Feynman-Kac theorem, which establishes a connection between deterministic parabolic partial differential equations and forward-backward stochastic differential equations (FBSDEs). Using this theorem, the Hamilton-Jacobi-Isaacs partial differential equation associated with differential games is represented by a system of FBSDEs. This system is solved numerically using importance sampling and a neural network with Long Short-Term Memory and fully connected layers. The resulting algorithm is tested in simulation on two example systems and compared against the standard risk-neutral stochastic optimal control formulations.
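The PDE-to-FBSDE link described in the abstract can be illustrated on a much simpler problem than the one the paper treats. The sketch below is a toy under stated assumptions, not the authors' algorithm: it uses the linear PDE V_t + 0.5*sigma^2*V_xx = 0 with terminal condition V(T, x) = x^2, whose closed-form solution V(t, x) = x^2 + sigma^2*(T - t) stands in for the quantities a neural network would have to learn. There is no control, no adversary, no importance sampling, and no LSTM here. The function name `simulate_fbsde` and all parameter values are hypothetical.

```python
import math
import random


def simulate_fbsde(x0=1.0, sigma=1.0, T=1.0, steps=100, paths=2000, seed=0):
    """Toy Feynman-Kac check for V_t + 0.5*sigma^2*V_xx = 0, V(T, x) = x^2.

    Forward SDE:  dX = sigma dW
    Backward SDE: dY = Z dW, with Y_t = V(t, X_t) and Z_t = sigma * dV/dx.
    Assumption: V is known in closed form, so Y_0 and Z_t are exact here.
    """
    rng = random.Random(seed)
    dt = T / steps
    y0 = x0 ** 2 + sigma ** 2 * T  # closed-form V(0, x0)
    mismatch_sum = 0.0
    payoff_sum = 0.0
    for _ in range(paths):
        x, y = x0, y0
        for _ in range(steps):
            dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment
            z = sigma * 2.0 * x                 # Z_t = sigma * dV/dx = 2*sigma*x
            y += z * dw                         # backward SDE stepped forward in time
            x += sigma * dw                     # forward SDE (Euler-Maruyama)
        mismatch_sum += y - x ** 2              # Y_T should match g(X_T) = X_T^2
        payoff_sum += x ** 2
    mc_value = payoff_sum / paths               # Monte Carlo estimate of E[g(X_T)]
    mean_mismatch = mismatch_sum / paths
    return y0, mc_value, mean_mismatch


y0, mc_value, mean_mismatch = simulate_fbsde()
print(f"V(0, x0) = {y0:.3f}, Monte Carlo E[g(X_T)] = {mc_value:.3f}")
print(f"mean terminal mismatch Y_T - g(X_T) = {mean_mismatch:.4f}")
```

In this toy the terminal mismatch is small only because the exact initial value and gradient are supplied. In a learned FBSDE solver of the kind the abstract describes, a mismatch of this type would plausibly serve as the training signal for the network that predicts those quantities.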

Evangelos A. Theodorou | Ziyi Wang | Ioannis Exarchos | Keuntaek Lee | Marcus A. Pereira
