Deep Forward-Backward SDEs for Min-max Control

This paper presents a novel approach to numerically solving stochastic differential games for nonlinear systems. The approach relies on the nonlinear Feynman-Kac theorem, which establishes a connection between deterministic parabolic partial differential equations and forward-backward stochastic differential equations (FBSDEs). Using this theorem, the Hamilton-Jacobi-Isaacs partial differential equation associated with differential games is represented by a system of FBSDEs. This system is solved numerically using importance sampling and a neural network with Long Short-Term Memory and fully connected layers. The resulting algorithm is tested on two example systems in simulation and compared against the standard risk-neutral stochastic optimal control formulations.
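
To make the link between a parabolic PDE and a forward-backward SDE pair concrete, the following is a minimal sketch on a toy problem that is not taken from the paper. It assumes a driftless scalar state dX = sigma dW, no running cost, and terminal cost g(x) = x^2, for which the PDE solution is v(t, x) = x^2 + sigma^2 (T - t). The function name `simulate_fbsde` and all parameter values are hypothetical. The closed-form initial value and gradient stand in for whatever a trained network would supply, and the sketch omits importance sampling and the min-max structure entirely.

```python
import numpy as np


def simulate_fbsde(x0=1.0, sigma=0.5, T=1.0, n_steps=200, n_paths=2000, seed=0):
    """Euler-Maruyama simulation of a toy forward-backward SDE pair.

    Forward process:   dX = sigma dW,  X_0 = x0
    Backward process:  dY = Z dW,      Y_T = g(X_T) = X_T^2
    with Y_t = v(t, X_t) and Z_t = sigma * dv/dx = 2 * sigma * X_t.
    (Toy assumptions chosen for illustration, not the paper's systems.)
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    # Closed-form initial value v(0, x0); a learned parameter would play
    # this role when no analytic solution is available.
    y = np.full(n_paths, x0**2 + sigma**2 * T, dtype=float)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        z = 2.0 * sigma * x      # gradient term evaluated before the state update
        y = y + z * dw           # propagate Y forward in time (no running cost)
        x = x + sigma * dw       # propagate the state
    return x, y


x_T, y_T = simulate_fbsde()
# Mean absolute gap between the propagated Y_T and the terminal cost g(X_T).
terminal_mismatch = np.mean(np.abs(y_T - x_T**2))
print(f"mean |Y_T - g(X_T)| = {terminal_mismatch:.4f}")
```

If the initial value and gradient are correct, the propagated Y_T matches g(X_T) up to time-discretization error, so the terminal mismatch is small. A learning-based solver can use such a mismatch as a training signal when the true solution is unknown.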
