Game-theoretic and risk-sensitive stochastic optimal control via forward and backward stochastic differential equations

We present a sampling-based algorithm for solving game-theoretic control problems and risk-sensitive stochastic optimal control problems. The cornerstone of the proposed approach is the formulation of the problem in terms of forward and backward stochastic differential equations (FBSDEs). By means of a nonlinear version of the Feynman-Kac lemma, we obtain a probabilistic representation of the solution to the nonlinear Hamilton-Jacobi-Isaacs equation, expressed as a decoupled system of FBSDEs. This system can then be simulated numerically using linear regression techniques. Utilizing the connection between stochastic differential games and risk-sensitive optimal control, we demonstrate that the proposed algorithm is also applicable to the latter class of problems. Simulation results validate the algorithm.
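To illustrate how a decoupled FBSDE system can be simulated with linear regression, the following is a minimal sketch and not the algorithm of this work. It assumes a scalar, driftless forward process dX = sigma dW, a backward equation written in the convention dY = -h(X, Z) dt + Z dW with terminal condition Y_T = g(X_T), a quadratic polynomial regression basis, and an explicit backward Euler step with a control-variate estimate of Z. The function name `solve_fbsde_lsmc` and all of its parameters are hypothetical. The usage example picks a quadratic driver h = -(theta/2) z^2 with linear terminal cost, a risk-sensitive-type case for which, under these assumptions, the closed form is Y_0 = x0 - (theta/2) sigma^2 T.

```python
import numpy as np


def solve_fbsde_lsmc(x0, sigma, g, h, T=1.0, n_steps=20, n_paths=20000, seed=0):
    """Estimate Y_0 of a decoupled scalar FBSDE by regression (illustrative sketch).

    Assumed convention (not taken from the paper):
        forward:  dX = sigma dW,            X_0 = x0
        backward: dY = -h(X, Z) dt + Z dW,  Y_T = g(X_T)
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps

    # Forward pass: Euler-Maruyama simulation of the state (exact for this SDE).
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_paths))
    X = np.empty((n_steps + 1, n_paths))
    X[0] = x0
    for i in range(n_steps):
        X[i + 1] = X[i] + sigma * dW[i]

    def project(x, target):
        # Least-squares projection of `target` onto the basis {1, x, x^2},
        # used as an approximation of the conditional expectation E[target | x].
        basis = np.vander(x, 3, increasing=True)
        coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
        return basis @ coef

    # Backward pass: start from the terminal condition and step back in time.
    Y = g(X[-1])
    for i in range(n_steps - 1, -1, -1):
        cond_y = project(X[i], Y)  # approximates E[Y_{i+1} | X_i]
        # Z_i ~ E[(Y_{i+1} - E[Y_{i+1} | X_i]) dW_i | X_i] / dt; subtracting
        # cond_y leaves the mean unchanged and reduces the variance.
        Z = project(X[i], (Y - cond_y) * dW[i] / dt)
        Y = cond_y + h(X[i], Z) * dt  # explicit backward Euler step
    return float(Y.mean())


# Usage: quadratic driver, linear terminal cost.
theta = 1.0
y0 = solve_fbsde_lsmc(
    x0=1.0,
    sigma=0.5,
    g=lambda x: x,
    h=lambda x, z: -0.5 * theta * z**2,
)
print(y0)
```

With theta = 1, sigma = 0.5, T = 1 and x0 = 1, the closed-form value is 0.875, and the printed estimate should agree with it up to Monte Carlo and discretization error. The polynomial basis and the control-variate form of the Z estimate are design choices made for this sketch; other bases or estimators may be more appropriate for higher-dimensional problems.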
