Deep learning method for solving stochastic optimal control problem via stochastic maximum principle

In this paper, we solve stochastic optimal control problems via deep learning. Using the stochastic maximum principle and its corresponding Hamiltonian system, we propose a framework in which the original control problem is reformulated as a new one. The new stochastic optimal control problem has a quadratic loss function at the terminal time, which makes it easier to build a neural network structure. The cost is that an additional maximum condition must be handled. Numerical examples, including the linear quadratic (LQ) stochastic optimal control problem and the calculation of G-expectation, are studied.
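
To make the reformulation concrete, the following is a sketch in one common textbook notation. All symbols (b, \sigma, f, h, H, p, q, U) are assumptions introduced here for illustration and are not defined in the abstract; the paper's own formulation may differ, for instance in sign conventions. The sketch also assumes a setting in which the first-order adjoint equation suffices, for example when the diffusion does not depend on the control.

```latex
% Assumed generic controlled state equation and cost functional
\[
dX_t = b(t, X_t, u_t)\,dt + \sigma(t, X_t, u_t)\,dW_t, \qquad X_0 = x_0,
\]
\[
J(u) = \mathbb{E}\Big[\int_0^T f(t, X_t, u_t)\,dt + h(X_T)\Big].
\]

% Hamiltonian and first-order adjoint equation (one common sign convention)
\[
H(t,x,u,p,q) = \langle p, b(t,x,u)\rangle
  + \operatorname{tr}\big(q^{\top}\sigma(t,x,u)\big) - f(t,x,u),
\]
\[
dp_t = -H_x(t, X_t, u_t, p_t, q_t)\,dt + q_t\,dW_t, \qquad p_T = -h_x(X_T).
\]

% Maximum condition linking the control to the adjoint pair (p, q)
\[
H(t, X_t, u_t, p_t, q_t) = \max_{v \in U} H(t, X_t, v, p_t, q_t).
\]

% Reformulated problem: choose p_0 and the process q, run (X, p) forward
% in time, and penalize the mismatch in the terminal condition
\[
\min_{p_0,\,q}\ \mathbb{E}\big[\,\lvert p_T + h_x(X_T)\rvert^{2}\,\big]
\quad \text{subject to the maximum condition above.}
\]
```

Read this way, the quadratic terminal loss mentioned in the abstract corresponds to the last display: p_0 and q would be parameterized by neural networks, while the maximum condition is the additional constraint that has to be enforced along each simulated path.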
