Iterative Path Integral Approach to Nonlinear Stochastic Optimal Control Under Compound Poisson Noise

Nonlinear stochastic optimal control theory plays an important role in many fields. In this theory, uncertainty in the dynamics has usually been represented by Brownian motion, which corresponds to driving the system with Gaussian white noise. However, many stochastic phenomena have heavy-tailed probability densities, which motivates studying the effect of non-Gaussianity. This paper uses Lévy processes, which produce outliers with significantly higher probability than Brownian motion, to describe such uncertainties. In general, the optimal control law is obtained by solving the Hamilton–Jacobi–Bellman equation. This paper shows that the path-integral approach, combined with policy iteration, can be applied efficiently to solve the Hamilton–Jacobi–Bellman equation in the Lévy setting. Finally, numerical simulations illustrate the usefulness of the method.
