Optimal Control of Partially Observable Piecewise Deterministic Markov Processes

In this paper we consider a control problem for a Partially Observable Piecewise Deterministic Markov Process of the following type: after each jump of the process, the controller receives a noisy signal about the state, and the aim is to control the process continuously in time so that the expected discounted cost of the system is minimized. We solve this optimization problem by reducing it to a discrete-time Markov Decision Process. This reduction includes the derivation of a filter for the unobservable state. Under suitable continuity and compactness assumptions, we prove the existence of optimal policies and show that the value function satisfies a fixed point equation. A generic application is given to illustrate the results.
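To give a feel for the filtering step mentioned above, the following is a minimal sketch of a single Bayesian belief update after a jump. It is not the filter derived in the paper: it assumes a finite hidden state space, a known post-jump transition matrix, and a known signal likelihood, and it ignores the deterministic flow and the inter-jump time. The names `bayes_filter_update`, `jump_kernel`, and `likelihood` are hypothetical and chosen only for illustration.

```python
def bayes_filter_update(prior, jump_kernel, likelihood, signal):
    """One illustrative filter step: predict through the jump, then condition on the signal.

    prior       -- list of probabilities over hidden states before the jump (assumed finite)
    jump_kernel -- jump_kernel[i][j] = P(post-jump state j | pre-jump state i) (assumed known)
    likelihood  -- likelihood[j][s] = P(signal s | post-jump state j) (assumed known)
    signal      -- index of the noisy signal received after the jump
    """
    n = len(prior)
    # Prediction: distribution of the post-jump state given the pre-jump belief.
    predicted = [sum(prior[i] * jump_kernel[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight each state by the likelihood of the observed signal.
    unnormalized = [predicted[j] * likelihood[j][signal] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("signal has zero probability under the current belief")
    # Normalize to obtain the posterior belief.
    return [w / total for w in unnormalized]


# Hypothetical two-state example; all numbers are made up for illustration.
posterior = bayes_filter_update(
    prior=[0.5, 0.5],
    jump_kernel=[[0.9, 0.1], [0.2, 0.8]],
    likelihood=[[0.8, 0.2], [0.3, 0.7]],
    signal=0,
)
```

In a reduction of this kind, the belief produced by such an update typically plays the role of the state of the discrete-time Markov Decision Process, so that the controller acts on the posterior distribution rather than on the unobserved state itself.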
